 # Sensory evaluation with the triangle test

A couple weeks back I advised the folks over at Brulosophy to switch to an upper tailed binomial proportions test for determining significant results in their exbeeriments. I also created a p-value calculator for them, which you can now use for your own exbeeriments!

In case you were wondering, the statistical analysis associated with the triangle test compares the proportion of test participants who correctly identified the odd beer out to the proportion of tasters that would be expected to correctly identify the odd beer purely due to random chance. The greater the proportion of correct participants, the more evidence there is against the “random chance” null hypothesis. As we are only interested in whether the different beers can be correctly distinguished, this will be an “upper tailed” test. That is to say, the potential result of the odd beer out being correctly identified less often than we’d expect under random chance isn’t of particular interest to us (and is not particularly likely either).

In plain terms, our null hypothesis is that the true proportion of the population that can correctly identify the odd beer out is 1/3. Our alternative hypothesis is that the true proportion of the population that can correctly identify the odd beer out is greater than 1/3.

The exact p-value can be calculated using the binomial distribution. Specifically, the p-value is the probability of observing at least as many correct tasters as we actually did, if the population proportion is in fact equal to 1/3.
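As a rough illustration of that calculation, here is a minimal Python sketch. The function name `triangle_test_p_value` and the 12-of-24 panel are made up for the example; this is not the code behind the calculator mentioned above, just the binomial upper tail summed directly.

```python
from math import comb


def triangle_test_p_value(correct: int, tasters: int, chance: float = 1 / 3) -> float:
    """Upper-tailed exact binomial p-value.

    Returns P(X >= correct) for X ~ Binomial(tasters, chance), i.e. the
    probability of seeing at least this many correct tasters if everyone
    were guessing.
    """
    if not 0 <= correct <= tasters:
        raise ValueError("correct must be between 0 and tasters")
    return sum(
        comb(tasters, k) * chance**k * (1 - chance) ** (tasters - k)
        for k in range(correct, tasters + 1)
    )


# Hypothetical panel: 12 of 24 tasters pick the odd beer out.
p_value = triangle_test_p_value(12, 24)
print(f"p-value: {p_value:.4f}")
```

If the printed value falls below the significance level you chose beforehand, you would reject the null hypothesis.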

An approximate p-value can be calculated by assuming that, under the null hypothesis, the estimated proportion follows a normal distribution with mean equal to 1/3. The approximate method may be used when the sample size is 25 or greater.

## 10 comments for “Sensory evaluation with the triangle test”

1. September 17, 2015 at 10:51 am

Thank you so much for your help!!

2. JC Carter
September 17, 2015 at 6:39 pm

I assume the null hypothesis for the true proportion of the population able to identify the odd beer was set at 1/3 because it’s a three-way comparison of two identical samples and one different one?

• Justin
September 17, 2015 at 8:03 pm

Yes, exactly. If making a random selection, a taster would have a 1/3 chance of correctly identifying the odd beer out due to the fact that three samples are presented.

3. Aaron
March 19, 2017 at 11:55 am

Justin, are there any rules about the minimum sample size or anything else to be aware of?

Also, what can be said about a p-value of 0.04 vs a p-value of 0.02, for example? I’ve always heard that significance is significance, there is no scale, but we already chose 0.05 as the cutoff, so it seems odd to say that there is no difference between 0.02 and 0.04.

• March 19, 2017 at 12:18 pm

The rules about minimum sample size only apply when using the normal approximation.
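To make that concrete, here is a small sketch comparing the exact binomial p-value with a normal approximation. I am assuming the usual standard error sqrt(p0(1 - p0)/n) and no continuity correction, neither of which is spelled out in the post, and the function names and panel sizes are only for illustration.

```python
from math import comb, erfc, sqrt

CHANCE = 1 / 3  # probability of picking the odd beer by guessing


def exact_p_value(correct: int, tasters: int) -> float:
    """Exact upper-tailed binomial p-value, valid for any sample size."""
    return sum(
        comb(tasters, k) * CHANCE**k * (1 - CHANCE) ** (tasters - k)
        for k in range(correct, tasters + 1)
    )


def approx_p_value(correct: int, tasters: int) -> float:
    """Normal-approximation p-value.

    Assumption: standard error sqrt(p0 * (1 - p0) / n), no continuity correction.
    """
    std_error = sqrt(CHANCE * (1 - CHANCE) / tasters)
    z = (correct / tasters - CHANCE) / std_error
    return 0.5 * erfc(z / sqrt(2))  # upper-tail area of the standard normal


# One small panel and one above the 25-taster guideline (both hypothetical).
for correct, tasters in [(5, 7), (15, 30)]:
    print(
        f"{correct}/{tasters}: exact = {exact_p_value(correct, tasters):.4f}, "
        f"approx = {approx_p_value(correct, tasters):.4f}"
    )
```

For the 5-of-7 panel the approximation comes out noticeably smaller than the exact value, which is the kind of gap the sample size rule is meant to guard against.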

We select our significance level prior to conducting our statistical test. As you know, if the p-value is less than the significance level, we reject the null hypothesis in favour of the alternative. While a smaller p-value indicates more evidence against the null hypothesis, the p-value is only used to make the binary conclusion of either rejecting or failing to reject the null hypothesis, so in effect, all that we are really taking from it is whether it is over that threshold of significance or not.

Where it may matter just how many participants were correct is in deriving an estimate of what the true population proportion is. While 5/7 and 6/7 would both be significant results in the triangle test, the latter leads to a “more extreme” estimate of the true population proportion.
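For anyone who wants to check the 5/7 and 6/7 example, here is a quick sketch using exact fractions (the function name is just for illustration):

```python
from fractions import Fraction
from math import comb

CHANCE = Fraction(1, 3)  # probability of picking the odd beer by guessing


def exact_p_value(correct: int, tasters: int) -> Fraction:
    """Upper-tailed binomial p-value, kept as an exact fraction."""
    return sum(
        comb(tasters, k) * CHANCE**k * (1 - CHANCE) ** (tasters - k)
        for k in range(correct, tasters + 1)
    )


for correct in (5, 6):
    p = exact_p_value(correct, 7)
    print(
        f"{correct}/7 correct: p = {p} ({float(p):.4f}), "
        f"estimated proportion = {correct / 7:.3f}"
    )
# 5/7 correct: p = 11/243 (0.0453), estimated proportion = 0.714
# 6/7 correct: p = 5/729 (0.0069), estimated proportion = 0.857
```

Both p-values sit below 0.05, so the conclusion of the test is the same, while the estimated proportion differs.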

• Aaron
March 19, 2017 at 12:27 pm

I think that makes sense. Thanks.

• Blake
February 20, 2020 at 10:05 am

Hi Justin,
I’m wondering about effect size calculations for these statistics. I often find that many Brulosophy studies find a significant difference from chance by only a few (sometimes 1) tasters. Effect sizes seem like an appropriate way to add important nuance to what they’re trying to do. I’m interested in your thoughts on this — I have not looked at the math real close yet. Thanks!

• Justin
February 20, 2020 at 10:45 am

The estimate of effect size would be max(0, x/n – 1/3), where x is the number of correct participants and n is the total number of participants. I include the max bit as a negative effect size isn’t really interpretable with this test.

To reject the null hypothesis, all we need is the confidence interval for the effect size to not include 0. The lower bound of that CI may be barely above 0 in some cases (e.g. experiments in which we just meet the threshold for significance).
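Here is one way this could be sketched in Python. The thread doesn't say which interval is meant, so I am assuming a one-sided exact (Clopper-Pearson style) lower bound found by bisection; the function names and the 0.05 level are also just choices for the example.

```python
from math import comb

CHANCE = 1 / 3  # proportion expected under pure guessing


def upper_tail(correct: int, tasters: int, p: float) -> float:
    """P(X >= correct) for X ~ Binomial(tasters, p)."""
    return sum(
        comb(tasters, k) * p**k * (1 - p) ** (tasters - k)
        for k in range(correct, tasters + 1)
    )


def effect_size(correct: int, tasters: int) -> float:
    """Point estimate max(0, x/n - 1/3) from the comment above."""
    return max(0.0, correct / tasters - CHANCE)


def effect_size_lower_bound(correct: int, tasters: int, alpha: float = 0.05) -> float:
    """Lower confidence bound for the effect size.

    Assumption: one-sided exact bound, i.e. the proportion p at which
    P(X >= correct) equals alpha, located by bisection.
    """
    if correct == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > 1e-10:
        mid = (lo + hi) / 2
        if upper_tail(correct, tasters, mid) < alpha:
            lo = mid  # mid is still too small to be plausible at level alpha
        else:
            hi = mid
    return max(0.0, lo - CHANCE)


# Panel sizes taken from the 5/7 example earlier in the thread.
for correct in (4, 5, 6):
    print(
        f"{correct}/7: effect size = {effect_size(correct, 7):.3f}, "
        f"lower bound = {effect_size_lower_bound(correct, 7):.3f}"
    )
```

With this choice of interval, the lower bound is above 0 exactly when the exact p-value is below alpha, so it tells the same story as the test while also showing how close to 0 the effect might be.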

Brulosophy has considered providing results like this, such as CIs for effect size, but that is less approachable for most people than something like “more people picked the odd beer out than would be expected if everyone selected at random”, so it’s left at that.