The statistical power of replications in difference tests

Abstract

It has been argued that the binomial test with N=nk observations is a valid test when n assessors each perform k replicated difference tests, for instance triangular tests. Various models and approaches to account for the replications have been suggested in the sensory literature. Ennis and Bi (1998), and other papers by these authors, recommend the beta-binomial model. All these different approaches are compared theoretically and by applications on real data. A striking result is the similarity between beta-binomial models and generalized linear models. The beta-binomial model assumes that the true individual correct answer probabilities follow a beta distribution. For the generalized linear model the corresponding density is deduced, and despite the apparent difference between the mathematical formulae, plots of the densities show that there is hardly any difference at all between the models induced by the two approaches. It is shown how the statistical power of the binomial test can easily be computed for the various approaches using Monte Carlo methods and standard software. These power calculations show little difference between the three main approaches: beta-binomial models, generalized linear models and binomial mixture models. Together with the theoretical comparison, they also show how the simple extreme version of the binomial mixture model can be seen as the common extreme case for all three approaches. This common extreme case corresponds to the situation where each individual is assumed to be either a discriminator (having probability one of a correct answer) or a non-discriminator (having probability c of a correct answer). Although this is not a proper description of the data-generating process, it does provide a lower limit of power for a given combination of n and k. Tables of this lower limit of power are provided for combinations of n=5,..,50 and k=1,..,5. It is shown that this lower limit is high enough to be of practical importance.
For instance, with n=12 assessors and k=4 replications for each assessor, the power of the 0.05-level binomial test with N=48 for an effect size of 25% above chance is 77%. For the extreme case the (lower limit) power is 69%, hence only a moderate loss of power is seen. The power o
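
The Monte Carlo power calculation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes a triangle test with chance probability c = 1/3, and it reads "25% above chance" as a proportion d = 0.25 of discriminators, so that the marginal probability of a correct answer is d + (1 - d)c. The function names `critical_value` and `simulate_power`, the seed and the number of simulations are all hypothetical choices made for this example.

```python
import math
import random


def critical_value(N, c, alpha=0.05):
    """Smallest x such that P(X >= x) <= alpha for X ~ Binomial(N, c)."""
    tail = 0.0
    # Accumulate the upper tail probability from x = N downwards.
    for x in range(N, -1, -1):
        tail += math.comb(N, x) * c**x * (1 - c) ** (N - x)
        if tail > alpha:
            return x + 1
    return 0


def simulate_power(n, k, draw_p, c, alpha=0.05, n_sim=20000, seed=1):
    """Monte Carlo power of the one-sided binomial test on N = n*k answers.

    draw_p(rng) returns one assessor's probability of a correct answer;
    it encodes the assumed model for assessor heterogeneity.
    """
    rng = random.Random(seed)
    crit = critical_value(n * k, c, alpha)
    rejections = 0
    for _ in range(n_sim):
        total = 0
        for _ in range(n):
            p = draw_p(rng)  # this assessor's own success probability
            total += sum(rng.random() < p for _ in range(k))
        rejections += total >= crit
    return rejections / n_sim


# Assumed setting: triangle test (c = 1/3), 25% discriminators (d = 0.25).
c, d = 1 / 3, 0.25
n, k = 12, 4
p_marginal = d + (1 - d) * c

# No heterogeneity: every assessor has the same marginal probability.
power_binomial = simulate_power(n, k, lambda rng: p_marginal, c)

# Extreme mixture: a discriminator is always correct, others guess.
power_extreme = simulate_power(
    n, k, lambda rng: 1.0 if rng.random() < d else c, c
)

print(f"critical value: {critical_value(n * k, c)}")
print(f"power, plain binomial:  {power_binomial:.3f}")
print(f"power, extreme mixture: {power_extreme:.3f}")
```

Both models share the same marginal probability of a correct answer, so any gap between the two estimates comes only from how that probability is spread across assessors. Whether the output agrees with the 77% and 69% quoted above depends on the assumptions stated here. Other heterogeneity models could be tried by passing a different `draw_p`.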
