Many biomedical experiments are carried out by pooling individual biological
samples. However, pooling can hide biological variance and give false
confidence in the significance of the data. In the context of
microarray experiments for detecting differentially expressed genes, recent
publications have addressed the efficiency of sample pooling and provided
approximate formulas for power and sample size calculations. It is desirable
to have exact formulas for these calculations and to check the approximate
results against the exact ones. We show that the
difference between the approximate and exact results can be large. In this
study, we have characterized quantitatively the effect of pooling samples on
the efficiency of microarray experiments for the detection of differential gene
expression between two classes. We present exact formulas for calculating the
power of microarray experimental designs involving sample pooling and technical
replications. The formulas can be used to determine the total numbers of arrays
and biological subjects required in an experiment to achieve the desired power
at a given significance level. The conditions under which a pooled design
becomes preferable to a non-pooled design can then be derived from the unit
costs of a microarray and of a biological subject. This paper
thus provides guidance on sample pooling and cost effectiveness. The
formulation is presented in the context of microarray comparative studies,
but it also applies to a wide range of biomedical comparative
studies where sample pooling may be involved.

Comment: 8 pages, 1 figure, 2 tables; to appear in Bioinformatic
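The cost trade-off described in the abstract can be illustrated with a minimal sketch. It is not the paper's exact power formula, which the abstract does not state. It assumes a commonly used additive variance model in which pooling r subjects divides the biological variance by r and m technical replicates divide the technical variance by m. The names `Design`, `variance_of_group_mean`, and `cost_per_class`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    """One class of a two-class design (hypothetical field names)."""
    n_pools: int            # number of pools in the class
    subjects_per_pool: int  # r; 1 means no pooling
    arrays_per_pool: int    # m; technical replicates per pool


def variance_of_group_mean(d: Design, var_bio: float, var_tech: float) -> float:
    # Assumed model: each pool contributes var_bio / r + var_tech / m,
    # and the class mean averages over n_pools independent pools.
    per_pool = var_bio / d.subjects_per_pool + var_tech / d.arrays_per_pool
    return per_pool / d.n_pools


def cost_per_class(d: Design, cost_subject: float, cost_array: float) -> float:
    # Total cost is the number of subjects and arrays times their unit costs.
    subjects = d.n_pools * d.subjects_per_pool
    arrays = d.n_pools * d.arrays_per_pool
    return subjects * cost_subject + arrays * cost_array


# Two designs with the same variance of the class mean (illustrative values).
non_pooled = Design(n_pools=10, subjects_per_pool=1, arrays_per_pool=1)
pooled = Design(n_pools=4, subjects_per_pool=4, arrays_per_pool=1)

for label, c_subj, c_arr in [("arrays expensive", 1.0, 10.0),
                             ("subjects expensive", 10.0, 1.0)]:
    print(label,
          cost_per_class(non_pooled, c_subj, c_arr),
          cost_per_class(pooled, c_subj, c_arr))
```

With these illustrative numbers both designs give the same variance for the class mean, yet the pooled design is cheaper when arrays dominate the cost and more expensive when subjects do. Equal variance does not imply equal power: the pooled design has fewer pools and therefore fewer degrees of freedom, an effect that only an exact power calculation captures.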