How to decide whether small samples comply with an equidistribution

Abstract

Deciding whether a measured distribution complies with an equidistribution is a central element of many biostatistical methods. High-throughput differential expression measurements, for instance, require judging the possible over-representation of genes. The reliability of this judgement, however, is strongly affected when rarely expressed genes are pooled. We propose a method that can be applied to frequency-ranked distributions and that yields a simple but efficient criterion for assessing the hypothesis of equiprobable expression levels. By applying our technique to surrogate data, we show how the decision criterion differentiates between a true equidistribution and a triangular distribution. The distinction succeeds even for small sample sizes, where standard tests of significance (e.g. χ²) fail. Our method will have a major impact on several problems of computational biology where rare events hamper a reliable assessment of frequency distributions. The program package is available upon request from the authors.
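The setting described above can be made concrete with a small sketch. It is not the authors' method or their program package; it only illustrates, under assumed parameters, what surrogate data from an equidistribution and from a triangular distribution look like after frequency ranking, together with the Pearson χ² statistic that a standard test would use. All function names, the number of categories `k`, the sample size `n`, and the linearly decreasing weights are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def draw_sample(weights, n, rng):
    """Draw n category indices with the given relative weights."""
    return rng.choices(range(len(weights)), weights=weights, k=n)


def ranked_frequencies(sample, k):
    """Counts for all k categories, sorted from most to least frequent."""
    counts = Counter(sample)
    return sorted((counts.get(c, 0) for c in range(k)), reverse=True)


def chi_square_statistic(counts):
    """Pearson chi-square statistic against equal expected counts."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so the run is repeatable
    k, n = 20, 30            # assumed: 20 categories, only 30 observations

    surrogates = {
        "equidistribution": [1] * k,                 # equal weights
        "triangular": list(range(k, 0, -1)),         # assumed linear decay
    }
    for name, weights in surrogates.items():
        ranked = ranked_frequencies(draw_sample(weights, n, rng), k)
        print(name, ranked, round(chi_square_statistic(ranked), 2))
```

With these assumed values the expected count per category is n / k = 1.5, well below the value of about 5 that is commonly recommended for the χ² approximation, which is the small-sample regime the abstract refers to.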
