When the same set of genes appears in the top-ranking gene lists of two
different studies, it is often of interest to estimate the probability that
this overlap is a chance event. This overlapping probability is well known to
be given by the hypergeometric distribution. Usually, the lengths of top-ranking gene lists are
assumed to be fixed by a pre-set criterion, e.g., a threshold on the p-value of
the t-test. We investigate how the overlapping probability changes with the gene
selection criterion, or simply, with the length of the top-ranking gene lists.
We conclude that the overlapping probability is indeed a function of the gene
list length, and that its statistical significance should be quoted in the
context of the gene selection criterion.

Comment: submitted to IEEE/EMBS Conference'0
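
To make the hypergeometric calculation concrete, the following is a minimal sketch, not the authors' implementation. It assumes both lists are drawn from a common pool of `total_genes` genes and computes the upper-tail probability of seeing at least `overlap` shared genes by chance. The function name, parameter names, and the numbers in the usage loop are illustrative assumptions and do not come from the study.

```python
from math import comb


def overlap_pvalue(total_genes, len_a, len_b, overlap):
    """Chance probability of at least `overlap` shared genes.

    Assumes list B (length len_b) is a uniform random draw, without
    replacement, from a pool of total_genes genes, of which len_a belong
    to list A. The number of shared genes is then hypergeometric.
    """
    max_overlap = min(len_a, len_b)
    start = max(overlap, 0)
    # Number of ways to choose list B from the whole pool.
    denom = comb(total_genes, len_b)
    # Sum the hypergeometric probabilities for k = overlap .. max_overlap.
    tail = sum(
        comb(len_a, k) * comb(total_genes - len_a, len_b - k)
        for k in range(start, max_overlap + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Hypothetical numbers: a pool of 10000 genes, and an observed overlap
    # fixed at 5 genes while the list length varies.
    for length in (20, 50, 100, 200):
        p = overlap_pvalue(10000, length, length, 5)
        print(f"list length {length}: P(overlap >= 5) = {p:.3e}")
```

Running the loop with different list lengths shows how the same observed overlap can correspond to very different chance probabilities, which is the dependence on the selection criterion described above.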