12,621 research outputs found
Empirical Bayes estimation of posterior probabilities of enrichment
To interpret differentially expressed genes or other discovered features,
researchers conduct hypothesis tests to determine which biological categories,
such as those of the Gene Ontology (GO), are enriched in the sense of having
differential representation among the discovered features. We study the
application of improved estimators of the local false discovery rate (LFDR),
the probability that a biological category has equivalent representation
among the preselected features.
We identified three promising estimators of the LFDR for detecting
differential representation: a semiparametric estimator (SPE), a normalized
maximum likelihood estimator (NMLE), and a maximum likelihood estimator (MLE).
We found that the MLE performs at least as well as the SPE for on the order of
100 GO categories, even when the ideal number of components in its underlying
mixture model is unknown. However, the MLE is unreliable when the number of GO
categories is small compared to the number of PMM components. Thus, if the
number of categories is on the order of 10, the SPE is a more reliable LFDR
estimator. The NMLE depends not only on the data but also on a specified value
of the prior probability of differential representation. It is therefore an
appropriate LFDR estimator only when the number of GO categories is too small
for application of the other methods.
For enrichment detection, we recommend estimating the LFDR by the MLE given
at least a medium number (~100) of GO categories, by the SPE given a small
number of GO categories (~10), and by the NMLE given a very small number (~1)
of GO categories.

Comment: exhaustive revision of Zhenyu Yang and David R. Bickel, "Minimum
Description Length Measures of Evidence for Enrichment" (December 2010).
COBRA Preprint Series. Article 76. http://biostats.bepress.com/cobra/ps/art7
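The MLE approach described above can be illustrated with a minimal sketch.
This is not the paper's implementation: it assumes a two-component normal
mixture in which a category's z-statistic is null, N(0, 1), with prior
probability p0, or alternative, N(mu, 1), and fits (p0, mu) by EM. The
E-step posterior null probabilities are the LFDR estimates; all starting
values and the simulated data are illustrative assumptions.

```python
# Hedged sketch of LFDR estimation by maximum likelihood under a
# two-component normal mixture (illustrative, not the paper's method).
import math, random

def norm_pdf(z, mu=0.0):
    return math.exp(-0.5 * (z - mu) ** 2) / math.sqrt(2 * math.pi)

def lfdr_mle(z, iters=200):
    p0, mu = 0.9, 2.0                      # starting values (assumed)
    for _ in range(iters):                 # EM iterations
        # E-step: posterior null probabilities; these ARE the LFDRs
        w = [p0 * norm_pdf(zi) /
             (p0 * norm_pdf(zi) + (1 - p0) * norm_pdf(zi, mu))
             for zi in z]
        # M-step: update null proportion and alternative mean
        p0 = sum(w) / len(w)
        denom = sum(1 - wi for wi in w)
        mu = sum((1 - wi) * zi for wi, zi in zip(w, z)) / max(denom, 1e-12)
    return w, p0, mu

random.seed(0)
# simulate ~100 categories: 80 null, 20 differentially represented
z = [random.gauss(0, 1) for _ in range(80)] + \
    [random.gauss(3, 1) for _ in range(20)]
lfdr, p0, mu = lfdr_mle(z)
```

With around 100 categories, as recommended above, the EM fit is stable; with
only a handful of categories the likelihood surface is flat and the fitted
p0 and mu become unreliable, matching the abstract's caveat.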
Multiple testing procedures under confounding
While multiple testing procedures have been the focus of much statistical
research, an important facet of the problem is how to deal with possible
confounding. Such procedures have been developed independently in the
genetics and statistics literatures; in this chapter, we relate these
proposals. We also propose two new
multiple testing approaches within this framework. The first combines
sensitivity analysis methods with false discovery rate estimation procedures.
The second involves construction of shrinkage estimators that utilize the
mixture model for multiple testing. The procedures are illustrated with
applications to a gene expression profiling experiment in prostate cancer.

Comment: Published at http://dx.doi.org/10.1214/193940307000000176 in the IMS
Collections (http://www.imstat.org/publications/imscollections.htm) by the
Institute of Mathematical Statistics (http://www.imstat.org)
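Why confounding matters for multiple testing can be shown with a toy sketch.
This is an illustration, not the chapter's procedures: a simulated batch-like
confounder drives both the group labels and every feature, so naive
two-sample statistics are inflated on purely null features, while regressing
the confounder out of each feature first deflates them. All variable names
and parameter values are assumptions.

```python
# Hedged sketch: confounding inflates null test statistics; a simple
# regression adjustment removes the inflation (illustrative only).
import random, statistics, math

def t_stat(a, b):
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / \
        math.sqrt(va / len(a) + vb / len(b))

def residualize(x, c):
    """Residuals of x after simple linear regression on confounder c."""
    mx, mc = statistics.mean(x), statistics.mean(c)
    beta = sum((xi - mx) * (ci - mc) for xi, ci in zip(x, c)) / \
        sum((ci - mc) ** 2 for ci in c)
    return [xi - mx - beta * (ci - mc) for xi, ci in zip(x, c)]

random.seed(5)
n = 30
conf = [random.gauss(0, 1) for _ in range(n)]          # e.g. a batch effect
group = [1 if ci + random.gauss(0, 1) > 0 else 0 for ci in conf]
naive, adjusted = [], []
for _ in range(200):                                   # 200 null features
    x = [2.0 * ci + random.gauss(0, 1) for ci in conf] # confounder-driven only
    r = residualize(x, conf)
    naive.append(abs(t_stat([x[i] for i in range(n) if group[i]],
                            [x[i] for i in range(n) if not group[i]])))
    adjusted.append(abs(t_stat([r[i] for i in range(n) if group[i]],
                               [r[i] for i in range(n) if not group[i]])))
```

The mean absolute naive statistic greatly exceeds the adjusted one even
though no feature is truly associated with the groups, which is exactly the
failure mode that motivates confounding-aware multiple testing procedures.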
Optimal classifier selection and negative bias in error rate estimation: An empirical study on high-dimensional prediction
In biometric practice, researchers often apply a large number of different methods in a "trial-and-error" strategy to get as much as possible out of their data and, due to publication pressure or pressure from the consulting customer, present only the most favorable results. This strategy may induce a substantial optimistic bias in prediction error estimation, which is quantitatively assessed in the present manuscript. The focus of our work is on class prediction based on high-dimensional data (e.g. microarray data), since such analyses are particularly exposed to this kind of bias.
In our study we consider a total of 124 variants of classifiers (possibly including variable selection or tuning steps) within a cross-validation evaluation scheme. The classifiers are applied to original and modified real microarray data sets, some of which are obtained by randomly permuting the class labels to mimic non-informative predictors while preserving their correlation structure. We then assess the minimal misclassification rate over the different variants of classifiers in order to quantify the bias arising when the optimal classifier is selected a posteriori in a data-driven manner. The bias resulting from the parameter tuning (including gene selection parameters as a special case) and the bias resulting from the choice of the classification method are examined both separately and jointly.
We conclude that the strategy of presenting only the optimal result is not
acceptable, and we suggest alternative approaches for properly reporting
classification accuracy.
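The selection bias the study quantifies can be reproduced in miniature. The
sketch below is a toy illustration, not the study's 124 variants or its real
microarray data: it uses a nearest-centroid classifier family whose only
tuning parameter is the number of selected genes, applied to simulated data
whose labels are independent of the features (true error 0.5 for every
variant), and then reports the a posteriori minimum cross-validated error.

```python
# Hedged sketch: the minimum CV error over several classifier variants on
# pure-noise data is optimistically biased (illustrative setup only).
import random

def nearest_centroid_cv(X, y, n_genes, folds=5):
    """CV error of a nearest-centroid rule keeping the n_genes features with
    the largest class-mean difference (selected inside each training fold)."""
    n = len(y); idx = list(range(n)); errors = 0
    for f in range(folds):
        test = idx[f::folds]
        train = [i for i in idx if i not in test]
        p = len(X[0])
        def mean(cls, j):
            rows = [X[i][j] for i in train if y[i] == cls]
            return sum(rows) / len(rows)
        # rank features by absolute class-mean difference on the training fold
        ranked = sorted(range(p), key=lambda j: -abs(mean(1, j) - mean(0, j)))
        keep = ranked[:n_genes]
        cent = {c: [mean(c, j) for j in keep] for c in (0, 1)}
        for i in test:
            d = {c: sum((X[i][j] - cent[c][k]) ** 2
                        for k, j in enumerate(keep)) for c in (0, 1)}
            pred = 0 if d[0] < d[1] else 1
            errors += pred != y[i]
    return errors / n

random.seed(1)
n, p = 40, 200
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [i % 2 for i in range(n)]          # labels independent of X: pure noise
variants = [5, 10, 20, 50, 100]        # gene-selection sizes = "variants"
errs = [nearest_centroid_cv(X, y, g) for g in variants]
best = min(errs)                       # the optimistically selected result
```

Reporting `best` alone, as the trial-and-error strategy does, typically
understates the true error of 0.5; reporting the full spread of `errs`, or
pre-registering one variant, avoids the bias.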
A statistical framework for the design of microarray experiments and effective detection of differential gene expression
Four reasons why you might wish to read this paper: 1. We have devised a new
statistical T test to determine differentially expressed genes (DEG) in the
context of microarray experiments. This statistical test adds a new member to
the traditional T test family. 2. An exact formula for calculating the
detection power of this T test is presented, which can also be fairly easily
modified to cover the traditional T tests. 3. We have presented an accurate yet
computationally very simple method to estimate the fraction of non-DEGs in a
set of genes being tested. This method is superior to an existing one that is
computationally much more involved. 4. We approach the multiple testing
problem from
a fresh angle, and discuss its relation to the classical Bonferroni procedure
and to the FDR (false discovery rate) approach. This is most useful in the
analysis of microarray data, where typically several thousands of genes are
being tested simultaneously.

Comment: 9 pages, 1 table; to appear in Bioinformatics
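Point 3 above, estimating the fraction of non-differentially-expressed genes,
can be sketched with a standard lambda-threshold idea (p-values of true nulls
are uniform, so the count above a cutoff lambda estimates the null fraction).
This is an illustration of the general idea, not the paper's estimator, and
the normal approximation to the t statistic and all simulation parameters are
assumptions.

```python
# Hedged sketch: estimating the non-DEG fraction from two-sample p-values
# via a lambda threshold (illustrative, not the paper's method).
import random, statistics, math

def welch_p(a, b):
    """Two-sided p-value of Welch's t statistic, normal approximation."""
    va, vb = statistics.variance(a), statistics.variance(b)
    t = (statistics.mean(a) - statistics.mean(b)) / \
        math.sqrt(va / len(a) + vb / len(b))
    return math.erfc(abs(t) / math.sqrt(2))   # = 2 * (1 - Phi(|t|))

def pi0_estimate(pvals, lam=0.5):
    # nulls are uniform on [0, 1], so counts above lam estimate the fraction
    return min(1.0, sum(p > lam for p in pvals) / ((1 - lam) * len(pvals)))

random.seed(2)
pvals = []
for g in range(1000):                  # 1000 genes, 10% truly differential
    shift = 1.5 if g < 100 else 0.0
    a = [random.gauss(shift, 1) for _ in range(8)]
    b = [random.gauss(0, 1) for _ in range(8)]
    pvals.append(welch_p(a, b))
pi0 = pi0_estimate(pvals)              # true value here is 0.9
```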
Microarrays, Empirical Bayes and the Two-Groups Model
The classic frequentist theory of hypothesis testing developed by Neyman,
Pearson and Fisher has a claim to being the twentieth century's most
influential piece of applied mathematics. Something new is happening in the
twenty-first century: high-throughput devices, such as microarrays, routinely
require simultaneous hypothesis tests for thousands of individual cases, not at
all what the classical theory had in mind. In these situations empirical Bayes
information begins to force itself upon frequentists and Bayesians alike. The
two-groups model is a simple Bayesian construction that facilitates empirical
Bayes analysis. This article concerns the interplay of Bayesian and frequentist
ideas in the two-groups setting, with particular attention focused on Benjamini
and Hochberg's False Discovery Rate method. Topics include the choice and
meaning of the null hypothesis in large-scale testing situations, power
considerations, the limitations of permutation methods, significance testing
for groups of cases (such as pathways in microarray studies), correlation
effects, multiple confidence intervals and Bayesian competitors to the
two-groups model.

Comment: This paper is commented on in [arXiv:0808.0582], [arXiv:0808.0593],
[arXiv:0808.0597], and [arXiv:0808.0599]; rejoinder in [arXiv:0808.0603].
Published at http://dx.doi.org/10.1214/07-STS236 in Statistical Science
(http://www.imstat.org/sts/) by the Institute of Mathematical Statistics
(http://www.imstat.org)
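The two-groups construction lends itself to a short sketch: z-values follow
the mixture f(z) = p0 f0(z) + p1 f1(z), and the local false discovery rate is
fdr(z) = p0 f0(z) / f(z). Below, f is estimated by a crude histogram (a
smooth density fit is what is typically used in practice), f0 is the
theoretical N(0, 1) null, and p0 is taken as known; all of these choices are
simplifying assumptions for illustration.

```python
# Hedged sketch of the two-groups local fdr with a histogram estimate of
# the mixture density f (illustrative assumptions throughout).
import math, random

def local_fdr(zs, p0=0.9, bins=30):
    lo, hi = min(zs), max(zs)
    width = (hi - lo) / bins
    counts = [0] * bins
    for z in zs:
        counts[min(int((z - lo) / width), bins - 1)] += 1
    def f_hat(z):                       # histogram density estimate of f
        k = min(int((z - lo) / width), bins - 1)
        return counts[k] / (len(zs) * width)
    def f0(z):                          # theoretical N(0, 1) null
        return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
    return [min(1.0, p0 * f0(z) / max(f_hat(z), 1e-12)) for z in zs]

random.seed(3)
zs = [random.gauss(0, 1) for _ in range(900)] + \
     [random.gauss(3, 1) for _ in range(100)]
fdr = local_fdr(zs)
```

Cases near z = 0 get fdr close to 1 (almost surely null), while cases in the
right tail get small fdr, which is the empirical Bayes reading of large-scale
testing described in the abstract.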
An adaptive significance threshold criterion for massive multiple hypotheses testing
This research deals with massive multiple hypothesis testing. First, regarding
multiple tests as an estimation problem under a proper population model, an
error measurement called Erroneous Rejection Ratio (ERR) is introduced and
related to the False Discovery Rate (FDR). ERR is an error measurement similar
in spirit to FDR, and it greatly simplifies the analytical study of error
properties of multiple test procedures. Next an improved estimator of the
proportion of true null hypotheses and a data adaptive significance threshold
criterion are developed. Some asymptotic error properties of the significance
threshold criterion are established in terms of ERR under distributional
assumptions widely satisfied in recent applications. A simulation study
provides clear evidence that the proposed estimator of the proportion of true
null hypotheses outperforms the existing estimators of this important parameter
in massive multiple tests. Both analytical and simulation studies indicate that
the proposed significance threshold criterion can provide a reasonable balance
between the amounts of false positive and false negative errors, thereby
complementing and extending the various FDR control procedures. S-plus/R code
is available from the author upon request.

Comment: Published at http://dx.doi.org/10.1214/074921706000000392 in the IMS
Lecture Notes--Monograph Series
(http://www.imstat.org/publications/lecnotes.htm) by the Institute of
Mathematical Statistics (http://www.imstat.org)
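A data-adaptive significance threshold of the general kind described above
can be sketched as follows. This illustrates the two-stage idea, estimate the
proportion pi0 of true nulls and then sharpen a Benjamini-Hochberg-type
step-up rule by pi0; it is not the paper's specific ERR-based criterion, and
the lambda cutoff and example p-values are assumptions.

```python
# Hedged sketch of an adaptive (pi0-sharpened) step-up threshold
# (illustrative of the general idea, not the paper's criterion).
def adaptive_threshold(pvals, alpha=0.05, lam=0.5):
    n = len(pvals)
    # nulls have uniform p-values, so counts above lam estimate pi0
    pi0 = min(1.0, sum(p > lam for p in pvals) / ((1 - lam) * n))
    cutoff = 0.0
    for k, p in enumerate(sorted(pvals), start=1):   # step-up search
        if p <= k * alpha / (pi0 * n):
            cutoff = p
    return cutoff, pi0

pvals = [0.0001, 0.0002, 0.0008, 0.004, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9]
cutoff, pi0 = adaptive_threshold(pvals)
rejected = [p for p in pvals if p <= cutoff]
```

Because pi0 < 1, the adaptive rule rejects at a slightly more generous
threshold than plain Benjamini-Hochberg, trading a few extra false positives
for fewer false negatives, the balance the abstract emphasizes.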
Size, power and false discovery rates
Modern scientific technology has provided a new class of large-scale
simultaneous inference problems, with thousands of hypothesis tests to consider
at the same time. Microarrays epitomize this type of technology, but similar
situations arise in proteomics, spectroscopy, imaging, and social science
surveys. This paper uses false discovery rate methods to carry out both size
and power calculations on large-scale problems. A simple empirical Bayes
approach allows the false discovery rate (fdr) analysis to proceed with a
minimum of frequentist or Bayesian modeling assumptions. Closed-form accuracy
formulas are derived for estimated false discovery rates, and used to compare
different methodologies: local or tail-area fdr's, theoretical, permutation, or
empirical null hypothesis estimates. Two microarray data sets as well as
simulations are used to evaluate the methodology, with the power diagnostics
showing why nonnull cases might easily fail to appear on a list of
"significant" discoveries.

Comment: Published at http://dx.doi.org/10.1214/009053606000001460 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
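The size and power calculations discussed above can be sketched in miniature
with the tail-area Fdr: for a rejection region {Z >= z}, Fdr(z) =
p0 P0(Z >= z) / P(Z >= z), with the denominator estimated empirically, and a
simple power diagnostic is the fraction of truly nonnull cases the region
captures. The theoretical N(0, 1) null, the assumed p0, and the simulated
data are all illustrative assumptions, not the paper's analyses.

```python
# Hedged sketch: empirical tail-area Fdr and a power diagnostic
# (illustrative assumptions throughout).
import math, random

def phi_tail(z):                        # P(N(0,1) >= z)
    return 0.5 * math.erfc(z / math.sqrt(2))

def tail_fdr(zs, z_cut, p0):
    exceed = sum(z >= z_cut for z in zs) / len(zs)   # empirical P(Z >= z_cut)
    return min(1.0, p0 * phi_tail(z_cut) / max(exceed, 1e-12))

random.seed(4)
null = [random.gauss(0, 1) for _ in range(950)]
nonnull = [random.gauss(2.5, 1) for _ in range(50)]
zs = null + nonnull
Fdr = tail_fdr(zs, 2.0, p0=0.95)
power = sum(z >= 2.0 for z in nonnull) / len(nonnull)  # detection power
```

Even with a moderate Fdr, the power here is well below 1, showing concretely
how nonnull cases can fail to reach the "significant" list.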