21,306 research outputs found
Size, power and false discovery rates
Modern scientific technology has provided a new class of large-scale
simultaneous inference problems, with thousands of hypothesis tests to consider
at the same time. Microarrays epitomize this type of technology, but similar
situations arise in proteomics, spectroscopy, imaging, and social science
surveys. This paper uses false discovery rate methods to carry out both size
and power calculations on large-scale problems. A simple empirical Bayes
approach allows the false discovery rate (fdr) analysis to proceed with a
minimum of frequentist or Bayesian modeling assumptions. Closed-form accuracy
formulas are derived for estimated false discovery rates, and used to compare
different methodologies: local or tail-area fdr's, theoretical, permutation, or
empirical null hypothesis estimates. Two microarray data sets as well as
simulations are used to evaluate the methodology, the power diagnostics showing
why nonnull cases might easily fail to appear on a list of ``significant''
discoveries.
Comment: Published at http://dx.doi.org/10.1214/009053606000001460 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
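The local fdr at the heart of this abstract is the posterior null probability pi0*f0(z)/f(z) under the two-groups model. As a rough illustration only (not the paper's estimator), the Python sketch below estimates it with a kernel density for the mixture f and the theoretical N(0,1) null f0; the crude pi0 estimate and the example cutoffs are assumptions made for this sketch.

```python
# Minimal sketch of a local false discovery rate (fdr) estimate under the
# two-groups model: fdr(z) = pi0 * f0(z) / f(z).  Assumes the z-values are
# roughly N(0,1) under the null (theoretical null) and that pi0 is near 1.
import numpy as np
from scipy import stats

def local_fdr(z, pi0=None):
    """Estimate fdr(z) with a Gaussian KDE for the mixture density f."""
    z = np.asarray(z, dtype=float)
    f = stats.gaussian_kde(z)(z)          # estimated mixture density f(z)
    f0 = stats.norm.pdf(z)                # theoretical N(0,1) null density
    if pi0 is None:
        # crude pi0 estimate: fraction of z-values in a central region,
        # rescaled by the null mass of that region (illustrative choice)
        central = np.abs(z) < 1.0
        pi0 = min(1.0, central.mean() / (stats.norm.cdf(1) - stats.norm.cdf(-1)))
    return np.clip(pi0 * f0 / f, 0.0, 1.0)

# Toy example: 95% null cases plus a small nonnull tail shifted to the right.
rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(0, 1, 950), rng.normal(3, 1, 50)])
fdr = local_fdr(z)
print("cases with fdr < 0.2:", int((fdr < 0.2).sum()))
```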
Rejoinder: Microarrays, Empirical Bayes and the Two-Groups Model
Rejoinder to ``Microarrays, Empirical Bayes and the Two-Groups Model''
[arXiv:0808.0572].
Comment: Published at http://dx.doi.org/10.1214/08-STS236REJ in
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
Microarrays, Empirical Bayes and the Two-Groups Model
The classic frequentist theory of hypothesis testing developed by Neyman,
Pearson and Fisher has a claim to being the twentieth century's most
influential piece of applied mathematics. Something new is happening in the
twenty-first century: high-throughput devices, such as microarrays, routinely
require simultaneous hypothesis tests for thousands of individual cases, not at
all what the classical theory had in mind. In these situations empirical Bayes
information begins to force itself upon frequentists and Bayesians alike. The
two-groups model is a simple Bayesian construction that facilitates empirical
Bayes analysis. This article concerns the interplay of Bayesian and frequentist
ideas in the two-groups setting, with particular attention focused on Benjamini
and Hochberg's False Discovery Rate method. Topics include the choice and
meaning of the null hypothesis in large-scale testing situations, power
considerations, the limitations of permutation methods, significance testing
for groups of cases (such as pathways in microarray studies), correlation
effects, multiple confidence intervals and Bayesian competitors to the
two-groups model.
Comment: This paper is commented on in [arXiv:0808.0582], [arXiv:0808.0593],
[arXiv:0808.0597], [arXiv:0808.0599]. Rejoinder in [arXiv:0808.0603].
Published at http://dx.doi.org/10.1214/07-STS236 in Statistical Science
(http://www.imstat.org/sts/) by the Institute of Mathematical Statistics
(http://www.imstat.org).
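For reference, the Benjamini and Hochberg False Discovery Rate method discussed in the abstract is the step-up rule that rejects the hypotheses with the k smallest p-values, where k is the largest index i such that p_(i) <= q*i/m. A minimal Python sketch (variable names are illustrative):

```python
# Minimal sketch of the Benjamini-Hochberg step-up procedure at FDR level q.
import numpy as np

def benjamini_hochberg(pvalues, q=0.10):
    """Return a boolean mask of rejected hypotheses at FDR level q."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest i with p_(i) <= q*i/m
        rejected[order[: k + 1]] = True
    return rejected
```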
Are a set of microarrays independent of each other?
Having observed a data matrix whose rows are possibly correlated,
we wish to test the hypothesis that the columns are independent of each other.
Our motivation comes from microarray studies, where the rows of the matrix record
expression levels for different genes, often highly correlated, while the
columns represent individual microarrays, presumably obtained
independently. The presumption of independence underlies all the familiar
permutation, cross-validation and bootstrap methods for microarray analysis, so
it is important to know when independence fails. We develop nonparametric and
normal-theory testing methods. The row and column correlations of the matrix
interact with each other in a way that complicates test procedures, essentially
by reducing the accuracy of the relevant estimators.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS236 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
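To make the question concrete, here is a naive permutation diagnostic for column independence. It is not the paper's nonparametric or normal-theory test; in fact, as the abstract notes, row correlations make this kind of naive null distribution unreliable, which is exactly why more careful procedures are needed.

```python
# Naive permutation diagnostic (illustration only): compare the mean squared
# off-diagonal column correlation of X to its distribution when each column
# is independently shuffled.  Row correlations are ignored by this null,
# which is the complication the paper addresses.
import numpy as np

def mean_sq_col_corr(X):
    C = np.corrcoef(X, rowvar=False)                 # column-by-column correlations
    off_diag = C[~np.eye(C.shape[1], dtype=bool)]
    return np.mean(off_diag ** 2)

def column_independence_perm_test(X, n_perm=999, seed=0):
    rng = np.random.default_rng(seed)
    observed = mean_sq_col_corr(X)
    null = np.empty(n_perm)
    for b in range(n_perm):
        Xp = np.column_stack([rng.permutation(col) for col in X.T])
        null[b] = mean_sq_col_corr(Xp)
    pvalue = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, pvalue
```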
Rejoinder: The Future of Indirect Evidence
Rejoinder to "The Future of Indirect Evidence" [arXiv:1012.1161].
Comment: Published at http://dx.doi.org/10.1214/10-STS308REJ in
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
Multivariate limited translation hierarchical Bayes estimators
Based on the notion of predictive influence functions, the paper develops
multivariate limited translation hierarchical Bayes estimators of the normal
mean vector that serve as a compromise between the hierarchical Bayes and
maximum likelihood estimators. The paper demonstrates the superiority of the
limited translation estimators over the usual hierarchical Bayes estimators in
terms of frequentist risk when the true parameter to be estimated departs
widely from the grand average of all the parameters.
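The compromise described here can be illustrated with a simple componentwise sketch: shrink each coordinate toward the grand average, but never let the estimate move more than a fixed number of standard errors away from the MLE. The James-Stein-style shrinkage rule and the clamping constant d below are assumptions made for illustration, not the paper's predictive-influence-function construction.

```python
# Minimal sketch of a limited translation compromise between a hierarchical
# Bayes (shrinkage) estimator and the MLE for a normal mean vector.
import numpy as np

def limited_translation(x, sigma=1.0, d=1.0):
    """x: observed means (MLEs); sigma: known std error; d: max shift in SEs."""
    x = np.asarray(x, dtype=float)
    k = x.size
    grand = x.mean()
    # Empirical Bayes shrinkage toward the grand average (James-Stein style,
    # used here purely as an illustrative hierarchical Bayes stand-in).
    s2 = np.sum((x - grand) ** 2)
    shrink = max(0.0, 1.0 - (k - 3) * sigma**2 / s2) if s2 > 0 else 0.0
    hb = grand + shrink * (x - grand)          # shrinkage estimate
    # Limit the translation: never move more than d standard errors from the MLE,
    # protecting components that depart widely from the grand average.
    return np.clip(hb, x - d * sigma, x + d * sigma)
```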
Correcting the Minimization Bias in Searches for Small Signals
We discuss a method for correcting the bias in limits for small signals when
those limits are based on cuts chosen by minimizing a criterion such as
sensitivity. Such a bias is commonly present when the "minimization" and the
"evaluation" are done at the same time. We propose to use a variant of the
bootstrap to adjust the limits. A Monte Carlo study shows that these new
limits have correct coverage.
Comment: 14 pages, 5 figures
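The general idea, optimizing a cut and then debiasing the same-data evaluation with bootstrap re-optimization, can be sketched as follows. The toy significance criterion and Gaussian data are illustrative stand-ins, not the paper's limit-setting procedure or its bootstrap variant.

```python
# Generic sketch: when a cut is chosen by optimizing a criterion on the same
# data used to evaluate it, bootstrap re-optimization can estimate (and
# remove) the resulting optimism.
import numpy as np

def signal_significance(data, cut):
    """Toy criterion: events above `cut` treated as candidate signal."""
    n = np.sum(data > cut)
    return n / np.sqrt(max(n, 1))

def optimized_value(data, cuts):
    vals = [signal_significance(data, c) for c in cuts]
    best = int(np.argmax(vals))
    return cuts[best], vals[best]

def bootstrap_bias_corrected(data, cuts, n_boot=500, seed=0):
    rng = np.random.default_rng(seed)
    cut, apparent = optimized_value(data, cuts)
    optimism = []
    for _ in range(n_boot):
        boot = rng.choice(data, size=data.size, replace=True)
        bcut, bval = optimized_value(boot, cuts)                  # re-optimize
        optimism.append(bval - signal_significance(data, bcut))  # honest evaluation
    return cut, apparent - float(np.mean(optimism))
```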
Pathwise Least Angle Regression and a Significance Test for the Elastic Net
Least angle regression (LARS) by Efron et al. (2004) is a method for
constructing the piecewise linear path of Lasso solutions. For several years,
it also remained the de facto method for computing the Lasso solution before
more sophisticated optimization algorithms superseded it. The LARS method has
recently regained popularity due to its ability to find the values of the
penalty parameter, called knots, at which a new predictor enters the active
set of non-zero coefficients. The significance test for the Lasso by Lockhart
et al. (2014), for example, requires computing the knots via the LARS
algorithm. Elastic net (EN), on the other hand, is a highly popular extension
of Lasso that uses a linear combination of Lasso and ridge regression
penalties. In this paper, we propose a novel algorithm, called pathwise
(PW-)LARS-EN, that is able to compute the EN knots over a grid of EN tuning
parameter {\alpha} values. The PW-LARS-EN algorithm decreases the EN tuning
parameter and exploits the previously found knot values together with the
original LARS algorithm. A covariance test statistic for the Lasso is then
generalized to the EN for testing the significance of the predictors. Our
simulation studies confirm that the test statistic has an asymptotic Exp(1)
distribution.
Comment: 5 pages, 25th European Signal Processing Conference (EUSIPCO 2017)
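One standard way to obtain EN knots with an existing LARS solver is the augmented-data reformulation of Zou and Hastie (2005): for a fixed ridge penalty, the EN problem is a Lasso on X stacked with a scaled identity matrix, so the Lasso knots of the augmented problem are EN knots in the l1 penalty. The sketch below uses scikit-learn's lars_path for illustration; it is not the paper's PW-LARS-EN algorithm, which instead works over a grid of the mixing parameter {\alpha}.

```python
# Sketch of the augmented-data trick for elastic-net knots with a LARS solver.
# The knot values returned here are the kind of quantities that the covariance
# test of Lockhart et al. (2014) takes as input.
import numpy as np
from sklearn.linear_model import lars_path

def elastic_net_knots(X, y, lam2):
    """EN knots in the l1 penalty for a fixed ridge penalty lam2."""
    n, p = X.shape
    X_aug = np.vstack([X, np.sqrt(lam2) * np.eye(p)])
    y_aug = np.concatenate([y, np.zeros(p)])
    # alphas: penalty values at which predictors enter the active set (knots);
    # active: the order in which predictors enter.
    alphas, active, coefs = lars_path(X_aug, y_aug, method="lasso")
    return alphas, active

# Toy usage with simulated data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 2.0 * X[:, 0] + rng.normal(size=100)
knots, entry_order = elastic_net_knots(X, y, lam2=0.5)
print("entry order:", entry_order)
print("knots:", np.round(knots, 3))
```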