Global testing under sparse alternatives: ANOVA, multiple comparisons and the higher criticism
Testing for the significance of a subset of regression coefficients in a
linear model, a staple of statistical analysis, goes back at least to the work
of Fisher who introduced the analysis of variance (ANOVA). We study this
problem under the assumption that the coefficient vector is sparse, a common
situation in modern high-dimensional settings. Suppose we have p covariates
and that under the alternative, the response only depends upon the order of
p^(1-α) of those, 0 ≤ α ≤ 1. Under moderate sparsity levels, that
is, 0 ≤ α ≤ 1/2, we show that ANOVA is essentially optimal under some
conditions on the design. This is no longer the case under strong sparsity
constraints, that is, α > 1/2. In such settings, a multiple comparison
procedure is often preferred and we establish its optimality when
α ≥ 3/4. However, these two very popular methods are suboptimal, and
sometimes powerless, under moderately strong sparsity where 1/2 < α < 3/4.
We suggest a method based on the higher criticism that is powerful in the whole
range α > 1/2. This optimality property is true for a variety of designs,
including the classical (balanced) multi-way designs and more modern "p > n"
designs arising in genetics and signal processing. In addition to the standard
fixed effects model, we establish similar results for a random effects model
where the nonzero coefficients of the regression vector are normally
distributed.

Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by
the Institute of Mathematical Statistics (http://www.imstat.org);
DOI: http://dx.doi.org/10.1214/11-AOS910
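As a concrete illustration, the higher criticism statistic of Donoho and Jin compares the sorted p-values of the individual coefficient tests against what would be expected under the uniform (null) distribution. A minimal pure-Python sketch of the classic HC* form (the function name is illustrative, and common variants additionally restrict the maximum to the smaller half of the p-values):

```python
import math

def higher_criticism(pvals):
    """Higher criticism statistic:
    HC = max_i sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))),
    where p_(1) <= ... <= p_(n) are the sorted p-values. Large HC
    indicates more small p-values than the null would produce."""
    n = len(pvals)
    hc = float("-inf")
    for i, p in enumerate(sorted(pvals), start=1):
        if p <= 0.0 or p >= 1.0:
            continue  # boundary p-values would blow up the denominator
        stat = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        hc = max(hc, stat)
    return hc
```

With a handful of very small p-values among many null ones, HC is large; for p-values spread evenly over (0, 1) it stays moderate, which is what makes it usable as a sparse-alternative test statistic.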
A Primal-Dual Proximal Algorithm for Sparse Template-Based Adaptive Filtering: Application to Seismic Multiple Removal
Unveiling meaningful geophysical information from seismic data requires dealing
with both random and structured "noises". As their amplitude may be
greater than that of the signals of interest (primaries), additional prior
information is especially important for efficient signal separation. We address here
the problem of multiple reflections, caused by wave-field bouncing between
layers. Since only approximate models of these phenomena are available, we
propose a flexible framework for time-varying adaptive filtering of seismic
signals, using sparse representations, based on inaccurate templates. We recast
the joint estimation of adaptive filters and primaries in a new convex
variational formulation. This approach allows us to incorporate plausible
knowledge about noise statistics, data sparsity and slow filter variation in
parsimony-promoting wavelet frames. The designed primal-dual algorithm solves a
constrained minimization problem that alleviates standard regularization issues
in finding hyperparameters. The approach demonstrates good
performance in low signal-to-noise ratio conditions, on both simulated and
real field seismic data.
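The sparsity-promoting step inside primal-dual proximal schemes of this kind is typically the proximity operator of a weighted ℓ1 norm, i.e., elementwise soft-thresholding of the frame coefficients. A minimal sketch of that building block (illustrative only, not the authors' full algorithm):

```python
import math

def soft_threshold(coeffs, lam):
    """Proximity operator of lam * ||.||_1: shrink every coefficient
    toward zero by lam, setting small coefficients exactly to zero,
    which is what promotes sparsity in the wavelet-frame domain."""
    return [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs]
```

For example, `soft_threshold([3.0, -0.5, 1.0], 1.0)` keeps only the coefficient whose magnitude exceeds the threshold and shrinks it by that amount.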
High-dimensional change-point detection with sparse alternatives
We consider the problem of detecting a change in mean in a sequence of
Gaussian vectors. Under the alternative hypothesis, the change occurs only in
some subset of the components of the vector. We propose a test of the presence
of a change-point that is adaptive to the number of changing components. Under
the assumption that the vector dimension tends to infinity and the length of
the sequence grows slower than the dimension of the signal, we obtain the
detection boundary for this problem and prove its rate-optimality.
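A simplified version of the underlying idea: for each candidate change-point, compare the mean before and after it in every component and take the largest standardized difference. This per-component CUSUM scan is a sketch of the general principle; the paper's test adaptively combines evidence across unknown numbers of changing components, which this does not attempt.

```python
import math

def changepoint_scan(x):
    """Return (statistic, location) maximizing the standardized
    mean-difference over candidate change-points t and components j:
    Z_j(t) = sqrt(t*(n-t)/n) * |mean(x[:t])_j - mean(x[t:])_j|.
    x is a list of equal-length vectors (lists of floats)."""
    n, d = len(x), len(x[0])
    best, best_t = 0.0, None
    for t in range(1, n):
        w = math.sqrt(t * (n - t) / n)
        for j in range(d):
            left = sum(row[j] for row in x[:t]) / t
            right = sum(row[j] for row in x[t:]) / (n - t)
            z = w * abs(left - right)
            if z > best:
                best, best_t = z, t
    return best, best_t
```

On a sequence whose first component jumps from 0 to 5 at time 10, the scan localizes the change at t = 10 with statistic sqrt(5) * 5 ≈ 11.18.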
Local Regularization Assisted Orthogonal Least Squares Regression
A locally regularized orthogonal least squares (LROLS) algorithm is proposed for constructing parsimonious or sparse regression models that generalize well. By associating each orthogonal weight in the regression model with an individual regularization parameter, the ability of orthogonal least squares (OLS) model selection to produce a very sparse model with good generalization performance is greatly enhanced. Furthermore, with the assistance of local regularization, the decision of when to terminate the model selection procedure becomes much clearer. This LROLS algorithm has computational advantages over the recently introduced relevance vector machine (RVM) method.
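The OLS core of such algorithms is a greedy forward selection: each candidate regressor is orthogonalized against the terms already chosen, and the one explaining the most of the remaining output energy is selected next. A bare-bones pure-Python sketch of that selection step only (the local regularization parameters and the RVM comparison are not modeled here; all names are illustrative):

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ols_forward_select(X, y, n_terms):
    """Greedy orthogonal forward selection. X is a row-major list of
    samples, y the targets. At each step, every unselected column is
    orthogonalized (Gram-Schmidt) against the directions already chosen,
    and the column with the largest squared projection of y onto its
    orthogonalized, normalized direction is added. Returns the selected
    column indices in order of selection."""
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    Q, selected = [], []
    for _ in range(min(n_terms, p)):
        best_j, best_score, best_q = None, -1.0, None
        for j in range(p):
            if j in selected:
                continue
            v = list(cols[j])
            for q in Q:  # remove components along chosen directions
                c = _dot(v, q)
                v = [vi - c * qi for vi, qi in zip(v, q)]
            norm = math.sqrt(_dot(v, v))
            if norm < 1e-12:
                continue  # column is (numerically) dependent on selection
            v = [vi / norm for vi in v]
            score = _dot(v, y) ** 2  # energy of y explained by this term
            if score > best_score:
                best_j, best_score, best_q = j, score, v
        if best_j is None:
            break
        selected.append(best_j)
        Q.append(best_q)
    return selected
```

For instance, with two candidate columns where the target is a multiple of the second, the first selected index is that second column.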