
    Global testing under sparse alternatives: ANOVA, multiple comparisons and the higher criticism

    Testing for the significance of a subset of regression coefficients in a linear model, a staple of statistical analysis, goes back at least to the work of Fisher, who introduced the analysis of variance (ANOVA). We study this problem under the assumption that the coefficient vector is sparse, a common situation in modern high-dimensional settings. Suppose we have $p$ covariates and that, under the alternative, the response depends on only of the order of $p^{1-\alpha}$ of them, $0 \le \alpha \le 1$. Under moderate sparsity levels, that is, $0 \le \alpha \le 1/2$, we show that ANOVA is essentially optimal under some conditions on the design. This is no longer the case under strong sparsity constraints, that is, $\alpha > 1/2$. In such settings, a multiple comparison procedure is often preferred, and we establish its optimality when $\alpha \ge 3/4$. However, these two very popular methods are suboptimal, and sometimes powerless, under moderately strong sparsity, where $1/2 < \alpha < 3/4$. We suggest a method based on the higher criticism that is powerful in the whole range $\alpha > 1/2$. This optimality property holds for a variety of designs, including the classical (balanced) multi-way designs and more modern "$p > n$" designs arising in genetics and signal processing. In addition to the standard fixed effects model, we establish similar results for a random effects model where the nonzero coefficients of the regression vector are normally distributed.
    Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/11-AOS910 by the Institute of Mathematical Statistics (http://www.imstat.org)
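
    The paper's full procedure applies the higher criticism to regression z-scores under specific designs; the statistic itself is easy to sketch. Below is a minimal, illustrative implementation; the two-sided p-values and the restriction to $p_{(i)} > 1/n$ are common conventions, not details taken from the paper.

    ```python
    import numpy as np
    from scipy.stats import norm

    def higher_criticism(z):
        """HC statistic for a vector of z-scores that are N(0,1) under H0."""
        pvals = np.sort(2 * norm.sf(np.abs(z)))   # two-sided p-values, ascending
        n = len(pvals)
        i = np.arange(1, n + 1)
        hc = np.sqrt(n) * (i / n - pvals) / np.sqrt(pvals * (1 - pvals))
        return hc[pvals > 1 / n].max()            # usual restriction p_(i) > 1/n

    # Tiny usage example: a sparse signal buried in noise.
    rng = np.random.default_rng(0)
    z = rng.standard_normal(10_000)
    z[:30] += 3.0                  # a few strong nonzero effects (sparse alternative)
    print(higher_criticism(z))     # large value -> evidence against the global null
    ```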

    A Primal-Dual Proximal Algorithm for Sparse Template-Based Adaptive Filtering: Application to Seismic Multiple Removal

    Unveiling meaningful geophysical information from seismic data requires dealing with both random and structured "noises". Since their amplitude may exceed that of the signals of interest (primaries), additional prior information is especially important for efficient signal separation. We address here the problem of multiple reflections, caused by wave-field bouncing between layers. Since only approximate models of these phenomena are available, we propose a flexible framework for time-varying adaptive filtering of seismic signals, using sparse representations, based on inaccurate templates. We recast the joint estimation of adaptive filters and primaries in a new convex variational formulation. This approach allows us to incorporate plausible knowledge about noise statistics, data sparsity and slow filter variation in parsimony-promoting wavelet frames. The designed primal-dual algorithm solves a constrained minimization problem that alleviates standard regularization issues in finding hyperparameters. The approach demonstrates strong performance in low signal-to-noise ratio conditions, both for simulated and real field seismic data.
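
    The paper's exact variational formulation (coupling adaptive filters, primaries, and wavelet-frame constraints) is not reproduced here. As a generic illustration of the primal-dual proximal machinery this class of methods relies on, here is a minimal Chambolle-Pock iteration for $\ell_1$-penalized least squares; A, b, lam, and the step sizes are illustrative choices, not the paper's algorithm.

    ```python
    import numpy as np

    def chambolle_pock_l1(A, b, lam, n_iter=500):
        """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 by a primal-dual iteration."""
        L = np.linalg.norm(A, 2)              # spectral norm of A
        tau = sigma = 0.99 / L                # ensures tau * sigma * L**2 < 1
        x = np.zeros(A.shape[1])
        x_bar = x.copy()
        y = np.zeros(A.shape[0])
        for _ in range(n_iter):
            # Dual prox of the conjugate of 0.5*||. - b||^2 (closed form):
            y = (y + sigma * (A @ x_bar - b)) / (1 + sigma)
            # Primal prox of lam*||.||_1: soft-thresholding.
            x_new = x - tau * (A.T @ y)
            x_new = np.sign(x_new) * np.maximum(np.abs(x_new) - tau * lam, 0.0)
            x_bar = 2 * x_new - x
            x = x_new
        return x
    ```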

    High-dimensional change-point detection with sparse alternatives

    We consider the problem of detecting a change in mean in a sequence of Gaussian vectors. Under the alternative hypothesis, the change occurs only in some subset of the components of the vector. We propose a test for the presence of a change-point that is adaptive to the number of changing components. Under the assumption that the vector dimension tends to infinity and the length of the sequence grows more slowly than the dimension of the signal, we obtain the detection boundary for this problem and prove the rate-optimality of the proposed test.
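
    A minimal sketch of the kind of scan such a test builds on: standardize the CUSUM at each candidate change-point and combine components in a way that favors sparse changes. The hard-thresholding combination rule and the threshold value below are illustrative choices, not the paper's exact statistic, and unit component variances are assumed.

    ```python
    import numpy as np

    def cusum_scan(X, thresh=3.0):
        """Max over candidate change-points of a sparse-friendly CUSUM statistic.

        X: (n, p) array of observations with (assumed) unit component variances.
        """
        n, p = X.shape
        csum = np.cumsum(X, axis=0)
        total = csum[-1]
        best = -np.inf
        for t in range(1, n):             # candidate change after time t
            # Standardized CUSUM: each component is N(0,1) under H0.
            z = np.sqrt(t * (n - t) / n) * (csum[t - 1] / t - (total - csum[t - 1]) / (n - t))
            # Keep only components exceeding the threshold, so a sparse change
            # is not drowned out by noise in the remaining coordinates.
            stat = np.sum(z**2 * (np.abs(z) > thresh))
            best = max(best, stat)
        return best
    ```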

    Local Regularization Assisted Orthogonal Least Squares Regression

    A locally regularized orthogonal least squares (LROLS) algorithm is proposed for constructing parsimonious or sparse regression models that generalize well. By associating each orthogonal weight in the regression model with an individual regularization parameter, the ability of orthogonal least squares (OLS) model selection to produce a very sparse model with good generalization performance is greatly enhanced. Furthermore, with the assistance of local regularization, the criterion for terminating the model selection procedure becomes much clearer. This LROLS algorithm has computational advantages over the recently introduced relevance vector machine (RVM) method.
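
    A minimal sketch of the forward-selection core, under simplifying assumptions: a single fixed regularization parameter lam is shared by all terms and the number of selected terms is fixed in advance. In the actual LROLS algorithm each orthogonal weight carries its own parameter, updated iteratively by Bayesian evidence maximization, and that update also supplies the termination criterion.

    ```python
    import numpy as np

    def lrols_select(Phi, y, lam=1e-3, n_terms=5):
        """Greedily select n_terms columns of Phi via regularized OLS; return indices."""
        selected, W = [], []
        for _ in range(n_terms):
            best_i, best_rerr, best_w = -1, -np.inf, None
            for i in range(Phi.shape[1]):
                if i in selected:
                    continue
                w = Phi[:, i].astype(float).copy()
                for wj in W:                          # Gram-Schmidt orthogonalization
                    w -= (wj @ w) / (wj @ wj) * wj
                d = w @ w
                if d < 1e-12:                         # numerically dependent column
                    continue
                g = (w @ y) / (d + lam)               # locally regularized weight
                rerr = g * g * (d + lam) / (y @ y)    # regularized error reduction
                if rerr > best_rerr:
                    best_i, best_rerr, best_w = i, rerr, w
            if best_i < 0:                            # no usable column left
                break
            selected.append(best_i)
            W.append(best_w)
        return selected
    ```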