
    Covariance regularization by thresholding

    This paper considers regularizing a covariance matrix of $p$ variables estimated from $n$ observations, by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Gaussian, and $(\log p)/n \to 0$, and obtain explicit rates. The results are uniform over families of covariance matrices which satisfy a fairly natural notion of sparsity. We discuss an intuitive resampling scheme for threshold selection and prove a general cross-validation result that justifies this approach. We also compare thresholding to other covariance estimators in simulations and on an example from climate data.
    Comment: Published at http://dx.doi.org/10.1214/08-AOS600 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
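As a concrete companion to this abstract, the NumPy sketch below implements hard thresholding of a sample covariance matrix, together with a simple random-split scheme in the spirit of the resampling idea the abstract mentions. The grid, split fraction, and function names are illustrative assumptions, not the paper's exact constants or procedure.

```python
import numpy as np

def threshold_covariance(X, t):
    """Hard-threshold the sample covariance of X (n observations by p variables):
    off-diagonal entries with absolute value below t are set to zero; the
    diagonal is always kept."""
    S = np.cov(X, rowvar=False)        # p x p sample covariance
    mask = np.abs(S) >= t              # keep entries at least t in magnitude
    np.fill_diagonal(mask, True)       # never threshold the variances
    return S * mask

def select_threshold(X, grid, n_splits=20, frac=0.5, seed=0):
    """Sketch of a resampling rule for picking t (an assumption, not the
    paper's exact scheme): threshold the covariance of one random half of
    the rows, compare it in Frobenius norm to the raw covariance of the
    other half, and average the loss over splits."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n1 = int(frac * n)
    scores = np.zeros(len(grid))
    for _ in range(n_splits):
        idx = rng.permutation(n)
        X1, X2 = X[idx[:n1]], X[idx[n1:]]
        S2 = np.cov(X2, rowvar=False)
        for j, t in enumerate(grid):
            scores[j] += np.linalg.norm(threshold_covariance(X1, t) - S2, "fro")
    return grid[int(np.argmin(scores))]

# Usage: thresholds on the scale sqrt(log(p)/n) suggested by the rates.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 50))
grid = np.sqrt(np.log(50) / 200) * np.linspace(0.0, 3.0, 13)
Sigma_hat = threshold_covariance(X, select_threshold(X, grid))
```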

    Minimax bounds for sparse PCA with noisy high-dimensional data

    We study the problem of estimating the leading eigenvectors of a high-dimensional population covariance matrix based on independent Gaussian observations. We establish a lower bound on the minimax risk of estimators under the $\ell_2$ loss, in the joint limit as dimension and sample size increase to infinity, under various models of sparsity for the population eigenvectors. The lower bound on the risk points to the existence of different regimes of sparsity of the eigenvectors. We also propose a new method for estimating the eigenvectors by a two-stage coordinate selection scheme.
    Comment: 1 figure
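The abstract names a two-stage coordinate selection scheme without spelling it out; the sketch below is in the spirit of such schemes, using a diagonal-thresholding selection rule in the first stage (an assumption on my part, not the paper's exact procedure) followed by a restricted eigendecomposition.

```python
import numpy as np

def two_stage_leading_eigvec(X, alpha=2.0):
    """Sketch of a two-stage coordinate-selection estimator of the leading
    eigenvector. Stage 1 (an assumed diagonal-thresholding rule): keep the
    coordinates whose sample variance exceeds 1 + alpha*sqrt(log(p)/n),
    which presumes unit noise variance. Stage 2: leading eigenvector of the
    sample covariance restricted to the kept coordinates, embedded in R^p."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)
    cutoff = 1.0 + alpha * np.sqrt(np.log(p) / n)
    kept = np.flatnonzero(np.diag(S) > cutoff)
    if kept.size == 0:                    # fall back to the single largest variance
        kept = np.array([np.argmax(np.diag(S))])
    _, vecs = np.linalg.eigh(S[np.ix_(kept, kept)])
    v = np.zeros(p)
    v[kept] = vecs[:, -1]                 # eigh sorts eigenvalues in ascending order
    return v
```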

    Sparse PCA: Optimal rates and adaptive estimation

    Principal component analysis (PCA) is one of the most commonly used statistical procedures, with a wide range of applications. This paper considers both minimax and adaptive estimation of the principal subspace in the high-dimensional setting. Under mild technical conditions, we first establish the optimal rates of convergence for estimating the principal subspace, which are sharp with respect to all the parameters, thus providing a complete characterization of the difficulty of the estimation problem in terms of the convergence rate. The lower bound is obtained by calculating the local metric entropy and an application of Fano's lemma. The rate-optimal estimator is constructed using aggregation, which, however, might not be computationally feasible. We then introduce an adaptive procedure for estimating the principal subspace which is fully data-driven and can be computed efficiently. It is shown that the estimator attains the optimal rates of convergence simultaneously over a large collection of parameter spaces. A key idea in our construction is a reduction scheme which reduces the sparse PCA problem to a high-dimensional multivariate regression problem. This method is potentially also useful for other related problems.
    Comment: Published at http://dx.doi.org/10.1214/13-AOS1178 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
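The abstract's estimators (aggregation, reduction to regression) are not described in enough detail here to implement; as a small companion, here is the standard loss under which minimax rates for principal subspaces are typically stated, measuring the distance between column spans through their projection matrices. A minimal sketch, assuming orthonormal $p \times r$ bases:

```python
import numpy as np

def subspace_loss(U_hat, U):
    """Squared Frobenius distance between the projections onto the column
    spans of U_hat and U (both p x r with orthonormal columns) -- the usual
    loss for principal-subspace estimation."""
    P_hat = U_hat @ U_hat.T
    P = U @ U.T
    return np.linalg.norm(P_hat - P, "fro") ** 2

# Example: compare the span of a perturbed basis against the true one.
rng = np.random.default_rng(0)
U = np.linalg.qr(rng.standard_normal((30, 2)))[0]     # true orthonormal basis
U_hat = np.linalg.qr(U + 0.1 * rng.standard_normal((30, 2)))[0]
print(subspace_loss(U_hat, U))
```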