High-dimensional stochastic optimization with the generalized Dantzig estimator
We propose a generalized version of the Dantzig selector. We show that it
satisfies sparsity oracle inequalities in prediction and estimation. We then
consider the particular case of high-dimensional linear regression model
selection with the Huber loss function. In this case we derive the sup-norm
convergence rate and the sign concentration property of the Dantzig estimators
under a mutual coherence assumption on the dictionary.
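For context, the classical Dantzig selector of Candès and Tao, of which the abstract's estimator is a generalization, can be written as follows; the notation ($y$ for the response vector, $X$ for the design matrix, $\lambda$ for the tuning parameter) is assumed here, not taken from the abstract:

```latex
% Classical Dantzig selector (assumed standard form, not the paper's generalization):
% minimize the l1 norm subject to a sup-norm constraint on the correlation
% between the residuals and the dictionary columns.
\hat{\beta}^{\mathrm{DS}} \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p}
  \|\beta\|_1
  \quad \text{subject to} \quad
  \|X^\top (y - X\beta)\|_\infty \le \lambda .
```

The generalized version replaces the squared-loss gradient $X^\top(y - X\beta)$ with the gradient of a general loss, e.g. the Huber loss mentioned in the abstract.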
Asymptotics and Concentration Bounds for Bilinear Forms of Spectral Projectors of Sample Covariance
Let $X, X_1, \dots, X_n$ be i.i.d. Gaussian random variables with zero mean and
covariance operator $\Sigma = \mathbb{E}(X \otimes X)$ taking values in a
separable Hilbert space $\mathbb{H}$. Let $\mathbf{r}(\Sigma) := \mathrm{tr}(\Sigma)/\|\Sigma\|$ be the effective rank of $\Sigma$, $\mathrm{tr}(\Sigma)$ being the trace of $\Sigma$ and $\|\Sigma\|$ being its
operator norm. Let $\hat{\Sigma}_n := n^{-1}\sum_{j=1}^{n}(X_j \otimes X_j)$ be
the sample (empirical) covariance operator based on $(X_1, \dots, X_n)$. The
paper deals with a problem of estimation of spectral projectors of the
covariance operator $\Sigma$ by their empirical counterparts, the spectral
projectors of $\hat{\Sigma}_n$ (empirical spectral projectors). The focus is on
the problems where both the sample size $n$ and the effective rank $\mathbf{r}(\Sigma)$ are large. This framework includes and generalizes well-known
high-dimensional spiked covariance models. Given a spectral projector $P_r$
corresponding to an eigenvalue $\mu_r$ of covariance operator $\Sigma$ and its
empirical counterpart $\hat{P}_r$, we derive sharp concentration bounds for
bilinear forms of the empirical spectral projector $\hat{P}_r$ in terms of sample
size $n$ and effective dimension $\mathbf{r}(\Sigma)$. Building upon these
concentration bounds, we prove the asymptotic normality of bilinear forms of
random operators $\hat{P}_r - \mathbb{E}\hat{P}_r$ under the assumptions that $n \to \infty$
and $\mathbf{r}(\Sigma) = o(n)$. In a special case of eigenvalues of
multiplicity one, these results are rephrased as concentration bounds and
asymptotic normality for linear forms of empirical eigenvectors. Other results
include bounds on the bias $\mathbb{E}\hat{P}_r - P_r$ and a method of bias
reduction as well as a discussion of possible applications to statistical
inference in high-dimensional principal component analysis.
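As a minimal numerical illustration of the quantities in this abstract (effective rank, sample covariance, empirical spectral projectors), the following sketch uses a toy spiked covariance model; the dimensions, eigenvalues, and variable names are illustrative assumptions, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy spiked covariance in dimension d: one large eigenvalue, the rest small.
d, n = 50, 2000
mu = np.full(d, 0.1)
mu[0] = 5.0                      # spiked top eigenvalue
Sigma = np.diag(mu)

# Effective rank r(Sigma) = tr(Sigma) / ||Sigma||, with ||.|| the operator norm.
eff_rank = np.trace(Sigma) / np.linalg.norm(Sigma, 2)

# Sample covariance: hat Sigma_n = n^{-1} sum_j X_j X_j^T, zero-mean Gaussian data.
X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
Sigma_hat = X.T @ X / n

# True and empirical spectral projectors onto the top eigenspace.
P1 = np.zeros((d, d))
P1[0, 0] = 1.0
w, V = np.linalg.eigh(Sigma_hat)
u = V[:, -1]                     # top empirical eigenvector (eigh: ascending order)
P1_hat = np.outer(u, u)          # sign-invariant projector u u^T

print(round(eff_rank, 3))        # tr = 5 + 49*0.1 = 9.9, norm = 5, so 1.98
print(np.linalg.norm(P1_hat - P1, 2))
```

With a large spectral gap and $n \gg \mathbf{r}(\Sigma)$, the empirical projector error printed on the last line is small, which is the regime the concentration bounds above quantify.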
PAC-Bayesian bounds for sparse regression estimation with exponential weights
We consider the sparse regression model where the number of parameters $p$ is
larger than the sample size $n$. The difficulty when considering
high-dimensional problems is to propose estimators achieving a good compromise
between statistical and computational performance. The BIC estimator, for
instance, performs well from the statistical point of view \cite{BTW07} but can
only be computed for values of $p$ of at most a few tens. The Lasso estimator
is the solution of a convex minimization problem, hence computable for large
values of $p$. However, stringent conditions on the design are required to
establish fast rates of convergence for this estimator. Dalalyan and Tsybakov
\cite{arnak} propose a method achieving a good compromise between the
statistical and computational aspects of the problem. Their estimator can be
computed for reasonably large $p$ and satisfies nice statistical properties
under weak assumptions on the design. However, \cite{arnak} proposes sparsity
oracle inequalities in expectation for the empirical excess risk only. In this
paper, we propose an aggregation procedure similar to that of \cite{arnak} but
with improved statistical performance. Our main theoretical result is a
sparsity oracle inequality in probability for the true excess risk for a
version of the exponential weight estimator. We also propose an MCMC method to
compute our estimator for reasonably large values of $p$.
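To illustrate the exponential weights idea in its simplest form, the sketch below aggregates a finite dictionary of predictors with weights proportional to $\exp(-\beta \cdot \text{empirical risk})$; this discrete toy version (all names and parameter values are assumptions) omits the continuous prior and the MCMC computation used in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression data: y = 2x + noise; the true predictor lies in the dictionary.
n = 200
x = rng.uniform(-1, 1, n)
y = 2.0 * x + rng.normal(0, 0.3, n)

# Finite dictionary of candidate predictors f_j(x) = theta_j * x (assumed setup).
thetas = np.linspace(-3, 3, 61)
preds = np.outer(thetas, x)                  # shape (61, n)

# Exponential weights: w_j proportional to exp(-beta * empirical risk of f_j).
risks = np.mean((preds - y) ** 2, axis=1)
beta = 50.0                                  # inverse temperature (tuning parameter)
w = np.exp(-beta * (risks - risks.min()))    # shift by the min for numerical stability
w /= w.sum()

# Aggregated estimator: weighted average of the dictionary predictors.
theta_agg = float(w @ thetas)
print(theta_agg)                             # concentrates near the true slope 2.0
```

In high dimensions the dictionary is too large to enumerate, which is why the paper computes the analogous weighted average by MCMC rather than by the direct sum above.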