
    High-dimensional stochastic optimization with the generalized Dantzig estimator

    We propose a generalized version of the Dantzig selector. We show that it satisfies sparsity oracle inequalities in prediction and estimation. We then consider the particular case of high-dimensional linear regression model selection with the Huber loss function. In this case, we derive the sup-norm convergence rate and the sign concentration property of the Dantzig estimators under a mutual coherence assumption on the dictionary.

    Asymptotics and Concentration Bounds for Bilinear Forms of Spectral Projectors of Sample Covariance

    Let $X, X_1, \dots, X_n$ be i.i.d. Gaussian random variables with zero mean and covariance operator $\Sigma = {\mathbb E}(X \otimes X)$ taking values in a separable Hilbert space ${\mathbb H}$. Let ${\bf r}(\Sigma) := \frac{{\rm tr}(\Sigma)}{\|\Sigma\|_{\infty}}$ be the effective rank of $\Sigma$, ${\rm tr}(\Sigma)$ being the trace of $\Sigma$ and $\|\Sigma\|_{\infty}$ being its operator norm. Let $\hat \Sigma_n := n^{-1}\sum_{j=1}^n (X_j \otimes X_j)$ be the sample (empirical) covariance operator based on $(X_1, \dots, X_n)$. The paper deals with a problem of estimation of spectral projectors of the covariance operator $\Sigma$ by their empirical counterparts, the spectral projectors of $\hat \Sigma_n$ (empirical spectral projectors). The focus is on the problems where both the sample size $n$ and the effective rank ${\bf r}(\Sigma)$ are large. This framework includes and generalizes well known high-dimensional spiked covariance models. Given a spectral projector $P_r$ corresponding to an eigenvalue $\mu_r$ of covariance operator $\Sigma$ and its empirical counterpart $\hat P_r$, we derive sharp concentration bounds for bilinear forms of empirical spectral projector $\hat P_r$ in terms of sample size $n$ and effective dimension ${\bf r}(\Sigma)$. Building upon these concentration bounds, we prove the asymptotic normality of bilinear forms of random operators $\hat P_r - {\mathbb E}\hat P_r$ under the assumptions that $n \to \infty$ and ${\bf r}(\Sigma) = o(n)$. In a special case of eigenvalues of multiplicity one, these results are rephrased as concentration bounds and asymptotic normality for linear forms of empirical eigenvectors. Other results include bounds on the bias ${\mathbb E}\hat P_r - P_r$ and a method of bias reduction as well as a discussion of possible applications to statistical inference in high-dimensional principal component analysis.
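The quantities defined in this abstract can be illustrated in finite dimension. The following is a minimal sketch, assuming a 20-dimensional spiked covariance matrix, a sample size of 2000, and the multiplicity-one case for the top eigenvalue; the function names and the chosen dimensions are illustrative assumptions, not taken from the paper. It computes the effective rank, the sample covariance, and one bilinear form of the empirical spectral projector.

```python
import numpy as np


def effective_rank(sigma):
    """r(Sigma) = tr(Sigma) / ||Sigma||_op for a symmetric PSD matrix."""
    eigvals = np.linalg.eigvalsh(sigma)  # ascending order
    return float(np.trace(sigma) / eigvals[-1])


def sample_covariance(x):
    """n^{-1} * sum_j X_j (x) X_j for zero-mean observations stored as rows."""
    return x.T @ x / x.shape[0]


def top_spectral_projector(sigma):
    """Orthogonal projector onto the top eigenvector (multiplicity one assumed)."""
    _, vecs = np.linalg.eigh(sigma)
    v = vecs[:, -1]
    return np.outer(v, v)


# Hypothetical spiked model: one eigenvalue 10, nineteen eigenvalues 1.
rng = np.random.default_rng(0)
dim, n = 20, 2000
sigma = np.diag(np.array([10.0] + [1.0] * (dim - 1)))

x = rng.multivariate_normal(np.zeros(dim), sigma, size=n)
sigma_hat = sample_covariance(x)

p_true = top_spectral_projector(sigma)
p_hat = top_spectral_projector(sigma_hat)

# Bilinear form <P_hat u, v> for u = v = first coordinate vector.
u = np.eye(dim)[0]
v = np.eye(dim)[0]
bilinear = float(u @ p_hat @ v)

print("effective rank:", effective_rank(sigma))
print("<P_hat u, v>:", bilinear)
print("operator-norm error:", np.linalg.norm(p_hat - p_true, 2))
```

Here the effective rank is small relative to the sample size, which is the regime ${\bf r}(\Sigma) = o(n)$ mentioned above; the sketch does not reproduce the concentration bounds or the bias reduction method.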

    Pac-bayesian bounds for sparse regression estimation with exponential weights

    We consider the sparse regression model where the number of parameters $p$ is larger than the sample size $n$. The difficulty when considering high-dimensional problems is to propose estimators achieving a good compromise between statistical and computational performances. The BIC estimator for instance performs well from the statistical point of view \cite{BTW07} but can only be computed for values of $p$ of at most a few tens. The Lasso estimator is the solution of a convex minimization problem, hence computable for large values of $p$. However, stringent conditions on the design are required to establish fast rates of convergence for this estimator. Dalalyan and Tsybakov \cite{arnak} propose a method achieving a good compromise between the statistical and computational aspects of the problem. Their estimator can be computed for reasonably large $p$ and satisfies nice statistical properties under weak assumptions on the design. However, \cite{arnak} proposes sparsity oracle inequalities in expectation for the empirical excess risk only. In this paper, we propose an aggregation procedure similar to that of \cite{arnak} but with improved statistical performances. Our main theoretical result is a sparsity oracle inequality in probability for the true excess risk for a version of exponential weight estimator. We also propose an MCMC method to compute our estimator for reasonably large values of $p$. Comment: 19 pages
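The general idea of exponential weight aggregation can be sketched as follows. This is a generic illustration, not the estimator of the paper: it assumes a small finite family of candidate supports, least-squares fits on each support, a hypothetical prior that decays with the support size, and a hand-picked temperature, and it enumerates all candidates instead of using MCMC. All names and numerical values are assumptions made for the example.

```python
from itertools import combinations

import numpy as np


def exponential_weights(risks, prior, temperature):
    """Gibbs weights w_k proportional to prior_k * exp(-risks_k / temperature)."""
    risks = np.asarray(risks, dtype=float)
    prior = np.asarray(prior, dtype=float)
    # Work on the log scale and shift by the maximum for numerical stability.
    logits = np.log(prior) - risks / temperature
    w = np.exp(logits - logits.max())
    return w / w.sum()


def aggregate(predictions, weights):
    """Convex combination of candidate predictions (one candidate per row)."""
    return np.asarray(weights) @ np.asarray(predictions)


# Hypothetical sparse regression problem with p > n.
rng = np.random.default_rng(0)
n, p = 30, 60
x = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:2] = [3.0, -2.0]
y = x @ beta + 0.5 * rng.standard_normal(n)

# Candidate models: supports of size 1 or 2 among the first 8 coordinates.
supports = [s for k in (1, 2) for s in combinations(range(8), k)]

predictions, risks = [], []
for s in supports:
    cols = x[:, list(s)]
    coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
    fit = cols @ coef
    predictions.append(fit)
    risks.append(float(np.sum((y - fit) ** 2)))  # residual sum of squares

# Assumed prior favouring small supports; assumed temperature of 1.0.
prior = np.array([np.exp(-len(s)) for s in supports])
prior = prior / prior.sum()

weights = exponential_weights(risks, prior, temperature=1.0)
y_agg = aggregate(predictions, weights)

best = supports[int(np.argmax(weights))]
print("support with the largest weight:", best)
print("aggregated residual sum of squares:", float(np.sum((y - y_agg) ** 2)))
```

Enumerating supports is only feasible for tiny families; for large $p$ the weights would have to be approximated by sampling, which is the role the abstract assigns to its MCMC method.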