    Pivotal estimation via square-root Lasso in nonparametric regression

    We propose a self-tuning $\sqrt{\mathrm{Lasso}}$ method that simultaneously resolves three important practical problems in high-dimensional regression analysis: it handles the unknown scale, heteroscedasticity and (drastic) non-Gaussianity of the noise. In addition, our analysis allows for badly behaved designs, for example, perfectly collinear regressors, and generates sharp bounds even in extreme cases, such as the infinite-variance case and the noiseless case, in contrast to Lasso. We establish various nonasymptotic bounds for $\sqrt{\mathrm{Lasso}}$, including the prediction-norm rate and sparsity. Our analysis is based on new impact factors that are tailored for bounding the prediction norm. In order to cover heteroscedastic non-Gaussian noise, we rely on moderate deviation theory for self-normalized sums to achieve Gaussian-like results under weak conditions. Moreover, we derive bounds on the performance of ordinary least squares (OLS) applied to the model selected by $\sqrt{\mathrm{Lasso}}$, accounting for possible misspecification of the selected model. Under mild conditions, the rate of convergence of OLS post $\sqrt{\mathrm{Lasso}}$ is as good as $\sqrt{\mathrm{Lasso}}$'s rate. As an application, we consider the use of $\sqrt{\mathrm{Lasso}}$ and OLS post $\sqrt{\mathrm{Lasso}}$ as estimators of nuisance parameters in a generic semiparametric problem (nonlinear moment condition or $Z$-problem), resulting in a construction of $\sqrt{n}$-consistent and asymptotically normal estimators of the main parameters. Published at http://dx.doi.org/10.1214/14-AOS1204 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
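    As a rough illustration of the estimator's form, the sketch below solves the $\sqrt{\mathrm{Lasso}}$ program $\min_\beta \|y - X\beta\|_2/\sqrt{n} + \lambda\|\beta\|_1$ with cvxpy on simulated data. The data and the penalty level are illustrative; the constant used here is a simplified stand-in for the paper's pivotal choice of $\lambda$.

        import numpy as np
        import cvxpy as cp

        rng = np.random.default_rng(0)
        n, p, s = 100, 200, 5
        X = rng.standard_normal((n, p))
        beta_true = np.zeros(p)
        beta_true[:s] = 1.0
        y = X @ beta_true + rng.standard_normal(n)

        # Taking the square root of the average squared residual makes the
        # penalty level free of the unknown noise scale -- the self-tuning property.
        lam = 1.1 * np.sqrt(2 * np.log(p) / n)  # illustrative pivotal-style level
        beta = cp.Variable(p)
        objective = cp.Minimize(cp.norm(y - X @ beta, 2) / np.sqrt(n)
                                + lam * cp.norm1(beta))
        cp.Problem(objective).solve()
        beta_hat = beta.value

    Note the contrast with ordinary Lasso, whose theoretically optimal penalty level scales with the unknown noise standard deviation.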

    A Constrained L1 Minimization Approach to Sparse Precision Matrix Estimation

    A constrained $\ell_1$ minimization method is proposed for estimating a sparse inverse covariance matrix based on a sample of $n$ iid $p$-variate random variables. The resulting estimator is shown to enjoy a number of desirable properties. In particular, it is shown that the rate of convergence between the estimator and the true $s$-sparse precision matrix under the spectral norm is $s\sqrt{\log p/n}$ when the population distribution has either exponential-type tails or polynomial-type tails. Convergence rates under the elementwise $L_\infty$ norm and the Frobenius norm are also presented. In addition, graphical model selection is considered. The procedure is easily implementable by linear programming. Numerical performance of the estimator is investigated using both simulated and real data; in particular, the procedure is applied to analyze a breast cancer dataset and performs favorably in comparison to existing methods. To appear in the Journal of the American Statistical Association.
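    For concreteness, a minimal sketch of a constrained program in the spirit of this estimator, assuming cvxpy as the solver interface: minimize the elementwise $\ell_1$ norm of $\Omega$ subject to $\|S\Omega - I\|_\infty \le \lambda$, then symmetrize by keeping the smaller-magnitude entry. The penalty level is a rate-motivated placeholder, not the paper's calibrated choice; as the abstract notes, the problem can equivalently be solved column by column via linear programming.

        import numpy as np
        import cvxpy as cp

        rng = np.random.default_rng(0)
        n, p = 200, 20
        Z = rng.standard_normal((n, p))
        S = np.cov(Z, rowvar=False)           # sample covariance matrix

        lam = np.sqrt(np.log(p) / n)          # placeholder penalty level
        Omega = cp.Variable((p, p))
        # Elementwise sup-norm constraint on the inversion error S @ Omega - I.
        constraints = [cp.max(cp.abs(S @ Omega - np.eye(p))) <= lam]
        cp.Problem(cp.Minimize(cp.sum(cp.abs(Omega))), constraints).solve()

        # Symmetrize: between Omega_ij and Omega_ji, keep the entry of
        # smaller magnitude.
        O = Omega.value
        Omega_hat = np.where(np.abs(O) <= np.abs(O.T), O, O.T)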

    Inference for High-Dimensional Sparse Econometric Models

    This article is about estimation and inference methods for high-dimensional sparse (HDS) regression models in econometrics. High-dimensional sparse models arise in situations where many regressors (or series terms) are available and the regression function is well approximated by a parsimonious, yet unknown, set of regressors. The latter condition makes it possible to estimate the entire regression function effectively by searching for approximately the right set of regressors. We discuss methods for identifying this set of regressors and estimating their coefficients based on $\ell_1$-penalization and describe key theoretical results. In order to capture realistic practical situations, we expressly allow for imperfect selection of regressors and study the impact of this imperfect selection on estimation and inference results. We focus the main part of the article on the use of HDS models and methods in the instrumental variables model and the partially linear model. We present a set of novel inference results for these models and illustrate their use with applications to returns to schooling and growth regression.
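    One recurring ingredient in this literature is post-Lasso: select regressors by $\ell_1$-penalized regression, then refit the selected set by ordinary least squares. A minimal sketch, using scikit-learn's cross-validated Lasso as a stand-in for the theoretically tuned penalty levels analyzed in the article:

        import numpy as np
        from sklearn.linear_model import LassoCV

        rng = np.random.default_rng(0)
        n, p, s = 100, 200, 5
        X = rng.standard_normal((n, p))
        beta_true = np.zeros(p)
        beta_true[:s] = 1.0
        y = X @ beta_true + rng.standard_normal(n)

        # Step 1: Lasso selects a candidate set of regressors (possibly imperfectly).
        selected = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)

        # Step 2: OLS refit on the selected columns removes the Lasso shrinkage
        # bias; the theory explicitly tolerates imperfect selection, with omitted
        # small coefficients entering as approximation error.
        beta_post, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)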

    Group Lasso for high dimensional sparse quantile regression models

    This paper studies the statistical properties of the group Lasso estimator for high-dimensional sparse quantile regression models, where the number of explanatory variables (or the number of groups of explanatory variables) is possibly much larger than the sample size while the number of variables in "active" groups is sufficiently small. We establish a non-asymptotic bound on the $\ell_2$-estimation error of the estimator. This bound explains situations under which the group Lasso estimator is potentially superior or inferior to the $\ell_1$-penalized quantile regression estimator in terms of the estimation error. We also propose a data-dependent choice of the tuning parameter to make the method more practical, by extending the original proposal of Belloni and Chernozhukov (2011) for the $\ell_1$-penalized quantile regression estimator. As an application, we analyze high-dimensional additive quantile regression models. We show that under a set of suitable regularity conditions, the group Lasso estimator can attain a convergence rate arbitrarily close to the oracle rate. Finally, we conduct simulation experiments to examine our theoretical results.
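    A minimal sketch of the estimator itself, assuming cvxpy: the check (pinball) loss at quantile level $\tau$ plus a sum of groupwise $\ell_2$ norms. The group structure and the penalty level below are illustrative, not the data-dependent choice proposed in the paper.

        import numpy as np
        import cvxpy as cp

        rng = np.random.default_rng(0)
        n, n_groups, g_size = 200, 40, 5
        p = n_groups * g_size
        X = rng.standard_normal((n, p))
        beta_true = np.zeros(p)
        beta_true[:g_size] = 1.0              # a single active group
        y = X @ beta_true + rng.standard_normal(n)

        tau, lam = 0.5, 0.1                   # illustrative tuning values
        beta = cp.Variable(p)
        r = y - X @ beta
        # Check loss: rho_tau(u) = max(tau * u, (tau - 1) * u).
        check_loss = cp.sum(cp.maximum(tau * r, (tau - 1) * r)) / n
        # Group penalty: unsquared L2 norm per group induces groupwise sparsity.
        group_pen = sum(cp.norm(beta[g * g_size:(g + 1) * g_size], 2)
                        for g in range(n_groups))
        cp.Problem(cp.Minimize(check_loss + lam * group_pen)).solve()
        beta_hat = beta.value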

    L1-Penalized quantile regression in high-dimensional sparse models

    We consider median regression and, more generally, quantile regression in high-dimensional sparse models. In these models the overall number of regressors $p$ is very large, possibly larger than the sample size $n$, but only $s$ of these regressors have non-zero impact on the conditional quantile of the response variable, where $s$ grows more slowly than $n$. Since in this case ordinary quantile regression is not consistent, we consider quantile regression penalized by the $\ell_1$-norm of the coefficients (L1-QR). First, we show that L1-QR is consistent at the rate $\sqrt{(s/n)\log p}$, which is close to the oracle rate $\sqrt{s/n}$ achievable when the minimal true model is known. The overall number of regressors $p$ affects the rate only through the $\log p$ factor, thus allowing nearly exponential growth in the number of zero-impact regressors. The rate result holds under relatively weak conditions, requiring that $s/n$ converges to zero at a super-logarithmic speed and that the regularization parameter satisfies certain theoretical constraints. Second, we propose a pivotal, data-driven choice of the regularization parameter and show that it satisfies these theoretical constraints. Third, we show that L1-QR correctly selects the true minimal model as a valid submodel when the non-zero coefficients of the true model are well separated from zero. We also show that the number of non-zero coefficients in L1-QR is of the same stochastic order as $s$, the number of non-zero coefficients in the minimal true model. Fourth, we analyze the rate of convergence of a two-step estimator that applies ordinary quantile regression to the selected model. Fifth, we evaluate the performance of L1-QR in a Monte Carlo experiment and provide an application to the analysis of international economic growth.
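    The pivotal tuning rule exploits the fact that, at the true coefficient vector, the score of the check loss depends on the data only through the design and iid Uniform(0,1) variables, so its distribution conditional on the design is known and can be simulated. A stylized sketch of that simulation, with the constants $c$ and $\alpha$ chosen for illustration rather than taken from the paper:

        import numpy as np

        rng = np.random.default_rng(0)
        n, p = 200, 500
        X = rng.standard_normal((n, p))
        X /= np.sqrt((X ** 2).mean(axis=0))   # normalize so E_n[x_ij^2] = 1

        tau, alpha, c, n_sims = 0.5, 0.1, 1.1, 1000
        stats = np.empty(n_sims)
        for b in range(n_sims):
            U = rng.uniform(size=n)
            # Score of the check loss at the truth: pivotal given X, since
            # 1{U_i <= tau} is Bernoulli(tau) regardless of the noise law.
            score = X.T @ (tau - (U <= tau)) / n
            stats[b] = n * np.max(np.abs(score)) / np.sqrt(tau * (1 - tau))
        lam = c * np.quantile(stats, 1 - alpha)  # data-driven penalty level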