
    Minimax risks for sparse regressions: Ultra-high-dimensional phenomenons

    Consider the standard Gaussian linear regression model $Y = X\theta_0 + \epsilon$, where $Y \in \mathbb{R}^n$ is a response vector and $X \in \mathbb{R}^{n\times p}$ is a design matrix. Numerous works have been devoted to building efficient estimators of $\theta_0$ when $p$ is much larger than $n$. In such a situation, a classical approach amounts to assuming that $\theta_0$ is approximately sparse. This paper studies the minimax risks of estimation and testing over classes of $k$-sparse vectors $\theta_0$. These bounds shed light on the limitations due to high dimensionality. The results encompass the problem of prediction (estimation of $X\theta_0$), the inverse problem (estimation of $\theta_0$), and linear testing (testing $X\theta_0 = 0$). Interestingly, an elbow effect occurs when $k\log(p/k)$ becomes large compared to $n$. Indeed, the minimax risks and hypothesis separation distances blow up in this ultra-high-dimensional setting. We also prove that even dimension-reduction techniques cannot provide satisfying results in an ultra-high-dimensional setting. Moreover, we compute the minimax risks when the variance of the noise is unknown. The knowledge of this variance is shown to play a significant role in the optimal rates of estimation and testing. All these minimax bounds provide a characterization of statistical problems that are so difficult that no procedure can provide satisfying results.
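
    As a rough illustration of the elbow effect described above (a sketch based on the stated rate, not the paper's exact theorem), the minimax prediction risk over $k$-sparse vectors is known to scale, in the moderate-dimensional regime, as
    \[
    \inf_{\hat\theta}\ \sup_{\|\theta_0\|_0 \le k}\ \frac{1}{n}\,\mathbb{E}\bigl\|X(\hat\theta - \theta_0)\bigr\|_2^2 \;\asymp\; \sigma^2\,\frac{k\log(p/k)}{n},
    \]
    which stays small only as long as $k\log(p/k)$ remains small relative to $n$; once $k\log(p/k)$ is of order $n$ or larger, the risk blows up, which is the ultra-high-dimensional phenomenon the abstract refers to.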

    Model-Robust Designs for Quantile Regression

    We give methods for the construction of designs for linear models when the purpose of the investigation is the estimation of the conditional quantile function and the estimation method is quantile regression. The designs are robust against misspecified response functions and against unanticipated heteroscedasticity. The methods are illustrated by example, and in a case study in which they are applied to growth charts.
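
    To make the estimation target concrete, here is a minimal quantile-regression sketch in Python. It illustrates the estimation method the designs are built for, not the authors' design-construction procedure; the simulated heteroscedastic data and all variable names are assumptions for illustration.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, size=200)
    # Heteroscedastic noise: the error scale grows with x, the kind of
    # unanticipated heteroscedasticity the designs are meant to be robust against.
    y = 1.0 + 0.5 * x + (0.2 + 0.1 * x) * rng.standard_normal(200)

    # Fit the conditional quantile function at several quantile levels tau.
    X = sm.add_constant(x)
    for tau in (0.1, 0.5, 0.9):
        fit = sm.QuantReg(y, X).fit(q=tau)
        print(f"tau={tau}: intercept={fit.params[0]:.3f}, slope={fit.params[1]:.3f}")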

    Optimal inference in a class of regression models

    We consider the problem of constructing confidence intervals (CIs) for a linear functional of a regression function, such as its value at a point, the regression discontinuity parameter, or a regression coefficient in a linear or partly linear regression. Our main assumption is that the regression function is known to lie in a convex function class, which covers most smoothness and/or shape assumptions used in econometrics. We derive finite-sample optimal CIs and sharp efficiency bounds under normal errors with known variance. We show that these results translate to uniform (over the function class) asymptotic results when the error distribution is not known. When the function class is centrosymmetric, these efficiency bounds imply that minimax CIs are close to efficient at smooth regression functions. This implies, in particular, that it is impossible to form tighter CIs by using data-dependent tuning parameters while maintaining coverage over the whole function class. We specialize our results to inference on the regression discontinuity parameter, and illustrate them in simulations and an empirical application.
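
    For context, here is a sketch of the bias-aware form that finite-sample CIs of this type take (the notation $\hat L$, $\overline{B}$, and $\mathrm{se}$ is ours; the abstract does not spell out the construction). The interval is centered at a linear estimator $\hat L$ and widened to cover its worst-case bias over the convex class $\mathcal{F}$,
    \[
    \hat L \;\pm\; \mathrm{cv}_{1-\alpha}\!\left(\frac{\overline{B}(\hat L)}{\mathrm{se}(\hat L)}\right)\mathrm{se}(\hat L),
    \qquad
    \overline{B}(\hat L) \;=\; \sup_{f\in\mathcal{F}}\bigl|\mathbb{E}_f[\hat L] - Lf\bigr|,
    \]
    where $\mathrm{cv}_{1-\alpha}(t)$ is the $1-\alpha$ quantile of the $|N(t,1)|$ distribution. Because the critical value absorbs the worst-case bias, coverage holds uniformly over $\mathcal{F}$.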