Minimax risks for sparse regressions: Ultra-high-dimensional phenomenons
Consider the standard Gaussian linear regression model $Y = X\theta + \epsilon$,
where $Y \in \mathbb{R}^n$ is a response vector and $X \in \mathbb{R}^{n \times p}$ is a design matrix.
Numerous works have been devoted to building efficient estimators of $\theta$
when $p$ is much larger than $n$. In such a situation, a classical approach
amounts to assuming that $\theta$ is approximately sparse. This paper studies
the minimax risks of estimation and testing over classes of $k$-sparse vectors
$\theta$. These bounds shed light on the limitations due to
high-dimensionality. The results encompass the problem of prediction
(estimation of $X\theta$), the inverse problem (estimation of $\theta$) and
linear testing (testing $X\theta = 0$). Interestingly, an elbow effect occurs
when the number of variables $k \log(p/k)$ becomes large compared to $n$.
Indeed, the minimax risks and hypothesis separation distances blow up in this
ultra-high dimensional setting. We also prove that even dimension reduction
techniques cannot provide satisfying results in an ultra-high dimensional
setting. Moreover, we compute the minimax risks when the variance of the noise
is unknown. The knowledge of this variance is shown to play a significant role
in the optimal rates of estimation and testing. All these minimax bounds
provide a characterization of statistical problems that are so difficult
that no procedure can provide satisfying results.
Model-Robust Designs for Quantile Regression
We give methods for the construction of designs for linear models, when the
purpose of the investigation is the estimation of the conditional quantile
function and the estimation method is quantile regression. The designs are
robust against misspecified response functions, and against unanticipated
heteroscedasticity. The methods are illustrated by example, and in a case study
in which they are applied to growth charts.
Optimal inference in a class of regression models
We consider the problem of constructing confidence intervals (CIs) for a
linear functional of a regression function, such as its value at a point, the
regression discontinuity parameter, or a regression coefficient in a linear or
partly linear regression. Our main assumption is that the regression function
is known to lie in a convex function class, which covers most smoothness and/or
shape assumptions used in econometrics. We derive finite-sample optimal CIs and
sharp efficiency bounds under normal errors with known variance. We show that
these results translate to uniform (over the function class) asymptotic results
when the error distribution is not known. When the function class is
centrosymmetric, these efficiency bounds imply that minimax CIs are close to
efficient at smooth regression functions. This implies, in particular, that it
is impossible to form CIs that are tighter using data-dependent tuning
parameters, and maintain coverage over the whole function class. We specialize
our results to inference on the regression discontinuity parameter, and
illustrate them in simulations and an empirical application.
Comment: 39 pages plus supplementary material