Optimal inference in a class of regression models
We consider the problem of constructing confidence intervals (CIs) for a
linear functional of a regression function, such as its value at a point, the
regression discontinuity parameter, or a regression coefficient in a linear or
partly linear regression. Our main assumption is that the regression function
is known to lie in a convex function class, which covers most smoothness and/or
shape assumptions used in econometrics. We derive finite-sample optimal CIs and
sharp efficiency bounds under normal errors with known variance. We show that
these results translate to uniform (over the function class) asymptotic results
when the error distribution is not known. When the function class is
centrosymmetric, these efficiency bounds imply that minimax CIs are close to
efficient at smooth regression functions. This implies, in particular, that it
is impossible to form CIs that are tighter by using data-dependent tuning
parameters while maintaining coverage over the whole function class. We
specialize our results to inference on the regression discontinuity parameter
and illustrate them in simulations and an empirical application.
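One concrete ingredient of fixed-length CIs in this kind of setting (a minimal sketch in our own notation, not the paper's full construction) is a bias-aware critical value: given an estimate with standard error se and a worst-case bias bound B implied by the function class, the CI uses the value cv solving P(|N(B/se, 1)| > cv) = alpha in place of the usual normal quantile.

```python
from scipy import optimize, stats

def bias_aware_cv(t, alpha=0.05):
    """Critical value c with P(|N(t, 1)| > c) = alpha.

    t is the worst-case bias divided by the standard error;
    t = 0 recovers the usual two-sided normal critical value.
    """
    def noncoverage(c):
        # P(|N(t, 1)| <= c) = Phi(c - t) - Phi(-c - t)
        return 1.0 - (stats.norm.cdf(c - t) - stats.norm.cdf(-c - t)) - alpha
    return optimize.brentq(noncoverage, 0.0, abs(t) + 10.0)

# Hypothetical numbers: estimate 2.0, standard error 0.5, bias bound 0.3.
est, se, max_bias = 2.0, 0.5, 0.3
cv = bias_aware_cv(max_bias / se)
print(f"95% CI: [{est - cv * se:.3f}, {est + cv * se:.3f}]")
```

The resulting interval is wider than the naive one but, by construction, covers under the worst-case bias allowed by the class.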
On Degrees of Freedom of Projection Estimators with Applications to Multivariate Nonparametric Regression
In this paper, we consider the nonparametric regression problem with
multivariate predictors. We provide a characterization of the degrees of
freedom and divergence for estimators of the unknown regression function, which
are obtained as outputs of linearly constrained quadratic optimization
procedures, namely, minimizers of the least squares criterion with linear
constraints and/or quadratic penalties. As special cases of our results, we
derive explicit expressions for the degrees of freedom in many nonparametric
regression problems, e.g., bounded isotonic regression, multivariate
(penalized) convex regression, and additive total variation regularization. Our
theory also yields, as special cases, known results on the degrees of freedom
of many well-studied estimators in the statistics literature, such as ridge
regression, the Lasso, and the generalized Lasso. Our results can readily be
used to choose the tuning parameter(s) involved in the estimation procedure by
minimizing Stein's unbiased risk estimate. As a by-product of our analysis,
we derive an interesting connection between bounded isotonic regression and
isotonic regression on a general partially ordered set, which is of independent
interest.
Published in the Journal of the American Statistical Association (Theory and Methods).
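As a concrete instance of the SURE-based tuning mentioned above, here is a minimal sketch for the ridge special case, where the degrees of freedom equal the trace of the hat matrix; the function names are ours and the noise variance sigma2 is assumed known.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge fit and its degrees of freedom df = tr(X (X'X + lam I)^{-1} X')."""
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T)
    return H @ y, np.trace(H)

def sure(y, fit, df, sigma2):
    """Stein's unbiased risk estimate: ||y - fit||^2 - n*sigma2 + 2*sigma2*df."""
    return np.sum((y - fit) ** 2) - len(y) * sigma2 + 2.0 * sigma2 * df

def choose_lambda(X, y, sigma2, grid):
    """Pick the ridge penalty on a grid by minimizing SURE."""
    scores = [sure(y, *ridge_fit(X, y, lam), sigma2) for lam in grid]
    return grid[int(np.argmin(scores))]

# Hypothetical usage on simulated data.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
y = X @ rng.standard_normal(10) + rng.standard_normal(100)
print(choose_lambda(X, y, sigma2=1.0, grid=np.logspace(-2, 2, 25)))
```

For the constrained estimators studied in the paper, only the degrees-of-freedom computation changes; the SURE criterion itself is the same.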
Nonparametric estimation by convex programming
The problem we concentrate on is as follows: given (1) a convex compact set
$X$ in $\mathbb{R}^n$, an affine mapping $x \mapsto A(x)$, a parametric family
$\{p_\mu(\cdot)\}$ of probability densities and (2) $N$ i.i.d. observations
of the random variable $\omega$, distributed with the density $p_{A(x)}(\cdot)$
for some (unknown) $x \in X$, estimate the value $g^\top x$ of a given linear
form at $x$. For several families $\{p_\mu(\cdot)\}$ with no additional
assumptions on $X$ and $A$, we develop computationally efficient estimation
routines which are minimax optimal, within an absolute constant factor. We then
apply these routines to recovering $x$ itself in the Euclidean norm.
Published in the Annals of Statistics (http://www.imstat.org/aos/) by the
Institute of Mathematical Statistics (http://www.imstat.org):
http://dx.doi.org/10.1214/08-AOS654
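To fix ideas, consider the simplest Gaussian case $\omega \sim N(A(x), \sigma^2 I)$ (a sketch in our notation, not the paper's general construction). Restricting attention to affine estimates $\hat g(\omega) = c + \varphi^\top \omega$, the minimax problem becomes

```latex
\min_{c,\varphi}\ \max_{x \in X}\ \mathbb{E}_x\bigl[\hat g(\omega) - g^\top x\bigr]^2
  \;=\; \min_{c,\varphi}\ \max_{x \in X}\,
  \bigl(c + \varphi^\top A(x) - g^\top x\bigr)^2 \;+\; \sigma^2 \|\varphi\|_2^2 .
```

For each fixed $x$ the objective is convex in $(c, \varphi)$, and a pointwise maximum of convex functions stays convex, so the outer minimization is a convex program; this is the sense in which such routines can be computationally efficient.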
An adaptation theory for nonparametric confidence intervals
A nonparametric adaptation theory is developed for the construction of
confidence intervals for linear functionals. A between-class modulus of
continuity captures the expected length of adaptive confidence intervals. Sharp
lower bounds are given for the expected length and an ordered modulus of
continuity is used to construct adaptive confidence procedures which are within
a constant factor of the lower bounds. In addition, minimax theory over
nonconvex parameter spaces is developed.
Published in the Annals of Statistics (http://www.imstat.org/aos/) by the
Institute of Mathematical Statistics (http://www.imstat.org):
http://dx.doi.org/10.1214/009053604000000049
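For reference, the between-class modulus in question can be written as follows (a sketch in generic notation, with $T$ the linear functional and $F_1, F_2$ the two parameter classes):

```latex
\omega(\varepsilon; F_1, F_2) \;=\;
  \sup\bigl\{\, T f_2 - T f_1 \;:\; \|f_2 - f_1\|_2 \le \varepsilon,\;
  f_1 \in F_1,\; f_2 \in F_2 \,\bigr\}.
```

Evaluated at the noise level, this quantity lower-bounds, up to constants, the expected length over $F_1$ of any confidence procedure that maintains coverage over $F_1 \cup F_2$, which is what makes it the right benchmark for adaptation.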
On the minimax optimality and superiority of deep neural network learning over sparse parameter spaces
Deep learning has been applied to various tasks in the field of machine
learning and has shown superiority to other common procedures such as kernel
methods. To provide a better theoretical understanding of the reasons for its
success, we discuss the performance of deep learning and other methods on a
nonparametric regression problem with Gaussian noise. Whereas existing
theoretical studies of deep learning have been based mainly on mathematical
theories of well-known function classes such as Hölder and Besov classes,
we focus on function classes with discontinuity and sparsity, which arise
naturally in practice. To highlight the effectiveness of deep learning,
we compare deep learning with linear estimators, which are representative of
shallow estimation methods. It is shown that the minimax risk of a linear
estimator on the convex hull of a target function class does not differ from
that of the original target function class. This results in the suboptimality
of linear methods over a simple but non-convex function class, on which deep
learning can attain nearly the minimax-optimal rate. In addition to this
extreme case, we consider function classes with sparse wavelet coefficients. On
these function classes, deep learning also attains the minimax rate up to log
factors of the sample size, and linear methods are still suboptimal if the
assumed sparsity is strong. We also point out that the parameter sharing of
deep neural networks can remarkably reduce the complexity of the model in our
setting.
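The linear-versus-nonlinear gap on sparse classes can be seen already in a toy Gaussian sequence model (our simplification, not the paper's regression setting): on a sparse mean vector, even the oracle linear shrinkage, which is told the true signal, loses badly to soft thresholding.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, sigma = 1000, 10, 1.0          # ambient dimension, sparsity, noise level

# Sparse mean vector with k spikes well above the noise level.
theta = np.zeros(n)
theta[:k] = 20.0
y = theta + sigma * rng.standard_normal(n)

# Oracle linear (proportional shrinkage) estimate c * y: the c minimizing
# E||c*y - theta||^2 is ||theta||^2 / (||theta||^2 + n * sigma^2).
c = theta @ theta / (theta @ theta + n * sigma**2)
loss_linear = np.sum((c * y - theta) ** 2)

# Soft thresholding at the universal threshold sigma * sqrt(2 log n).
t = sigma * np.sqrt(2 * np.log(n))
theta_soft = np.sign(y) * np.maximum(np.abs(y) - t, 0.0)
loss_soft = np.sum((theta_soft - theta) ** 2)

print(f"oracle linear shrinkage loss: {loss_linear:.1f}")
print(f"soft thresholding loss:       {loss_soft:.1f}")
```

The linear rule must shrink every coordinate by the same factor, so it either over-shrinks the spikes or under-shrinks the noise; thresholding adapts coordinate by coordinate, which is the mechanism behind the suboptimality of linear methods on non-convex sparse classes.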