Data-driven calibration of penalties for least-squares regression
Penalization procedures often suffer from their dependence on multiplying
factors, whose optimal values are either unknown or hard to estimate from the
data. We propose a completely data-driven calibration algorithm for this
factor in the least-squares regression framework, without assuming a
particular shape for the penalty. Our algorithm relies on the concept of
minimal penalty, recently introduced by Birgé and Massart (2007) in the context
of penalized least squares for Gaussian homoscedastic regression. On the
positive side, the minimal penalty can be evaluated from the data themselves,
leading to a data-driven estimation of an optimal penalty which can be used in
practice; on the negative side, their approach heavily relies on the
homoscedastic Gaussian nature of their stochastic framework. The purpose of
this paper is twofold: stating a more general heuristic for designing a
data-driven penalty (the slope heuristics) and proving that it works for
penalized least-squares regression with a random design, even for
heteroscedastic non-Gaussian data. For technical reasons, some exact
mathematical results will be proved only for regressogram bin-width selection.
This is at least a first step towards further results, since the approach and
the method that we use are indeed general.
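To make the procedure concrete, here is a minimal Python sketch of the slope heuristics for regressogram bin-width selection, using the dimension-jump estimate of the minimal penalty constant; the candidate grids, the toy heteroscedastic data, and all function names are illustrative assumptions, not the authors' code.

import numpy as np

def regressogram_risk(x, y, D):
    # Empirical least-squares risk of the regressogram with D equal-width
    # bins on [0, 1]: the fit is the bin-wise mean of the responses.
    bins = np.clip((x * D).astype(int), 0, D - 1)
    pred = np.zeros_like(y)
    for b in range(D):
        mask = bins == b
        if mask.any():
            pred[mask] = y[mask].mean()
    return np.mean((y - pred) ** 2)

def select_bin_number(x, y, dims, kappas=None):
    # Slope heuristics via the "dimension jump": locate the constant kappa at
    # which the selected dimension drops sharply (the minimal penalty), then
    # select with twice that penalty, pen(D) = 2 * kappa_min * D / n.
    if kappas is None:
        kappas = np.linspace(0.01, 5.0, 500)
    n = len(y)
    dims = np.asarray(list(dims))
    risks = np.array([regressogram_risk(x, y, D) for D in dims])
    selected = np.array([dims[np.argmin(risks + k * dims / n)] for k in kappas])
    jump = np.argmax(selected[:-1] - selected[1:])  # largest drop in dimension
    kappa_min = kappas[jump + 1]
    return dims[np.argmin(risks + 2.0 * kappa_min * dims / n)], kappa_min

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(size=n)
y = np.sin(2 * np.pi * x) + (0.2 + x) * rng.laplace(size=n)  # heteroscedastic, non-Gaussian
D_hat, kappa_min = select_bin_number(x, y, range(1, 101))
print(f"selected number of bins: {D_hat}, estimated minimal constant: {kappa_min:.3f}")
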
Data driven estimation of Laplace-Beltrami operator
Approximations of Laplace-Beltrami operators on manifolds through graph
Laplacians have become popular tools in data analysis and machine learning.
These discretized operators usually depend on bandwidth parameters whose tuning
remains a theoretical and practical problem. In this paper, we address this
problem for the unnormalized graph Laplacian by establishing an oracle
inequality that opens the door to a well-founded data-driven procedure for the
bandwidth selection. Our approach relies on recent results by Lacour and
Massart [LM15] on the so-called Lepski's method.
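For concreteness, a minimal Python sketch (assuming a Gaussian kernel and a known intrinsic dimension) of the unnormalized graph Laplacian applied to function values on a point cloud; the 1/(n h^(d+2)) scaling omits constants, and the loop over bandwidths only illustrates the tuning problem that the paper's data-driven procedure addresses.

import numpy as np

def graph_laplacian_apply(points, f_vals, h, d):
    # Unnormalized graph Laplacian with Gaussian weights w_ij, applied to the
    # values f_vals; up to constants, the 1 / (n h^(d+2)) scaling makes it a
    # discretization of the Laplace-Beltrami operator (d = intrinsic dimension).
    n = points.shape[0]
    sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    w = np.exp(-sq_dists / (2.0 * h ** 2))
    # (L f)_i = sum_j w_ij * (f_j - f_i), rescaled by the bandwidth
    return (w @ f_vals - w.sum(axis=1) * f_vals) / (n * h ** (d + 2))

# Toy example: points on the unit circle (a 1-d manifold in R^2); for
# f(theta) = cos(theta), the Laplace-Beltrami operator gives -cos(theta).
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2.0 * np.pi, size=500)
pts = np.column_stack([np.cos(theta), np.sin(theta)])
for h in (0.1, 0.3, 0.5):  # the bandwidth the selection procedure must tune
    lf = graph_laplacian_apply(pts, pts[:, 0], h, d=1)
    print(f"h = {h}: corr with -cos = {np.corrcoef(lf, -pts[:, 0])[0, 1]:.3f}")
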
Adaptive non-asymptotic confidence balls in density estimation
We build confidence balls for the common density s of a real-valued sample
X_1, ..., X_n. We use resampling methods to estimate the projection of s onto
finite-dimensional linear spaces and a model selection procedure to choose an
optimal approximation space. The covering property is ensured for all n ≥ 2
and the balls are adaptive over a collection of linear spaces.
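A toy Python sketch of the two ingredients, with a histogram space as the linear approximation space and a naive bootstrap standing in for the paper's resampling scheme; the dimension, confidence level, and quantile calibration are illustrative assumptions, and the model-selection step is left out.

import numpy as np

def hist_density(sample, D):
    # Projection estimator of the density onto the space of piecewise
    # constant functions on D equal bins of [0, 1].
    counts, _ = np.histogram(sample, bins=D, range=(0.0, 1.0))
    return counts * D / len(sample)  # value of the estimator on each bin

def ball_radius(sample, D, level=0.95, B=500, seed=None):
    # Bootstrap calibration of the radius: refit on resamples and take the
    # level-quantile of the L2 distance to the original estimator.
    rng = np.random.default_rng(seed)
    n, center = len(sample), hist_density(sample, D)
    dists = np.empty(B)
    for b in range(B):
        boot = rng.choice(sample, size=n, replace=True)
        dists[b] = np.sqrt(np.sum((hist_density(boot, D) - center) ** 2) / D)
    return np.quantile(dists, level)

rng = np.random.default_rng(2)
x = rng.beta(2.0, 5.0, size=1000)  # toy sample with density supported on [0, 1]
D = 16                             # in the paper, chosen by model selection
print(f"L2 radius of a 95% ball on {D} bins: {ball_radius(x, D, seed=3):.3f}")
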
Estimator selection: a new method with applications to kernel density estimation
Estimator selection has become a crucial issue in nonparametric estimation.
Two widely used methods are penalized empirical risk minimization (such as
penalized log-likelihood estimation) and pairwise comparison (such as Lepski's
method). Our aim in this paper is twofold. First we explain some general ideas
about the calibration issue of estimator selection methods. We review some
known results, putting the emphasis on the concept of minimal penalty, which is
helpful for designing data-driven selection criteria. Second we present a new
method for bandwidth selection within the framework of kernel density
estimation which is, in some sense, intermediate between the two main methods
mentioned above. We provide some theoretical results which lead to a fully
data-driven selection strategy.
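The following Python sketch conveys the flavor of such an intermediate criterion: each kernel estimator is compared to the most overfitting one (the smallest bandwidth), and a penalty replaces Lepski-type pairwise comparisons; the Gaussian kernel, the grid discretization of the L2 norm, and the penalty constant are illustrative assumptions rather than the paper's exact procedure.

import numpy as np

def kde(sample, h, grid):
    # Gaussian kernel density estimator evaluated on a grid.
    z = (grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (len(sample) * h * np.sqrt(2.0 * np.pi))

def select_bandwidth(sample, bandwidths, grid):
    # Compare each estimator to the most overfitting one (smallest h) and add
    # a penalty of the order ||K_h||^2 / n, which for a Gaussian kernel is
    # 1 / (2 n h sqrt(pi)); the constant in front is illustrative.
    n, dx = len(sample), grid[1] - grid[0]
    f_min = kde(sample, min(bandwidths), grid)
    crit = [np.sum((kde(sample, h, grid) - f_min) ** 2) * dx
            + 1.0 / (2.0 * n * h * np.sqrt(np.pi))
            for h in bandwidths]
    return bandwidths[int(np.argmin(crit))]

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-1.0, 0.3, 400), rng.normal(1.0, 0.5, 600)])
grid = np.linspace(-3.0, 3.0, 512)
h_hat = select_bandwidth(x, np.geomspace(0.02, 1.0, 30), grid)
print(f"selected bandwidth: {h_hat:.3f}")
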
Bandwidth selection in kernel empirical risk minimization via the gradient
In this paper, we deal with the data-driven selection of multidimensional and
possibly anisotropic bandwidths in the general framework of kernel empirical
risk minimization. We propose a universal selection rule, which leads to
optimal adaptive results in a large variety of statistical models such as
nonparametric robust regression and statistical learning with errors in
variables. These results are stated in the context of smooth loss functions,
where the gradient of the risk appears as a good criterion to measure the
performance of our estimators. The selection rule consists of a comparison of
gradient empirical risks. It can be viewed as a nontrivial extension of the
so-called Goldenshluger-Lepski method to nonlinear estimators. Furthermore, one
main advantage of our selection rule is that it does not depend on the Hessian
matrix of the risk, which is usually involved in standard adaptive procedures.
Published at http://dx.doi.org/10.1214/15-AOS1318 in the Annals of Statistics
(http://www.imstat.org/aos/) by the Institute of Mathematical Statistics
(http://www.imstat.org).
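As a schematic illustration of comparing gradient empirical risks, the Python sketch below treats local constant robust regression at a single point with a Huber score: for each bandwidth h, gradients of the smaller-bandwidth risks are evaluated at the minimizer theta_h and checked against a majorant. The model, the majorant c * sqrt(log n / (n h)), and its constant are assumptions made for illustration; the paper's rule covers a much broader class, including errors-in-variables models.

import numpy as np

def psi(u, c=1.345):
    # Derivative of the Huber loss: a bounded, robust score function.
    return np.clip(u, -c, c)

def grad_risk(theta, x, y, x0, h):
    # Gradient (in theta) of the kernel empirical risk for local constant
    # robust regression at x0: -mean_i K_h(x_i - x0) * psi(y_i - theta).
    k = np.exp(-0.5 * ((x - x0) / h) ** 2) / (h * np.sqrt(2.0 * np.pi))
    return -np.mean(k * psi(y - theta))

def fit_theta(x, y, x0, h):
    # The gradient is monotone in theta, so the minimizer is found by bisection.
    lo, hi = y.min(), y.max()
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if grad_risk(mid, x, y, x0, h) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def select_h(x, y, x0, bandwidths, c=1.0):
    # Lepski-type rule on gradients: penalize h when some smaller bandwidth
    # h2 sees a gradient at theta_h exceeding its majorant maj(h2).
    n = len(x)
    bw = sorted(bandwidths)
    maj = {h: c * np.sqrt(np.log(n) / (n * h)) for h in bw}
    best_h, best_val = bw[0], np.inf
    for i, h in enumerate(bw):
        theta_h = fit_theta(x, y, x0, h)
        excess = max(abs(grad_risk(theta_h, x, y, x0, h2)) - maj[h2]
                     for h2 in bw[:i + 1])
        val = max(excess, 0.0) + maj[h]
        if val < best_val:
            best_h, best_val = h, val
    return best_h

rng = np.random.default_rng(4)
x = rng.uniform(-1.0, 1.0, 800)
y = np.sin(3.0 * x) + 0.3 * rng.standard_t(df=2, size=800)  # heavy-tailed noise
print(f"selected bandwidth at x0 = 0: {select_h(x, y, 0.0, np.geomspace(0.05, 1.0, 15)):.3f}")
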