Consistency of cross validation for comparing regression procedures
Theoretical developments on cross validation (CV) have mainly focused on
selecting one among a list of finite-dimensional models (e.g., subset or order
selection in linear regression) or selecting a smoothing parameter (e.g.,
bandwidth for kernel smoothing). However, little is known about the consistency of
cross validation when it is used to compare parametric with nonparametric
methods, or to compare nonparametric methods with one another. We show that under some conditions,
with an appropriate choice of data splitting ratio, cross validation is
consistent in the sense of selecting the better procedure with probability
approaching 1. Our results reveal interesting behavior of cross validation.
When comparing two models (procedures) converging at the same nonparametric
rate, in contrast to the parametric case, it turns out that the proportion of
data used for evaluation in CV does not need to be dominating in size.
Furthermore, it can even be of a smaller order than the proportion for
estimation while not affecting the consistency property.
Comment: Published at http://dx.doi.org/10.1214/009053607000000514 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
- …
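The comparison described in the abstract can be illustrated with a minimal sketch: split the data once into an estimation part and an evaluation part, fit each candidate procedure on the estimation part, and select the one with the smaller squared prediction error on the evaluation part. This is only an illustration under stated assumptions, not the paper's exact procedure or conditions. The function names (`fit_linear`, `fit_kernel`, `holdout_compare`), the simulated regression function, the bandwidth, and the splitting ratio `eval_fraction` are all hypothetical choices made for the example.

```python
import numpy as np


def fit_linear(x, y):
    """Parametric candidate: least-squares line y ~ a + b*x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda t: coef[0] + coef[1] * np.asarray(t, dtype=float)


def fit_kernel(x, y, bandwidth=0.1):
    """Nonparametric candidate: Nadaraya-Watson estimator, Gaussian kernel.

    The bandwidth is fixed here for simplicity (an assumption of this sketch).
    """
    def predict(t):
        t = np.asarray(t, dtype=float)
        weights = np.exp(-0.5 * ((t[:, None] - x[None, :]) / bandwidth) ** 2)
        return (weights @ y) / weights.sum(axis=1)
    return predict


def holdout_compare(x, y, fitters, eval_fraction, rng):
    """Single-split comparison: fit on the estimation part, score on the rest.

    eval_fraction is the share of observations held out for evaluation.
    Returns the name of the procedure with the smallest held-out squared error
    and the dictionary of held-out errors.
    """
    n = len(x)
    n_eval = max(1, int(round(eval_fraction * n)))
    perm = rng.permutation(n)
    eval_idx, est_idx = perm[:n_eval], perm[n_eval:]
    errors = {}
    for name, fit in fitters.items():
        predict = fit(x[est_idx], y[est_idx])
        residuals = y[eval_idx] - predict(x[eval_idx])
        errors[name] = float(np.mean(residuals ** 2))
    return min(errors, key=errors.get), errors


# Simulated data (assumed for illustration): a smooth nonlinear truth plus noise.
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + 0.3 * rng.normal(size=n)

winner, errors = holdout_compare(
    x, y, {"linear": fit_linear, "kernel": fit_kernel}, eval_fraction=0.5, rng=rng
)
print(winner, errors)
```

Varying `eval_fraction` in this sketch is one way to explore informally how the choice of splitting ratio affects which procedure is selected; the abstract's consistency statements concern the asymptotic regime and are not demonstrated by a single simulated split.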