97 research outputs found

    No Unbiased Estimator of the Variance of K-Fold Cross-Validation

    In statistical machine learning, the standard measure of accuracy for models is the prediction error, i.e. the expected loss on future examples. When the data distribution is unknown, the prediction error cannot be computed directly, but several resampling methods, such as K-fold cross-validation, can be used to obtain an unbiased estimator of it. To compare learning algorithms, however, one also needs to estimate the uncertainty around the cross-validation estimator, which matters because it can be very large. The usual variance estimates for means of independent samples cannot be used, because the data are reused across the training and test sets that form the cross-validation estimator. The main result of this paper is that there is no universal (distribution-independent) unbiased estimator of the variance of the K-fold cross-validation estimator based only on the empirical error measurements obtained through the cross-validation procedure. The analysis provides a theoretical understanding of why this estimation is difficult. These results generalize to other resampling methods in which data are reused for training or testing.
    Keywords: prediction error, cross-validation, multivariate variance estimators, statistical comparison of algorithms
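    As a concrete illustration of the quantities discussed above, the following sketch (Python, assuming NumPy and scikit-learn are available) computes the K-fold cross-validation estimate of prediction error together with the naive variance formula that treats the K fold errors as independent; the paper's point is precisely that no such empirical formula can be unbiased for every data distribution.

        # Minimal sketch: K-fold CV error estimate and the naive variance formula
        # that ignores correlation between folds (the quantity the paper shows
        # cannot be estimated without bias in a distribution-free way).
        import numpy as np
        from sklearn.datasets import make_regression
        from sklearn.linear_model import Ridge
        from sklearn.model_selection import KFold
        from sklearn.metrics import mean_squared_error

        X, y = make_regression(n_samples=200, n_features=10, noise=1.0, random_state=0)

        K = 5
        fold_errors = []
        for train_idx, test_idx in KFold(n_splits=K, shuffle=True, random_state=0).split(X):
            model = Ridge(alpha=1.0).fit(X[train_idx], y[train_idx])
            fold_errors.append(mean_squared_error(y[test_idx], model.predict(X[test_idx])))

        fold_errors = np.array(fold_errors)
        cv_estimate = fold_errors.mean()              # unbiased for the prediction error
        naive_variance = fold_errors.var(ddof=1) / K  # treats fold errors as independent
        print(cv_estimate, naive_variance)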

    Stability-based model selection

    Model selection is linked to model assessment, which is the problem of comparing different models, or model parameters, for a specific learning task. For supervised learning, the standard practical technique is cross-validation, which is not applicable in semi-supervised and unsupervised settings. In this paper, a new model assessment scheme is introduced which is based on a notion of stability. The stability measure yields an upper bound to cross-validation in the supervised case, but extends to semi-supervised and unsupervised problems. In the experiments, the stability measure is studied for model order selection and compared to standard techniques in this area.
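    One common way to instantiate stability-based model order selection, sketched below in Python with scikit-learn, is to cluster two perturbed versions of the data for each candidate number of clusters k and measure how well the two partitions agree on shared points; the `instability` helper and the agreement measure (adjusted Rand index) are illustrative assumptions and may differ from the paper's exact stability measure.

        # Hedged sketch: pick the number of clusters k whose partitions are most
        # stable under resampling of the data (lower instability = more stable).
        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.datasets import make_blobs
        from sklearn.metrics import adjusted_rand_score

        X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
        rng = np.random.default_rng(0)

        def instability(X, k, n_repeats=10, subsample=0.8):
            scores = []
            n = len(X)
            for _ in range(n_repeats):
                idx_a = rng.choice(n, size=int(subsample * n), replace=False)
                idx_b = rng.choice(n, size=int(subsample * n), replace=False)
                labels_a = KMeans(n_clusters=k, n_init=10).fit_predict(X[idx_a])
                labels_b = KMeans(n_clusters=k, n_init=10).fit_predict(X[idx_b])
                # Compare the two clusterings on the points both subsamples share.
                common = np.intersect1d(idx_a, idx_b)
                pos_a = {i: p for p, i in enumerate(idx_a)}
                pos_b = {i: p for p, i in enumerate(idx_b)}
                la = [labels_a[pos_a[i]] for i in common]
                lb = [labels_b[pos_b[i]] for i in common]
                scores.append(1.0 - adjusted_rand_score(la, lb))
            return np.mean(scores)

        for k in range(2, 7):
            print(k, instability(X, k))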

    Stability

    Reproducibility is imperative for any scientific discovery. More often than not, modern scientific findings rely on statistical analysis of high-dimensional data. At a minimum, reproducibility manifests itself in stability of statistical results relative to "reasonable" perturbations to data and to the model used. Jackknife, bootstrap, and cross-validation are based on perturbations to data, while robust statistics methods deal with perturbations to models. In this article, a case is made for the importance of stability in statistics. Firstly, we motivate the necessity of stability for interpretable and reliable encoding models from brain fMRI signals. Secondly, we find strong evidence in the literature to demonstrate the central role of stability in statistical inference, such as sensitivity analysis and effect detection. Thirdly, a smoothing parameter selector based on estimation stability (ES), ES-CV, is proposed for Lasso, in order to bring stability to bear on cross-validation (CV). ES-CV is then utilized in the encoding models to reduce the number of predictors by 60% with almost no loss (1.3%) of prediction performance across over 2,000 voxels. Last, a novel "stability" argument is seen to drive new results that shed light on the intriguing interactions between sample-to-sample variability and heavier-tailed error distributions (e.g., double-exponential) in high-dimensional regression models with p predictors and n independent samples. In particular, when p/n → Îș ∈ (0.3, 1) and the error distribution is double-exponential, the Ordinary Least Squares (OLS) is a better estimator than the Least Absolute Deviation (LAD) estimator. Comment: Published at http://dx.doi.org/10.3150/13-BEJSP14 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
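    The sketch below (Python with scikit-learn, assumed available) shows one plausible reading of an estimation-stability criterion for choosing the Lasso penalty: refit on K data splits, measure how much the fitted values vary across splits relative to the size of the average fit, and prefer penalties where that variation is small. The `es_statistic` helper and the normalization are assumptions for illustration; the paper's exact ES-CV rule, including how it is combined with plain CV, may differ.

        # Hedged sketch of an estimation-stability (ES) statistic for Lasso tuning.
        import numpy as np
        from sklearn.datasets import make_regression
        from sklearn.linear_model import Lasso
        from sklearn.model_selection import KFold

        X, y = make_regression(n_samples=120, n_features=200, n_informative=10,
                               noise=1.0, random_state=0)

        lambdas = np.logspace(-2, 1, 20)
        kf = KFold(n_splits=5, shuffle=True, random_state=0)

        def es_statistic(lam):
            # Fitted values on the full design from each training-fold Lasso solution.
            fits = []
            for train_idx, _ in kf.split(X):
                model = Lasso(alpha=lam, max_iter=10000).fit(X[train_idx], y[train_idx])
                fits.append(model.predict(X))
            fits = np.array(fits)
            mean_fit = fits.mean(axis=0)
            # Variability of the fit across folds, normalized by the fit's magnitude.
            return fits.var(axis=0).mean() / (mean_fit ** 2).mean()

        es_values = [es_statistic(lam) for lam in lambdas]
        best_lambda = lambdas[int(np.argmin(es_values))]
        print(best_lambda)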
    • 
