5,567 research outputs found

    Boosting for high-dimensional linear models

    We prove that boosting with the squared error loss, $L_2$Boosting, is consistent for very high-dimensional linear models, where the number of predictor variables is allowed to grow essentially as fast as $O(\exp(\text{sample size}))$, assuming that the true underlying regression function is sparse in terms of the $\ell_1$-norm of the regression coefficients. In the language of signal processing, this means consistency for de-noising using a strongly overcomplete dictionary if the underlying signal is sparse in terms of the $\ell_1$-norm. We also propose an $\mathit{AIC}$-based method for tuning, namely for choosing the number of boosting iterations. This makes $L_2$Boosting computationally attractive, since the algorithm need not be run multiple times as in cross-validation, the common practice so far. We demonstrate $L_2$Boosting on simulated data, in particular where the predictor dimension is large in comparison to the sample size, and on a difficult tumor-classification problem with gene expression microarray data.
    Comment: Published at http://dx.doi.org/10.1214/009053606000000092 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
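    For intuition, here is a minimal Python sketch of componentwise $L_2$Boosting with a corrected-AIC stopping rule of the kind the abstract describes. The base learner, the step size nu, and the exact AIC correction are illustrative assumptions, not the paper's definitive implementation:

        import numpy as np

        def l2boost_aic(X, y, nu=0.1, max_iter=200):
            """Componentwise L2Boosting, stopped by a corrected AIC (sketch)."""
            n, p = X.shape
            Xc = X - X.mean(axis=0)           # centre predictors
            resid = y - y.mean()              # centred response = initial residuals
            norms = (Xc ** 2).sum(axis=0)     # squared column norms
            coef = np.zeros(p)
            B = np.zeros((n, n))              # boosting (hat) operator; df = tr(B)
            best = (np.inf, coef.copy())
            for m in range(max_iter):
                b = Xc.T @ resid / norms      # one-variable OLS fits to the residuals
                j = int(np.argmax(b ** 2 * norms))  # predictor with largest RSS drop
                coef[j] += nu * b[j]
                resid -= nu * b[j] * Xc[:, j]
                # B_m = B_{m-1} + nu * H_j (I - B_{m-1}),  H_j = x_j x_j^T / ||x_j||^2
                Hj = np.outer(Xc[:, j], Xc[:, j]) / norms[j]
                B += nu * Hj @ (np.eye(n) - B)
                df = np.trace(B)
                if df + 2 >= n:               # corrected AIC undefined beyond this point
                    break
                aic = np.log((resid ** 2).mean()) + (1 + df / n) / (1 - (df + 2) / n)
                if aic < best[0]:
                    best = (aic, coef.copy())
            return best[1]                    # coefficients at the AIC-minimizing iteration

    Tracking the hat operator B is what replaces cross-validation here: the degrees of freedom come for free at each iteration, so the algorithm is run only once.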

    A Nielsen theory for coincidences of iterates

    As the title suggests, this paper gives a Nielsen theory of coincidences of iterates of two self-maps f, g of a closed manifold. The idea is, as much as possible, to generalize Nielsen-type periodic point theory, but there are many obstacles. In many cases we obtain results similar to the "classical" ones in Nielsen periodic point theory, but under stronger hypotheses.
    Comment: 30 pages
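    For concreteness, a coincidence point of the $n$-th iterates is a point where both iterates agree; in standard notation (the paper's own conventions may differ), the set under study for self-maps $f, g : M \to M$ is

        \mathrm{Coin}(f^{n}, g^{n}) = \{\, x \in M : f^{n}(x) = g^{n}(x) \,\},

    and a Nielsen-type theory seeks lower bounds on the size of this set that are invariant under homotopies of $f$ and $g$.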

    Early stopping and non-parametric regression: An optimal data-dependent stopping rule

    The strategy of early stopping is a regularization technique based on choosing a stopping time for an iterative algorithm. Focusing on non-parametric regression in a reproducing kernel Hilbert space, we analyze the early stopping strategy for a form of gradient descent applied to the least-squares loss function. We propose a data-dependent stopping rule that does not involve hold-out or cross-validation data, and we prove upper bounds on the squared error of the resulting function estimate, measured in either the $L^2(P)$ or the $L^2(P_n)$ norm. These upper bounds lead to minimax-optimal rates for various kernel classes, including Sobolev smoothness classes and other forms of reproducing kernel Hilbert spaces. We show through simulation that our stopping rule compares favorably to two other stopping rules, one based on hold-out data and the other based on Stein's unbiased risk estimate. We also establish a tight connection between our early stopping strategy and the solution path of a kernel ridge regression estimator.
    Comment: 29 pages, 4 figures
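    A minimal Python sketch of the ingredients: gradient descent over the kernel matrix, stopped when a localized empirical kernel complexity crosses a threshold tied to the running sum of step sizes. The threshold constants and the role of the noise level sigma are illustrative assumptions in the spirit of the paper's critical-radius rule, not its exact statement:

        import numpy as np

        def kernel_gd_early_stop(K, y, sigma, max_iter=1000):
            """Kernel gradient descent with an eigenvalue-based stopping rule (sketch)."""
            n = K.shape[0]
            lam = np.clip(np.linalg.eigvalsh(K / n), 0.0, None)  # eigenvalues of K/n
            step = 1.0 / lam.max()           # constant step size, a valid contraction
            f = np.zeros(n)                  # fitted values at the design points
            for t in range(1, max_iter + 1):
                f_prev = f.copy()
                # gradient step on the least-squares loss in the RKHS parameterization
                f = f - step * (K / n) @ (f - y)
                eta = step * t               # running sum of step sizes
                # localized empirical kernel complexity at scale 1/sqrt(eta)
                R = np.sqrt(np.mean(np.minimum(lam, 1.0 / eta)))
                if R > 1.0 / (2 * np.e * sigma * eta):  # first threshold crossing
                    return f_prev, t - 1     # last iterate before the crossing
            return f, max_iter

    The rule consumes only the empirical kernel eigenvalues and a noise estimate, which is what lets it dispense with hold-out data.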

    Concentration inequalities of the cross-validation estimate for stable predictors

    In this article, we derive concentration inequalities for the cross-validation estimate of the generalization error for stable predictors in the context of risk assessment. The notion of stability was first introduced by [DEWA79] and extended by [KEA95], [BE01] and [KUNIY02] to characterize classes of predictors with infinite VC dimension. In particular, this covers $k$-nearest-neighbor rules, the Bayesian algorithm ([KEA95]), boosting, and others. General loss functions and classes of predictors are considered. We use the formalism introduced by [DUD03] to cover a large variety of cross-validation procedures, including leave-one-out cross-validation, $k$-fold cross-validation, hold-out cross-validation (or split sample), and leave-$\upsilon$-out cross-validation. In particular, we give a simple rule for choosing the cross-validation procedure, depending on the stability of the class of predictors. In the special case of uniform stability, an interesting consequence is that the number of elements in the test set is not required to grow to infinity for the consistency of the cross-validation procedure. In this special case, the particular interest of leave-one-out cross-validation is emphasized.
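    A generic Python sketch of the cross-validation estimator in such a unified formalism; fit, predict and loss are caller-supplied callables (a hypothetical interface, not the paper's notation), and the choice of k recovers hold-out, $k$-fold, and leave-one-out (k = n) as special cases:

        import numpy as np

        def cv_error(X, y, fit, predict, loss, k=10, seed=0):
            """k-fold cross-validation estimate of the generalization error (sketch)."""
            n = len(y)
            idx = np.random.default_rng(seed).permutation(n)
            folds = np.array_split(idx, k)           # k disjoint test sets
            errs = []
            for test in folds:
                train = np.setdiff1d(idx, test)      # complement of the test fold
                model = fit(X[train], y[train])
                errs.append(np.mean(loss(y[test], predict(model, X[test]))))
            return float(np.mean(errs))              # average test error over folds

    Under uniform stability, the abstract's point is that the test folds need not grow with n for this average to concentrate around the true risk.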