5,567 research outputs found
Boosting for high-dimensional linear models
We prove that boosting with the squared error loss, L2Boosting, is
consistent for very high-dimensional linear models, where the number of
predictor variables is allowed to grow essentially as fast as O(exp(sample
size)), assuming that the true underlying regression function is sparse in
terms of the ℓ1-norm of the regression coefficients. In the language of
signal processing, this means consistency for de-noising using a strongly
overcomplete dictionary if the underlying signal is sparse in terms of the
ℓ1-norm. We also propose here an AIC-based method for tuning,
namely for choosing the number of boosting iterations. This makes L2Boosting
computationally attractive since it is not required to run the algorithm
multiple times for cross-validation, as has been common practice so far. We
demonstrate L2Boosting on simulated data, in particular where the predictor
dimension is large in comparison to sample size, and on a difficult
tumor-classification problem with gene expression microarray data.
Comment: Published at http://dx.doi.org/10.1214/009053606000000092 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
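The componentwise version of this algorithm can be sketched in a few lines. This is a generic illustration of L2Boosting with a fixed step size and iteration budget, not the paper's AIC-tuned implementation; the toy data, the shrinkage factor `nu`, and the iteration count are illustrative choices of ours.

```python
import numpy as np

def l2_boost(X, y, n_iter=200, nu=0.1):
    """Componentwise L2Boosting: at each step, fit the current residuals by
    simple least squares on the single best predictor, then take a shrunken
    step of size nu in that coordinate."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()
    col_norms = np.sum(X ** 2, axis=0)
    for _ in range(n_iter):
        # least-squares coefficient of each predictor against the residuals
        coefs = X.T @ resid / col_norms
        # pick the predictor giving the largest drop in squared error
        j = np.argmax(coefs ** 2 * col_norms)
        beta[j] += nu * coefs[j]
        resid -= nu * coefs[j] * X[:, j]
    return beta

# tiny demo with p > n and an l1-sparse truth
rng = np.random.default_rng(0)
n, p = 40, 100
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [2.0, -1.5, 1.0]
y = X @ true_beta + 0.1 * rng.standard_normal(n)
beta_hat = l2_boost(X, y)
```

Because each step changes only one coordinate, the estimate stays near-sparse for moderate iteration counts, which is what makes the choice of the stopping iteration the key tuning decision.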
A Nielsen theory for coincidences of iterates
As the title suggests, this paper gives a Nielsen theory of coincidences of
iterates of two self-maps f, g of a closed manifold. The idea is, as much as
possible, to generalize Nielsen-type periodic point theory, but there are many
obstacles. We often obtain results similar to the "classical" ones in Nielsen
periodic point theory, but under stronger hypotheses.
Comment: 30 pages
Early stopping and non-parametric regression: An optimal data-dependent stopping rule
The strategy of early stopping is a regularization technique based on
choosing a stopping time for an iterative algorithm. Focusing on non-parametric
regression in a reproducing kernel Hilbert space, we analyze the early stopping
strategy for a form of gradient-descent applied to the least-squares loss
function. We propose a data-dependent stopping rule that does not involve
hold-out or cross-validation data, and we prove upper bounds on the squared
error of the resulting function estimate, measured in either the L^2(P) or
the L^2(P_n) norm. These upper bounds lead to minimax-optimal rates for various
kernel classes, including Sobolev smoothness classes and other forms of
reproducing kernel Hilbert spaces. We show through simulation that our stopping
rule compares favorably to two other stopping rules, one based on hold-out data
and the other based on Stein's unbiased risk estimate. We also establish a
tight connection between our early stopping strategy and the solution path of a
kernel ridge regression estimator.
Comment: 29 pages, 4 figures
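The iterative scheme analyzed here can be sketched as plain gradient descent on the least-squares loss over fitted values in the span of a kernel matrix. The paper's data-dependent stopping rule (based on localized kernel complexities) is not reproduced below; the Gaussian kernel, its bandwidth, the step size, and the fixed iteration budget are illustrative assumptions of ours, and the point of returning the whole path is that any stopping rule then amounts to picking an index along it.

```python
import numpy as np

def gradient_descent_path(K, y, n_iter=100):
    """Gradient descent on the loss (1/2n)||f - y||^2 over fitted values f
    in the span of the kernel matrix K; returns the path of iterates so a
    stopping rule reduces to choosing an index t."""
    n = len(y)
    step = n / np.linalg.eigvalsh(K).max()  # guarantees a stable descent step
    f = np.zeros(n)
    path = [f.copy()]
    for _ in range(n_iter):
        f = f - step * (K @ (f - y)) / n  # kernel gradient update
        path.append(f.copy())
    return path

# demo on smooth 1-d data with a Gaussian kernel
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + 0.1 * np.random.default_rng(1).standard_normal(50)
K = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.02)
path = gradient_descent_path(K, y)
train_mse = [np.mean((f - y) ** 2) for f in path]
```

Early iterates fit the smooth, high-eigenvalue directions of K first, which is why stopping the descent early acts as a regularizer.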
Concentration inequalities of the cross-validation estimate for stable predictors
In this article, we derive concentration inequalities for the
cross-validation estimate of the generalization error for stable predictors in
the context of risk assessment. The notion of stability was first
introduced by \cite{DEWA79} and extended by \cite{KEA95}, \cite{BE01} and
\cite{KUNIY02} to characterize classes of predictors with infinite VC dimension.
In particular, this covers k-nearest-neighbor rules, Bayesian algorithms
(\cite{KEA95}), boosting, etc. General loss functions and classes of predictors
are considered. We use the formalism introduced by \cite{DUD03} to cover a large
variety of cross-validation procedures including leave-one-out
cross-validation, k-fold cross-validation, hold-out cross-validation (or
split sample), and leave-p-out cross-validation.
In particular, we give a simple rule for choosing the cross-validation procedure,
depending on the stability of the class of predictors. In the special case of
uniform stability, an interesting consequence is that the number of elements in
the test set is not required to grow to infinity for the consistency of the
cross-validation procedure. In this special case, the particular interest of
leave-one-out cross-validation is emphasized.
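As a concrete illustration, the plain K-fold cross-validation estimate of the generalization error for a k-nearest-neighbor rule (one of the predictor classes mentioned above) can be written as follows; the fold count, neighbor count, squared-error loss, and toy data are illustrative choices of ours, not taken from the paper.

```python
import numpy as np

def kfold_cv_error(X, y, fit_predict, k=5, seed=0):
    """K-fold cross-validation estimate of the generalization error:
    mean squared-error loss over the held-out folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    losses = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        preds = fit_predict(X[train], y[train], X[test])
        losses.append(np.mean((preds - y[test]) ** 2))
    return np.mean(losses)

def knn_fit_predict(X_train, y_train, X_test, n_neighbors=3):
    """Brute-force k-nearest-neighbor regression: predict the mean response
    of the n_neighbors closest training points."""
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nn = np.argsort(dists, axis=1)[:, :n_neighbors]
    return y_train[nn].mean(axis=1)

# demo on a smooth 1-d regression problem
X = np.linspace(0.0, 2 * np.pi, 60)[:, None]
y = np.sin(X[:, 0])
cv_err = kfold_cv_error(X, y, knn_fit_predict, k=5)
```

Under the uniform-stability scenario discussed above, the interesting point is that such an estimate can remain consistent even when each test fold stays small.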