
    A bias correction for the minimum error rate in cross-validation

    Tuning parameters in supervised learning problems are often estimated by cross-validation. The minimum value of the cross-validation error can be biased downward as an estimate of the test error at that same value of the tuning parameter. We propose a simple method for estimating this bias that uses information from the cross-validation process; as a result, it requires essentially no additional computation. We apply our bias estimate to a number of popular classifiers in various settings and examine its performance.

    Comment: Published at http://dx.doi.org/10.1214/08-AOAS224 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
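    A fold-based bias estimate of the kind the abstract describes can be sketched as follows. This is a hedged illustration, not necessarily the authors' exact estimator: it compares each fold's error at the overall-best tuning index against that fold's own minimum, and averages the gaps.

    ```python
    import numpy as np

    def cv_min_bias_estimate(fold_errors):
        """Estimate the downward bias of the minimum CV error.

        fold_errors: (K, P) array, where fold_errors[k, p] is the
        validation error of fold k at tuning-parameter index p.
        Returns (min_cv_error, estimated_bias).
        """
        mean_curve = fold_errors.mean(axis=0)
        p_hat = int(np.argmin(mean_curve))        # overall best tuning index
        per_fold_best = fold_errors.min(axis=1)   # each fold's own minimum
        # Each fold's error at p_hat is at least its own minimum, so the
        # average gap is a nonnegative estimate of the optimism.
        bias = float(np.mean(fold_errors[:, p_hat] - per_fold_best))
        return float(mean_curve[p_hat]), bias
    ```

    The corrected test-error estimate is then the minimum CV error plus the estimated bias; note the bias estimate is nonnegative by construction.
    
    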

    Exact Post-Selection Inference for Sequential Regression Procedures

    We propose new inference tools for forward stepwise regression, least angle regression, and the lasso. Assuming a Gaussian model for the observation vector y, we first describe a general scheme to perform valid inference after any selection event that can be characterized as y falling into a polyhedral set. This framework allows us to derive conditional (post-selection) hypothesis tests at any step of forward stepwise or least angle regression, or at any step along the lasso regularization path, because, as it turns out, selection events for these procedures can be expressed as polyhedral constraints on y. The p-values associated with these tests are exactly uniform under the null distribution, in finite samples, yielding exact type I error control. The tests can also be inverted to produce confidence intervals for appropriate underlying regression parameters. The R package "selectiveInference", freely available on the CRAN repository, implements the new inference tools described in this paper.

    Comment: 26 pages, 5 figures
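    To make the polyhedral characterization concrete, here is a hedged Python sketch (not the selectiveInference package itself) that encodes the first forward-stepwise selection event as linear constraints Ay ≤ 0: selecting variable j with sign s means s·x_j^T y dominates |x_k^T y| for every other k, which splits into two linear inequalities per competitor.

    ```python
    import numpy as np

    def first_step_polyhedron(X, y):
        """Encode the first forward-stepwise selection event as {y : Ay <= 0}.

        Columns of X are assumed standardized. Returns the selected index j,
        its sign s, and the constraint matrix A.
        """
        scores = X.T @ y
        j = int(np.argmax(np.abs(scores)))   # variable entering first
        s = float(np.sign(scores[j]))        # sign of its inner product
        rows = []
        for k in range(X.shape[1]):
            if k == j:
                continue
            # |x_k^T y| <= s * x_j^T y  splits into two linear constraints:
            rows.append(X[:, k] - s * X[:, j])    #  x_k^T y - s x_j^T y <= 0
            rows.append(-X[:, k] - s * X[:, j])   # -x_k^T y - s x_j^T y <= 0
        return j, s, np.array(rows)
    ```

    By construction, the observed y satisfies all the constraints (A @ y <= 0); conditioning on this polyhedron is what makes the resulting truncated-Gaussian p-values exactly valid in the paper's framework.
    
    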

    Strong rules for discarding predictors in lasso-type problems

    We consider rules for discarding predictors in lasso regression and related problems, for computational efficiency. El Ghaoui et al. (2010) propose "SAFE" rules that guarantee a coefficient will be zero in the solution, based on the inner products of each predictor with the outcome. In this paper we propose strong rules that are not foolproof but rarely fail in practice. These can be complemented with simple checks of the Karush-Kuhn-Tucker (KKT) conditions to provide safe rules that offer substantial speed and space savings in a variety of statistical convex optimization problems.
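    A minimal sketch of the basic (global) strong rule, as we understand it, follows: at penalty level lam, discard predictor j when |x_j^T y| < 2·lam − lam_max, where lam_max is the smallest penalty at which all coefficients are zero. The sequential version in the paper uses residuals at the previous penalty instead of y; this simplified form is for illustration only, and survivors should still be verified against the KKT conditions.

    ```python
    import numpy as np

    def strong_rule_keep(X, y, lam, lam_max=None):
        """Basic strong rule screening for the lasso at penalty lam.

        Keeps predictor j only if |x_j^T y| >= 2*lam - lam_max.
        Columns of X are assumed standardized; lam_max defaults to
        max_j |x_j^T y|, the entry point of the regularization path.
        """
        scores = np.abs(X.T @ y)
        if lam_max is None:
            lam_max = float(scores.max())
        # Predictors failing the bound are discarded before optimization;
        # a KKT check on the fitted solution catches the rare failures.
        return np.flatnonzero(scores >= 2 * lam - lam_max)
    ```

    The practical appeal is that screening needs only the p inner products x_j^T y, so the optimizer runs on a much smaller design matrix; the cheap KKT check afterward restores exactness.
    
    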

    An introduction to the bootstrap


    Rejoinder to “A Significance Test for the Lasso”

    We would like to thank the editors and referees for their considerable efforts that improved our paper, and all of the discussants for their thoughtful and stimulating comments. Linear models are central in applied statistics, and inference for adaptive linear modeling is an important, active area of research. Our paper is clearly not the last word on the subject! Several of the discussants introduce novel proposals for this problem; in fact, many of the discussions are interesting “mini-papers” in their own right, and we will not attempt to reply to all of the points that they raise. Our hope is that our paper and the excellent accompanying discussions will serve as a helpful resource for researchers interested in this topic.

    Since the writing of our original paper, we have (with many of our graduate students) extended the work considerably. Before responding to the discussants, we will first summarize this new work because it will be relevant to our responses.

    • As mentioned in the last section of the paper, we have derived a “spacing” test of the global null hypothesis, β* = 0, which takes the form