5 research outputs found

Linear Time Nonparametric Classification and Feature Selection with Polynomial MPMC Cascades for Large Datasets; CU-CS-977-04

Authors: Sander Bohte, Markus Breitenbach, Gregory Grudic
Publication venue: CU Scholar
Publication date: 01/05/2004

CU Scholar Institutional Repository

Quickly Boosting Decision Trees - Pruning Underachieving Features Early

Authors: Ron Appel, Piotr Dollár, Thomas Fuchs, Pietro Perona
Publication venue: JMLR
Publication date: 01/01/2013

Boosted decision trees are one of the most popular and successful learning techniques used today. While exhibiting fast speeds at test time, relatively slow training makes them impractical for applications with real-time learning requirements. We propose a principled approach to overcome this drawback. We prove a bound on the error of a decision stump given its preliminary error on a subset of the training data; the bound may be used to prune unpromising features early on in the training process. We propose a fast training algorithm that exploits this bound, yielding speedups of an order of magnitude at no cost in the final performance of the classifier. Our method is not a new variant of Boosting; rather, it may be used in conjunction with existing Boosting algorithms and other sampling heuristics to achieve even greater speedups.

Caltech Authors
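
The pruning idea in the abstract above can be illustrated with a minimal sketch. The abstract does not state the bound itself, so this example rests on a simpler assumption: with non-negative, unnormalized sample weights, the weighted error accumulated over a subset of the data can never exceed the error over the full set, so a candidate whose partial error already matches or exceeds the best full error found so far can be dropped. The function names and the representation of stumps as plain callables are hypothetical, and this is not the authors' algorithm.

```python
def weighted_error_with_pruning(predict, samples, labels, weights, best_error):
    """Return the candidate's total weighted error, or None if pruned early.

    Assumes non-negative, unnormalized weights, so the running error over a
    prefix of the data is a lower bound on the error over all of it.
    """
    error = 0.0
    for x, y, w in zip(samples, labels, weights):
        if predict(x) != y:
            error += w
            # The partial error can only grow, so this candidate cannot win.
            if error >= best_error:
                return None
    return error


def select_stump(stumps, samples, labels, weights):
    """Pick the candidate with the lowest weighted error, pruning losers early."""
    best_index, best_error = None, float("inf")
    for i, predict in enumerate(stumps):
        err = weighted_error_with_pruning(
            predict, samples, labels, weights, best_error
        )
        if err is not None:  # survived pruning, so it beats the current best
            best_index, best_error = i, err
    return best_index, best_error


# Hypothetical toy data: three fixed candidate classifiers on 1-D inputs.
samples = [-2.0, -1.0, 1.0, 2.0]
labels = [-1, -1, 1, 1]
weights = [0.25, 0.25, 0.25, 0.25]
stumps = [
    lambda x: 1,                    # always predicts +1
    lambda x: 1 if x > 0 else -1,   # thresholds at zero
    lambda x: -1,                   # always predicts -1
]
best_index, best_error = select_stump(stumps, samples, labels, weights)
```

In this toy run the third candidate is abandoned at its first mistake, because the second candidate has already reached zero error. The saving grows with the number of samples skipped per pruned candidate.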

Faster Boosting with Smaller Memory

Authors: Julaiti Alafate, Yoav Freund
Publication date: 01/01/2019

State-of-the-art implementations of boosting, such as XGBoost and LightGBM, can process large training sets extremely fast. However, this performance requires enough memory to hold two to three times the training set size. This paper presents an alternative approach to implementing boosted trees, which achieves a significant speedup over XGBoost and LightGBM, especially when the memory size is small. This is achieved using a combination of three techniques: early stopping, effective sample size, and stratified sampling. Our experiments demonstrate a 10-100x speedup over XGBoost when the training data is too large to fit in memory.

Comment: NeurIPS 201

arXiv.org e-Print Archive

eScholarship - University of California
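
One of the three techniques named in the abstract above, effective sample size, can be sketched briefly. The abstract does not define the quantity, so this example assumes the standard (Kish) definition, the squared sum of the weights divided by the sum of squared weights. The function name is hypothetical, and nothing here reflects how the paper's implementation uses the value.

```python
def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2).

    Equals len(weights) when all weights are equal and approaches 1 when a
    single example carries almost all of the weight.
    """
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        return 0.0  # no weight mass at all (assumed convention)
    return (total * total) / total_sq


uniform = effective_sample_size([1.0, 1.0, 1.0, 1.0])       # all examples count equally
concentrated = effective_sample_size([1.0, 0.0, 0.0, 0.0])  # one example dominates
```

As boosting reweights the training set, a few hard examples tend to accumulate most of the weight and this value falls. A low value is a natural signal that the in-memory sample no longer represents the weighted distribution well.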

Scaling Up a Boosting-Based Learner via Adaptive Sampling

Authors: E. Keogh, J. R. Quinlan, O. Watanabe, P. Domingos, R. E. Schapire, R. J. Lipton, R. C. Holte, T. Dietterich, Y. Freund
Publication venue: Springer Science and Business Media LLC

Crossref