The Rate of Convergence of AdaBoost
The AdaBoost algorithm was designed to combine many "weak" hypotheses that
perform slightly better than random guessing into a "strong" hypothesis that
has very low error. We study the rate at which AdaBoost iteratively converges
to the minimum of the "exponential loss." Unlike previous work, our proofs do
not require a weak-learning assumption, nor do they require that minimizers of
the exponential loss are finite. Our first result shows that the exponential
loss of AdaBoost's computed parameter vector will be at most ε more than that
of any parameter vector of ℓ₁-norm bounded by B in a number of rounds that is
at most a polynomial in B and 1/ε. We also provide lower bounds showing that a
polynomial dependence on these parameters is necessary. Our second result is
that within C/ε iterations, AdaBoost achieves a value of the exponential loss
that is at most ε more than the best possible value, where C depends on the
dataset. We show that this dependence of the rate on ε is optimal up to
constant factors, i.e., at least Ω(1/ε) rounds are necessary to achieve within
ε of the optimal exponential loss.
Comment: A preliminary version will appear in COLT 201
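The convergence being analysed can be seen on a toy problem. The following is a minimal sketch, not the authors' code: it assumes AdaBoost is run as greedy coordinate descent on the average exponential loss, with a hypothetical matrix `M` whose entry `M[i][j]` is +1 when weak hypothesis j classifies example i correctly and -1 otherwise. The function names are illustrative only.

```python
import math

def exp_loss(margins):
    """Average exponential loss (1/m) * sum_i exp(-margin_i)."""
    return sum(math.exp(-z) for z in margins) / len(margins)

def adaboost(M, rounds):
    """Run AdaBoost as greedy coordinate descent on the exponential loss.

    M[i][j] is assumed to be y_i * h_j(x_i) in {-1, +1}.
    Returns the weight vector and the loss after each round.
    """
    m, n = len(M), len(M[0])
    lam = [0.0] * n
    margins = [0.0] * m
    losses = [exp_loss(margins)]
    for _ in range(rounds):
        # Distribution over examples induced by the current margins.
        w = [math.exp(-z) for z in margins]
        total = sum(w)
        d = [wi / total for wi in w]
        # Edge of each weak hypothesis under that distribution.
        edges = [sum(d[i] * M[i][j] for i in range(m)) for j in range(n)]
        j = max(range(n), key=lambda k: abs(edges[k]))
        r = edges[j]
        if r == 0.0 or abs(r) >= 1.0:
            break  # no progress possible, or the exact step would be infinite
        # Exact line search along coordinate j; scales the loss by sqrt(1 - r^2).
        alpha = 0.5 * math.log((1 + r) / (1 - r))
        lam[j] += alpha
        margins = [margins[i] + alpha * M[i][j] for i in range(m)]
        losses.append(exp_loss(margins))
    return lam, losses

# Three examples, three hypotheses, each wrong on exactly one example.
M = [[-1, 1, 1], [1, -1, 1], [1, 1, -1]]
lam, losses = adaboost(M, 20)
```

On this toy matrix the infimum of the loss is 0 and is not attained by any finite weight vector, which is the regime the abstract says earlier analyses excluded.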
The Rate of Convergence of AdaBoost
The AdaBoost algorithm was designed to combine many “weak” hypotheses that perform slightly better than random guessing into a “strong” hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the “exponential loss.” Unlike previous work, our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Our first result shows that the exponential loss of AdaBoost's computed parameter vector will be at most ε more than that of any parameter vector of ℓ₁-norm bounded by B in a number of rounds that is at most a polynomial in B and 1/ε. We also provide lower bounds showing that a polynomial dependence is necessary. Our second result is that within C/ε iterations, AdaBoost achieves a value of the exponential loss that is at most ε more than the best possible value, where C depends on the data set. We show that this dependence of the rate on ε is optimal up to constant factors, that is, at least Ω(1/ε) rounds are necessary to achieve within ε of the optimal exponential loss.
National Science Foundation (U.S.) (Grant IIS-1016029); National Science Foundation (U.S.) (Grant IIS-1053407
SelfieBoost: A Boosting Algorithm for Deep Learning
We describe and analyze a new boosting algorithm for deep learning called
SelfieBoost. Unlike other boosting algorithms, like AdaBoost, which construct
ensembles of classifiers, SelfieBoost boosts the accuracy of a single network.
We prove a convergence rate for SelfieBoost under an "SGD success" assumption
that seems to hold in practice.
Accelerated face detector training using the PSL framework
We train a face detection system using the PSL framework [1], which combines the AdaBoost
learning algorithm and Haar-like features. We demonstrate the ability of this framework to
overcome some of the challenges inherent in training classifiers that are structured in cascades
of boosted ensembles (CoBE). The PSL classifiers are compared to Viola-Jones type cascaded
classifiers. We establish the ability of the PSL framework to produce classifiers in a
complex domain in a significantly reduced time frame. They also comprise fewer boosted
ensembles, albeit at the price of increased false detection rates on our test dataset. We also
report results from a more diverse set of experiments carried out on the PSL framework in
order to provide more insight into the effects of variations in its adjustable training parameters.
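For readers unfamiliar with cascades of boosted ensembles, the sketch below shows the general evaluation pattern only: each stage is a weighted vote of weak learners, and a window is rejected at the first stage whose score falls below that stage's threshold. It does not reproduce the PSL framework. The `stump` helper, the feature vectors, the weights, and the thresholds are all hypothetical stand-ins for Haar-like feature responses.

```python
def stage_score(stage, x):
    """Weighted vote of one boosted ensemble on feature vector x."""
    weak_learners, _ = stage
    return sum(alpha * h(x) for alpha, h in weak_learners)

def cascade_accepts(stages, x):
    """Return (accepted, stages_evaluated); reject at the first failing stage."""
    for k, stage in enumerate(stages, start=1):
        _, threshold = stage
        if stage_score(stage, x) < threshold:
            return False, k  # early rejection: later stages are never run
    return True, len(stages)

def stump(index, cut):
    """Hypothetical weak learner: +1 if feature `index` exceeds `cut`, else -1."""
    return lambda x: 1 if x[index] > cut else -1

# Two illustrative stages: (list of (weight, weak learner), stage threshold).
stages = [
    ([(1.0, stump(0, 0.5))], 0.0),
    ([(0.7, stump(1, 0.2)), (0.4, stump(2, 0.8))], 0.0),
]

rejected_early = cascade_accepts(stages, [0.1, 0.9, 0.9])
accepted = cascade_accepts(stages, [0.9, 0.9, 0.1])
rejected_late = cascade_accepts(stages, [0.9, 0.1, 0.1])
```

A cascade with fewer stages evaluates fewer ensembles per window, but each rejection decision then rests on fewer tests, which is consistent with the trade-off against false detections that the abstract reports.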
Parallel coordinate descent for the Adaboost problem
We design a randomised parallel version of Adaboost based on previous studies
on parallel coordinate descent. The algorithm uses the fact that the logarithm
of the exponential loss is a function with coordinate-wise Lipschitz continuous
gradient, in order to define the step lengths. We provide the proof of
convergence for this randomised Adaboost algorithm and a theoretical
parallelisation speedup factor. We finally provide numerical examples on
learning problems of various sizes that show that the algorithm is competitive
with concurrent approaches, especially for large-scale problems.
Comment: 7 pages, 3 figures, extended version of the paper presented to ICMLA'1
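The idea of taking step lengths from coordinate-wise Lipschitz constants of the log of the exponential loss can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the matrix `M` has entries in {-1, +1}, so each coordinate constant is bounded by `max_i M[i][j]**2`; the "parallel" update is simulated serially by computing all sampled coordinate gradients from the same iterate; and the step is damped by the block size `tau`, a conservative choice that guarantees descent for a convex objective and is an assumption in place of the paper's own step-length rule.

```python
import math
import random

def weights_and_loss(M, lam):
    """Softmax weights p_i ∝ exp(-(M lam)_i) and log-sum-exp loss, computed stably."""
    z = [-sum(a * b for a, b in zip(row, lam)) for row in M]
    zmax = max(z)
    e = [math.exp(v - zmax) for v in z]
    s = sum(e)
    return [v / s for v in e], zmax + math.log(s)

def parallel_cd(M, tau, iters, seed=0):
    """Randomised block coordinate descent on log(sum_i exp(-(M lam)_i))."""
    rng = random.Random(seed)
    n = len(M[0])
    lam = [0.0] * n
    # Upper bounds on the coordinate-wise Lipschitz constants of the gradient.
    L = [max(row[j] ** 2 for row in M) for j in range(n)]
    history = []
    for _ in range(iters):
        p, loss = weights_and_loss(M, lam)
        history.append(loss)
        block = rng.sample(range(n), tau)
        # The partial derivative is -sum_i p_i * M[i][j]; every step in the
        # block uses the same iterate, as independent workers would.
        steps = {
            j: sum(pi * row[j] for pi, row in zip(p, M)) / (tau * L[j])
            for j in block
        }
        for j, h in steps.items():
            lam[j] += h
    history.append(weights_and_loss(M, lam)[1])
    return lam, history

M = [[-1, 1, 1], [1, -1, 1], [1, 1, -1]]
lam, history = parallel_cd(M, tau=2, iters=50)
```

Larger blocks do more work per iteration but take smaller damped steps here; a sharper analysis of that trade-off is what yields a theoretical speedup factor of the kind the abstract mentions.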
Studies of Boosted Decision Trees for MiniBooNE Particle Identification
Boosted decision trees are applied to particle identification in the
MiniBooNE experiment operated at Fermi National Accelerator Laboratory
(Fermilab) for neutrino oscillations. Numerous attempts are made to tune the
boosted decision trees, to compare performance of various boosting algorithms,
and to select input variables for optimal performance.
Comment: 28 pages, 22 figures, submitted to Nucl. Inst & Meth.