A Primal-Dual Convergence Analysis of Boosting
Boosting combines weak learners into a predictor with low empirical risk. Its
dual constructs a high entropy distribution upon which weak learners and
training labels are uncorrelated. This manuscript studies this primal-dual
relationship under a broad family of losses, including the exponential loss of
AdaBoost and the logistic loss, revealing:
- Weak learnability aids the whole loss family: for any ε > 0, O(ln(1/ε))
iterations suffice to produce a predictor with empirical risk ε-close to the
infimum;
- The circumstances granting the existence of an empirical risk minimizer may
be characterized in terms of the primal and dual problems, yielding a new proof
of the known rate O(ln(1/ε));
- Arbitrary instances may be decomposed into the above two, granting rate
O(1/ε), with a matching lower bound provided for the logistic loss.
Comment: 40 pages, 8 figures; the NIPS 2011 submission "The Fast Convergence
of Boosting" is a brief presentation of the primary results; compared with
the JMLR version, this arXiv version has hyperref and some formatting tweaks
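To make the primal-dual relationship concrete for the exponential-loss case, one common way to write the pair (a sketch in notation assumed here rather than taken from the paper, with A the n-by-m margin matrix whose entries are A_ij = y_i h_j(x_i)):

    primal:  inf over λ of  Σ_i exp( -(Aλ)_i )
    dual:    sup of  -Σ_i q_i ln(q_i)   subject to  q ≥ 0,  Aᵀq = 0

The dual constraint Aᵀq = 0 says that, under the weighting q, every weak learner is uncorrelated with the training labels, and the dual objective selects the highest-entropy such weighting; roughly, weak learnability corresponds to this feasible set containing only q = 0.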
Parallel coordinate descent for the Adaboost problem
We design a randomised parallel version of Adaboost based on previous studies
on parallel coordinate descent. The algorithm uses the fact that the logarithm
of the exponential loss is a function with coordinate-wise Lipschitz continuous
gradient, in order to define the step lengths. We provide the proof of
convergence for this randomised Adaboost algorithm and a theoretical
parallelisation speedup factor. We finally provide numerical examples on
learning problems of various sizes that show that the algorithm is competitive
with concurrent approaches, especially for large scale problems.Comment: 7 pages, 3 figures, extended version of the paper presented to
ICMLA'1
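The update described in the abstract (step lengths from a coordinate-wise Lipschitz bound on the logarithm of the exponential loss, with a random subset of coordinates updated in parallel) can be sketched roughly as follows. All names, the Lipschitz constants, and the coupling factor beta are illustrative assumptions, not the paper's exact derivation.

import numpy as np

def parallel_cd_adaboost(A, iters=200, tau=4, seed=0):
    """Sketch of randomised parallel coordinate descent on the AdaBoost
    objective f(lam) = log(sum(exp(-A @ lam))), where A[i, j] = y_i * h_j(x_i).
    Per-coordinate step lengths use a coordinate-wise Lipschitz bound; `beta`
    is an assumed stand-in for the coupling constant derived in the paper."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    lam = np.zeros(m)
    tau = min(tau, m)
    beta = float(tau)                          # crude coupling factor (assumption)
    L = beta * (A ** 2).max(axis=0) + 1e-12    # coordinate-wise Lipschitz bounds
    for _ in range(iters):
        z = -A @ lam
        w = np.exp(z - z.max())                # stabilised exponential weights
        w /= w.sum()                           # current distribution over examples
        grad = -A.T @ w                        # gradient of log(sum(exp(-A lam)))
        J = rng.choice(m, size=tau, replace=False)
        lam[J] -= grad[J] / L[J]               # tau independent coordinate updates
    return lam

In a genuinely parallel implementation the updates over the sampled coordinates J would run concurrently; here they are applied in a single vectorised step for readability.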
GBM-based Bregman Proximal Algorithms for Constrained Learning
As the complexity of learning tasks surges, modern machine learning
encounters a new constrained learning paradigm characterized by more intricate
and data-driven function constraints. Prominent applications include
Neyman-Pearson classification (NPC) and fairness classification, which entail
specific risk constraints that render standard projection-based training
algorithms unsuitable. Gradient boosting machines (GBMs) are among the most
popular algorithms for supervised learning; however, they are generally limited
to unconstrained settings. In this paper, we adapt the GBM for constrained
learning tasks within the framework of Bregman proximal algorithms. We
introduce a new Bregman primal-dual method with a global optimality guarantee
when the learning objective and constraint functions are convex. In cases of
nonconvex functions, we demonstrate how our algorithm remains effective under a
Bregman proximal point framework. Unlike existing constrained learning
algorithms, ours can integrate seamlessly with publicly available GBM
implementations such as XGBoost (Chen and Guestrin, 2016) and LightGBM (Ke et
al., 2017), relying exclusively on their public interfaces. We provide
substantial experimental evidence to showcase the effectiveness of the Bregman
algorithm framework. While our primary focus is on NPC and fairness
classification, our framework holds significant potential for a broader range
of constrained learning applications. The source code is freely available at
https://github.com/zhenweilin/ConstrainedGBM
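To illustrate the kind of integration through public interfaces the abstract mentions, below is a minimal, hypothetical training loop built on XGBoost's custom-objective hook. It is a plain weighted Lagrangian / dual-ascent sketch, not the paper's Bregman primal-dual method; the constraint (mean logistic loss on a designated group at most alpha) and all names are assumptions for illustration.

import numpy as np
import xgboost as xgb

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def constrained_gbm(X, y, group_idx, alpha=0.1, rounds=10, outer_iters=20, eta_dual=1.0):
    """Hypothetical dual-ascent loop around a stock GBM via XGBoost's public
    custom-objective interface (not the paper's Bregman method).  Minimises
    logistic loss subject to an assumed risk constraint: the mean logistic
    loss on the examples in `group_idx` must not exceed `alpha`."""
    dtrain = xgb.DMatrix(X, label=y)
    mu = 0.0                                        # dual multiplier for the constraint
    booster = None
    for _ in range(outer_iters):
        w = np.ones(len(y))
        w[group_idx] += mu                          # constraint enters as per-example weights

        def obj(preds, dmat):                       # custom objective: returns (grad, hess)
            p = sigmoid(preds)
            return w * (p - y), w * p * (1.0 - p)

        booster = xgb.train({"max_depth": 3, "eta": 0.1}, dtrain,
                            num_boost_round=rounds, obj=obj,
                            xgb_model=booster)      # warm-start from the previous booster
        p = sigmoid(booster.predict(dtrain, output_margin=True))
        loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1.0 - p + 1e-12))
        violation = loss[group_idx].mean() - alpha
        mu = max(0.0, mu + eta_dual * violation)    # projected dual-ascent update
    return booster, mu

The design point this sketch tries to convey is the one stressed in the abstract: the base learner is driven purely through its public custom-objective and warm-start interfaces, so any compliant GBM implementation could be substituted.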