168 research outputs found
The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List
We are interested in supervised ranking algorithms that perform especially well near the top of the
ranked list, and are only required to perform sufficiently well on the rest of the list. In this work,
we provide a general form of convex objective that gives high-scoring examples more importance.
This “push” near the top of the list can be chosen arbitrarily large or small, based on the preference
of the user. We choose â„“p-norms to provide a specific type of push; if the user sets p larger, the
objective concentrates harder on the top of the list. We derive a generalization bound based on
the p-norm objective, working around the natural asymmetry of the problem. We then derive a
boosting-style algorithm for the problem of ranking with a push at the top. The usefulness of the
algorithm is illustrated through experiments on repository data. We prove that the minimizer of the
algorithm’s objective is unique in a specific sense. Furthermore, we illustrate how our objective is
related to quality measurements for information retrieval
A Primal-Dual Convergence Analysis of Boosting
Boosting combines weak learners into a predictor with low empirical risk. Its
dual constructs a high entropy distribution upon which weak learners and
training labels are uncorrelated. This manuscript studies this primal-dual
relationship under a broad family of losses, including the exponential loss of
AdaBoost and the logistic loss, revealing:
- Weak learnability aids the whole loss family: for any {\epsilon}>0,
O(ln(1/{\epsilon})) iterations suffice to produce a predictor with empirical
risk {\epsilon}-close to the infimum;
- The circumstances granting the existence of an empirical risk minimizer may
be characterized in terms of the primal and dual problems, yielding a new proof
of the known rate O(ln(1/{\epsilon}));
- Arbitrary instances may be decomposed into the above two, granting rate
O(1/{\epsilon}), with a matching lower bound provided for the logistic loss.Comment: 40 pages, 8 figures; the NIPS 2011 submission "The Fast Convergence
of Boosting" is a brief presentation of the primary results; compared with
the JMLR version, this arXiv version has hyperref and some formatting tweak
- …