1,630 research outputs found

    Margin-based Ranking and an Equivalence between AdaBoost and RankBoost

    Get PDF
    We study boosting algorithms for learning to rank. We give a general margin-based bound for ranking based on covering numbers for the hypothesis space. Our bound suggests that algorithms that maximize the ranking margin will generalize well. We then describe a new algorithm, smooth margin ranking, that precisely converges to a maximum ranking-margin solution. The algorithm is a modification of RankBoost, analogous to “approximate coordinate ascent boosting.” Finally, we prove that AdaBoost and RankBoost are equally good for the problems of bipartite ranking and classification in terms of their asymptotic behavior on the training set. Under natural conditions, AdaBoost achieves an area under the ROC curve that is equally as good as RankBoost’s; furthermore, RankBoost, when given a specific intercept, achieves a misclassification error that is as good as AdaBoost’s. This may help to explain the empirical observations made by Cortes andMohri, and Caruana and Niculescu-Mizil, about the excellent performance of AdaBoost as a bipartite ranking algorithm, as measured by the area under the ROC curve

    The Rate of Convergence of AdaBoost

    Get PDF
    The AdaBoost algorithm was designed to combine many "weak" hypotheses that perform slightly better than random guessing into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the "exponential loss." Unlike previous work, our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Our first result shows that at iteration tt, the exponential loss of AdaBoost's computed parameter vector will be at most ϵ\epsilon more than that of any parameter vector of ℓ1\ell_1-norm bounded by BB in a number of rounds that is at most a polynomial in BB and 1/ϵ1/\epsilon. We also provide lower bounds showing that a polynomial dependence on these parameters is necessary. Our second result is that within C/ϵC/\epsilon iterations, AdaBoost achieves a value of the exponential loss that is at most ϵ\epsilon more than the best possible value, where CC depends on the dataset. We show that this dependence of the rate on ϵ\epsilon is optimal up to constant factors, i.e., at least Ω(1/ϵ)\Omega(1/\epsilon) rounds are necessary to achieve within ϵ\epsilon of the optimal exponential loss.Comment: A preliminary version will appear in COLT 201
    • …
    corecore