1,030 research outputs found

    Margin-based Ranking and an Equivalence between AdaBoost and RankBoost

    We study boosting algorithms for learning to rank. We give a general margin-based bound for ranking based on covering numbers for the hypothesis space. Our bound suggests that algorithms that maximize the ranking margin will generalize well. We then describe a new algorithm, smooth margin ranking, that precisely converges to a maximum ranking-margin solution. The algorithm is a modification of RankBoost, analogous to “approximate coordinate ascent boosting.” Finally, we prove that AdaBoost and RankBoost are equally good for the problems of bipartite ranking and classification in terms of their asymptotic behavior on the training set. Under natural conditions, AdaBoost achieves an area under the ROC curve that is as good as RankBoost’s; furthermore, RankBoost, when given a specific intercept, achieves a misclassification error that is as good as AdaBoost’s. This may help to explain the empirical observations made by Cortes and Mohri, and Caruana and Niculescu-Mizil, about the excellent performance of AdaBoost as a bipartite ranking algorithm, as measured by the area under the ROC curve.
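
The abstract above compares AdaBoost and RankBoost by the area under the ROC curve in bipartite ranking. As a minimal sketch of that quantity (not code from the paper), the AUC can be computed as the fraction of (positive, negative) pairs that a scoring function orders correctly. The function name `pairwise_auc` and the convention of counting ties as half a correct pair are assumptions made for illustration.

```python
def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly.

    Ties are counted as half a correct pair (an assumed convention).
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


# Example: three of the four pairs are ordered correctly.
auc = pairwise_auc([0.9, 0.8], [0.1, 0.85])
```

The double loop costs time proportional to the number of pairs, which is fine for a small illustration; a sort-based computation would be the usual choice on large data.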

    AUC Optimisation and Collaborative Filtering

    In recommendation systems, one is interested in the ranking of the predicted items as opposed to other losses such as the mean squared error. Although a variety of ways to evaluate rankings exist in the literature, here we focus on the Area Under the ROC Curve (AUC) as it is widely used and has a strong theoretical underpinning. In practical recommendation, only items at the top of the ranked list are presented to the users. With this in mind, we propose a class of objective functions over matrix factorisations which primarily represent a smooth surrogate for the real AUC, and in a special case we show how to prioritise the top of the list. The objectives are differentiable and optimised through a carefully designed stochastic gradient-descent-based algorithm which scales linearly with the size of the data. In the special case of square loss we show how to improve computational complexity by leveraging previously computed measures. To understand theoretically the underlying matrix factorisation approaches we study both the consistency of the loss functions with respect to AUC, and generalisation using Rademacher theory. The resulting generalisation analysis gives strong motivation for the optimisation under study. Finally, we provide computational results as to the efficacy of the proposed method using synthetic and real data.
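
To make the idea of a smooth surrogate for AUC over a matrix factorisation concrete, here is a small sketch for a single user. It assumes item scores are inner products of a user vector with item vectors, and it uses the logistic loss on score differences as the surrogate. The abstract does not say which surrogates the paper's objective class contains, so the logistic choice and the names `dot`, `softplus` and `smooth_auc_loss` are illustrative assumptions.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def smooth_auc_loss(user_vec, pos_item_vecs, neg_item_vecs):
    """Mean logistic loss over (relevant, irrelevant) item pairs.

    Each pair contributes log(1 + exp(-(s_p - s_q))), a differentiable
    stand-in for the 0/1 indicator that item q outranks item p.
    """
    total = 0.0
    for p in pos_item_vecs:
        for q in neg_item_vecs:
            diff = dot(user_vec, p) - dot(user_vec, q)
            total += softplus(-diff)
    return total / (len(pos_item_vecs) * len(neg_item_vecs))


# One relevant and one irrelevant item, score difference of 2.
loss = smooth_auc_loss([1.0, 0.0], [[2.0, 0.0]], [[0.0, 0.0]])
```

The loss shrinks toward zero as relevant items are scored further above irrelevant ones, and equals log 2 when every pair is tied.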

    Scalable large margin pairwise learning algorithms

    2019 Summer. Includes bibliographical references. Classification is a major task in machine learning and data mining applications. Many of these applications involve building a classification model using a large volume of imbalanced data. In such an imbalanced learning scenario, the area under the ROC curve (AUC) has proven to be a reliable performance measure to evaluate a classifier. Therefore, it is desirable to develop scalable learning algorithms that maximize the AUC metric directly. The kernelized AUC maximization machines have established a superior generalization ability compared to linear AUC machines. However, the computational cost of the kernelized machines hinders their scalability. To address this problem, we propose a large-scale nonlinear AUC maximization algorithm that learns a batch linear classifier on an approximate feature space computed via the k-means Nyström method. The proposed algorithm is shown empirically to achieve AUC classification performance comparable to or even better than that of the kernel AUC machines, while its training time is faster by several orders of magnitude. However, the computational complexity of the linear batch model compromises its scalability when training on sizable datasets. Hence, we develop second-order online AUC maximization algorithms based on a confidence-weighted model. The proposed algorithms exploit second-order information to improve the convergence rate and implement a fixed-size buffer to address the multivariate nature of the AUC objective function. We also extend our online linear algorithms to consider an approximate feature map constructed using random Fourier features in an online setting. The results show that our proposed algorithms outperform or are at least comparable to the competing online AUC maximization methods. Despite their scalability, we notice that online first- and second-order AUC maximization methods are prone to suboptimal convergence. This can be attributed to the limitation of the hypothesis space. A potential improvement can be attained by learning stochastic online variants. However, the vanilla stochastic methods also suffer from slow convergence because of the high variance introduced by the stochastic process. We address the problem of slow convergence by developing a fast-converging stochastic AUC maximization algorithm. The proposed stochastic algorithm is accelerated using a unique combination of scheduled regularization update and scheduled averaging. The experimental results show that the proposed algorithm performs better than the state-of-the-art online and stochastic AUC maximization methods in terms of AUC classification accuracy. Moreover, we develop a proximal variant of our accelerated stochastic AUC maximization algorithm. The proposed method applies the proximal operator to the hinge loss function. Therefore, it evaluates the gradient of the loss function at the approximated weight vector. Experiments on several benchmark datasets show that our proximal algorithm converges to the optimal solution faster than the previous AUC maximization algorithms.
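
The fixed-size buffer mentioned in the abstract above can be illustrated with a simplified first-order learner. This is not the thesis's second-order confidence-weighted method: it is a plain pairwise hinge-loss update against a buffer of opposite-class examples maintained by reservoir sampling. The class name `BufferedOnlineAUC`, its parameters and their default values are all hypothetical.

```python
import random


class BufferedOnlineAUC:
    """First-order online AUC learner with per-class reservoir buffers (sketch)."""

    def __init__(self, dim, buffer_size=50, lr=0.1, seed=0):
        self.w = [0.0] * dim
        self.buffer_size = buffer_size
        self.lr = lr
        self.buffers = {1: [], -1: []}   # stored examples per class label
        self.seen = {1: 0, -1: 0}        # examples seen so far per class
        self.rng = random.Random(seed)

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, y):
        """Process one example with label y in {+1, -1}."""
        opposite = self.buffers[-y]
        if opposite:
            grad = [0.0] * len(self.w)
            for z in opposite:
                # diff points from the negative example to the positive one.
                diff = [y * (xi - zi) for xi, zi in zip(x, z)]
                margin = sum(wi * di for wi, di in zip(self.w, diff))
                if margin < 1.0:  # pairwise hinge loss is active
                    for i, di in enumerate(diff):
                        grad[i] -= di
            step = self.lr / len(opposite)
            self.w = [wi - step * gi for wi, gi in zip(self.w, grad)]
        self._store(x, y)

    def _store(self, x, y):
        """Reservoir sampling keeps a uniform sample of each class."""
        self.seen[y] += 1
        buf = self.buffers[y]
        if len(buf) < self.buffer_size:
            buf.append(list(x))
        else:
            j = self.rng.randrange(self.seen[y])
            if j < self.buffer_size:
                buf[j] = list(x)


# Usage on a tiny separable stream.
model = BufferedOnlineAUC(dim=2, buffer_size=4)
for _ in range(10):
    model.update([1.0, 0.0], 1)
    model.update([-1.0, 0.0], -1)
```

The buffer bounds memory and per-step cost regardless of stream length, which is the reason the AUC objective, defined over pairs of examples, remains tractable in an online setting.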