Margin-based Ranking and an Equivalence between AdaBoost and RankBoost
We study boosting algorithms for learning to rank. We give a general margin-based bound for
ranking based on covering numbers for the hypothesis space. Our bound suggests that algorithms
that maximize the ranking margin will generalize well. We then describe a new algorithm, smooth
margin ranking, that precisely converges to a maximum ranking-margin solution. The algorithm
is a modification of RankBoost, analogous to “approximate coordinate ascent boosting.” Finally,
we prove that AdaBoost and RankBoost are equally good for the problems of bipartite ranking and
classification in terms of their asymptotic behavior on the training set. Under natural conditions,
AdaBoost achieves an area under the ROC curve that is as good as RankBoost’s; furthermore,
RankBoost, when given a specific intercept, achieves a misclassification error that is as good
as AdaBoost’s. This may help to explain the empirical observations made by Cortes and Mohri, and
Caruana and Niculescu-Mizil, about the excellent performance of AdaBoost as a bipartite ranking
algorithm, as measured by the area under the ROC curve.
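The two quantities compared in this abstract can be illustrated with a minimal sketch. It assumes the standard pairwise definition of AUC (the fraction of positive-negative pairs ranked correctly, with ties counted as one half) and a simple thresholded classifier; the function names and the toy data are hypothetical and not taken from the paper. It shows why an intercept matters for classification error but not for AUC.

```python
import numpy as np

def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # diff[a, b] = score of positive a minus score of negative b
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

def misclassification_error(scores, labels, intercept=0.0):
    """Error of the classifier that predicts 1 when score + intercept > 0."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    predicted = (scores + intercept > 0).astype(int)
    return float((predicted != labels).mean())

# Toy data (illustrative only).
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]

print(pairwise_auc(scores, labels))                          # ranking quality
print(misclassification_error(scores, labels, intercept=0))  # all predicted positive
print(misclassification_error(scores, labels, intercept=-0.5))
```

Shifting every score by a constant leaves the pairwise AUC unchanged, while the misclassification error depends on where the threshold falls. This is why a ranking function needs a suitable intercept before it can be judged as a classifier.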
AUC Optimisation and Collaborative Filtering
In recommendation systems, one is interested in the ranking of the predicted
items as opposed to other losses such as the mean squared error. Although a
variety of ways to evaluate rankings exist in the literature, here we focus on
the Area Under the ROC Curve (AUC) as it is widely used and has a strong
theoretical underpinning. In practical recommendation, only items at the top of
the ranked list are presented to the users. With this in mind, we propose a
class of objective functions over matrix factorisations which primarily
represent a smooth surrogate for the real AUC, and in a special case we show
how to prioritise the top of the list. The objectives are differentiable and
optimised through a carefully designed stochastic gradient-descent-based
algorithm which scales linearly with the size of the data. In the special case
of square loss we show how to improve computational complexity by leveraging
previously computed measures. To understand theoretically the underlying matrix
factorisation approaches we study both the consistency of the loss functions
with respect to AUC, and generalisation using Rademacher theory. The resulting
generalisation analysis gives strong motivation for the optimisation under
study. Finally, we provide computational results on the efficacy of the
proposed method using synthetic and real data.
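As a rough illustration of a smooth AUC surrogate over a matrix factorisation, the sketch below uses a logistic loss on the score difference between a relevant item `i` and an irrelevant item `j` for user `u`, and takes one stochastic gradient step on that triple. The logistic surrogate, the function names, and the learning rate are assumptions chosen for brevity; the paper's own class of objectives and its top-of-list weighting are not reproduced here.

```python
import numpy as np

def pair_loss(U, V, u, i, j):
    """Logistic surrogate for the 0-1 ranking loss on the pair (i, j) for user u."""
    x = U[u] @ (V[i] - V[j])          # predicted score difference
    return float(np.logaddexp(0.0, -x))  # log(1 + exp(-x)), numerically stable

def sgd_step(U, V, u, i, j, lr=0.1):
    """One in-place SGD step on the surrogate for a sampled triple (u, i, j), i != j."""
    d = V[i] - V[j]
    x = U[u] @ d
    g = 1.0 / (1.0 + np.exp(x))       # sigmoid(-x): magnitude of the loss derivative
    grad_u = -g * d
    grad_i = -g * U[u]
    grad_j = g * U[u]
    U[u] -= lr * grad_u
    V[i] -= lr * grad_i
    V[j] -= lr * grad_j

# Hypothetical sizes: 5 users, 8 items, rank-3 factors.
rng = np.random.default_rng(0)
U = 0.1 * rng.normal(size=(5, 3))
V = 0.1 * rng.normal(size=(8, 3))

before = pair_loss(U, V, u=0, i=1, j=2)
sgd_step(U, V, u=0, i=1, j=2)
after = pair_loss(U, V, u=0, i=1, j=2)
print(before, after)
```

Each step touches only one user row and two item rows, so the cost per update is proportional to the factor rank. That is consistent with, though not a demonstration of, the linear scaling the abstract describes.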
Scalable large margin pairwise learning algorithms
2019 Summer. Includes bibliographical references. Classification is a major task in machine learning and data mining applications. Many of these applications involve building a classification model using a large volume of imbalanced data. In such an imbalanced learning scenario, the area under the ROC curve (AUC) has proven to be a reliable performance measure to evaluate a classifier. Therefore, it is desirable to develop scalable learning algorithms that maximize the AUC metric directly. Kernelized AUC maximization machines have shown superior generalization ability compared to linear AUC machines. However, the computational cost of the kernelized machines hinders their scalability. To address this problem, we propose a large-scale nonlinear AUC maximization algorithm that learns a batch linear classifier on an approximate feature space computed via the k-means Nyström method. The proposed algorithm is shown empirically to achieve AUC classification performance comparable to, or even better than, that of the kernel AUC machines, while its training time is faster by several orders of magnitude. However, the computational complexity of the linear batch model compromises its scalability when training on sizable datasets. Hence, we develop second-order online AUC maximization algorithms based on a confidence-weighted model. The proposed algorithms exploit second-order information to improve the convergence rate and implement a fixed-size buffer to address the multivariate nature of the AUC objective function. We also extend our online linear algorithms to consider an approximate feature map constructed using random Fourier features in an online setting. The results show that our proposed algorithms outperform or are at least comparable to the competing online AUC maximization methods. Despite their scalability, we notice that online first- and second-order AUC maximization methods are prone to suboptimal convergence.
This can be attributed to the limitation of the hypothesis space. A potential improvement can be attained by learning stochastic online variants. However, the vanilla stochastic methods also suffer from slow convergence because of the high variance introduced by the stochastic process. We address the problem of slow convergence by developing a fast-converging stochastic AUC maximization algorithm. The proposed stochastic algorithm is accelerated using a unique combination of scheduled regularization update and scheduled averaging. The experimental results show that the proposed algorithm performs better than the state-of-the-art online and stochastic AUC maximization methods in terms of AUC classification accuracy. Moreover, we develop a proximal variant of our accelerated stochastic AUC maximization algorithm. The proposed method applies the proximal operator to the hinge loss function. Therefore, it evaluates the gradient of the loss function at the approximated weight vector. Experiments on several benchmark datasets show that our proximal algorithm converges to the optimal solution faster than the previous AUC maximization algorithms.
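The role of a fixed-size buffer in online AUC maximization can be sketched as follows. This is a simplified first-order variant with a pairwise hinge loss and reservoir-sampled buffers, written only to illustrate the idea; it is not the thesis's second-order confidence-weighted method, and the class name, hyperparameters, and synthetic data are all hypothetical.

```python
import numpy as np

class BufferedOnlineAUC:
    """First-order online pairwise hinge learner with per-class reservoir buffers."""

    def __init__(self, dim, buffer_size=50, lr=0.05, seed=0):
        self.w = np.zeros(dim)
        self.lr = lr
        self.buffer_size = buffer_size
        self.buffers = {0: [], 1: []}   # stored examples per class
        self.seen = {0: 0, 1: 0}        # examples seen so far per class
        self.rng = np.random.default_rng(seed)

    def _store(self, x, y):
        # Reservoir sampling keeps a uniform sample of each class in bounded memory.
        self.seen[y] += 1
        buf = self.buffers[y]
        if len(buf) < self.buffer_size:
            buf.append(x)
        else:
            k = int(self.rng.integers(self.seen[y]))
            if k < self.buffer_size:
                buf[k] = x

    def update(self, x, y):
        y = int(y)
        sign = 1.0 if y == 1 else -1.0
        opposite = self.buffers[1 - y]
        grad = np.zeros_like(self.w)
        for z in opposite:
            # Pairwise hinge: penalise pairs whose ranking margin is below 1.
            if sign * (self.w @ (x - z)) < 1.0:
                grad -= sign * (x - z)
        if opposite:
            self.w -= self.lr * grad / len(opposite)
        self._store(x, y)

def auc(scores, labels):
    """Pairwise AUC with ties counted as one half."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

# Synthetic, well-separated two-class stream (illustrative only).
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 2)) + np.where(y[:, None] == 1, 2.0, -2.0)

model = BufferedOnlineAUC(dim=2)
for xi, yi in zip(X, y):
    model.update(xi, yi)
print(auc(X @ model.w, y))
```

Because the AUC objective is defined over pairs of examples from opposite classes, each incoming example must be compared against stored examples of the other class. The buffer bounds that cost at `buffer_size` comparisons per update regardless of stream length.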
- …