Cost-sensitive boosting: A unified approach

Author: Nikolaou Nikolaos
Publication date: 01/08/2017

The University of Manchester - Institutional Repository

Tune and mix: learning to rank using ensembles of calibrated multi-class classifiers

Authors: Busa-Fekete Róbert, Kégl Balázs, Éltető Tamás, Szarvas György
Publication venue: Springer
Publication date: 01/01/2013

ANR-2010-COSI-002

In subset ranking, the goal is to learn a ranking function that approximates a gold standard partial ordering of a set of objects (in our case, a set of documents retrieved for the same query). The partial ordering is given by relevance labels representing the relevance of documents with respect to the query on an absolute scale. Our approach consists of three simple steps. First, we train standard multi-class classifiers (AdaBoost.MH and multi-class SVM) to discriminate between the relevance labels. Second, the posteriors of multi-class classifiers are calibrated using probabilistic and regression losses in order to estimate the Bayes-scoring function which optimizes the Normalized Discounted Cumulative Gain (NDCG). In the third step, instead of selecting the best multi-class hyperparameters and the best calibration, we mix all the learned models in a simple ensemble scheme. Our extensive experimental study is itself a substantial contribution. We compare most of the existing learning-to-rank techniques on all of the available large-scale benchmark data sets using a standardized implementation of the NDCG score. We show that our approach is competitive with conceptually more complex listwise and pairwise methods, and clearly outperforms them as the data size grows. As a technical contribution, we clarify some of the confusing results related to the ambiguities of the evaluation tools, and propose guidelines for future studies.
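To make the NDCG target in this abstract concrete, here is a minimal Python sketch of one common NDCG variant (exponential gain, log2 discount). The abstract itself notes that evaluation tools differ, so this variant, the function names, and the zero-ideal convention are all assumptions for illustration rather than the paper's standardized implementation. The `expected_gain_score` helper shows one plausible way to turn calibrated class posteriors into a ranking score; that form is also an assumption and is not taken from the abstract.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[int], k: int) -> float:
    """Discounted cumulative gain of the first k items, in ranked order.

    Assumed variant: gain 2^rel - 1, discount log2(position + 1),
    with positions starting at 1.
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[int], k: int) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0.0:
        # Assumed convention: a query with no relevant documents scores 0.
        return 0.0
    return dcg_at_k(relevances, k) / ideal


def expected_gain_score(posteriors: Sequence[float]) -> float:
    """Hypothetical scoring rule: expected gain under calibrated posteriors.

    posteriors[l] is the estimated probability of relevance label l.
    """
    return sum(p * (2 ** label - 1) for label, p in enumerate(posteriors))
```

Documents for a query would then be sorted by descending score, and the relevance labels in that order passed to `ndcg_at_k`.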

HAL-IN2P3

Crossref

Publikationer från Linköpings universitet

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Learning from Imbalanced Multi-label Data Sets by Using Ensemble Strategies

Authors: Javidi Mohammad Masoud, Shamsezat Fatemeh
Publication venue: Faculty of Computer Science, Sriwijaya University
Publication date: 18/02/2015

Multi-label classification is an extension of conventional classification in which a single instance can be associated with multiple labels. Problems of this type are ubiquitous in everyday life; for example, a movie can be categorized as action, crime, and thriller. Most multi-label classification algorithms are designed for balanced data and don't work well on imbalanced data. In real applications, however, most datasets are imbalanced. Therefore, we focus on improving multi-label classification performance on imbalanced datasets. In this paper, a state-of-the-art multi-label classification algorithm called IBLR_ML is employed. This algorithm is a combination of the k-nearest neighbor and logistic regression algorithms. The logistic regression part of this algorithm is combined with two ensemble learning algorithms, Bagging and Boosting. Our approach is called IB-ELR. In this paper, for the first time, the ensemble bagging method with a stable learning algorithm as the base learner and imbalanced data sets as the training data is examined. Finally, to evaluate the proposed methods, they are implemented in Java. Experimental results show the effectiveness of the proposed methods. Keywords: Multi-label classification, Imbalanced data set, Ensemble learning, Stable algorithm, Logistic regression, Bagging, Boosting
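The idea of bagging a logistic regression base learner can be sketched as follows. This is a toy, single-label, pure-Python illustration and not the IB-ELR method described in the abstract (which builds on IBLR_ML and was implemented in Java). All function names, the learning rate, the epoch count, and the number of models are hypothetical choices. For multi-label data, one such ensemble could be trained per label, which is again an assumption.

```python
import math
import random
from typing import List, Sequence, Tuple

Model = Tuple[List[float], float]  # (weights, bias)


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(model: Model, x: Sequence[float]) -> float:
    """Probability of the positive class under one logistic model."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.1, epochs: int = 300) -> Model:
    """Fit logistic regression by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba((w, b), xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def fit_bagging(X: Sequence[Sequence[float]], y: Sequence[int],
                n_models: int = 10, seed: int = 0) -> List[Model]:
    """Train each model on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_logistic([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_proba(models: Sequence[Model], x: Sequence[float]) -> float:
    """Average the positive-class probabilities of all bagged models."""
    return sum(predict_proba(m, x) for m in models) / len(models)
```

Plain bootstrap sampling does nothing to correct class imbalance; a resampling scheme that balances classes within each bag would be a separate design choice that the abstract does not specify.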

ComEngApp-Journal

Computer Engineering and Applications Journal (ComEngApp, Universitas Sriwijaya)