12 research outputs found

    Discovering a taste for the unusual: exceptional models for preference mining

    Exceptional preferences mining (EPM) is a crossover between two subfields of data mining: local pattern mining and preference learning. EPM can be seen as a local pattern mining task that finds subsets of observations where some preference relations between labels significantly deviate from the norm. It is a variant of subgroup discovery, with rankings of labels as the target concept. We employ several quality measures that highlight subgroups featuring exceptional preferences, where the focus of what constitutes 'exceptional' varies with the quality measure: two measures look for exceptional overall ranking behavior, one measure indicates whether a particular label stands out from the rest, and a fourth measure highlights subgroups with unusual pairwise label ranking behavior. We explore a few datasets and compare with existing techniques. The results confirm that the new task EPM can deliver interesting knowledge. This research has received funding from the ECSEL Joint Undertaking, the framework programme for research and innovation Horizon 2020 (2014-2020), under Grant Agreement Number 662189-MANTIS-2014-1.
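
    To make the idea concrete, here is a toy sketch (not the paper's actual quality measures) of one flavor of exceptionality: checking whether a single label's average rank inside a subgroup deviates from its average rank over the whole dataset. The function name and the rank encoding are illustrative assumptions.

```python
# Illustrative sketch only: a toy "does one label stand out?" quality
# measure in the spirit of EPM, not the paper's actual formulation.
# Rankings are arrays where ranks[n, i] is the rank of label i (0 = best).
import numpy as np

def label_deviation_quality(all_ranks: np.ndarray, subgroup_mask: np.ndarray) -> float:
    """Largest absolute gap between a label's mean rank inside the
    subgroup and its mean rank over the whole dataset."""
    overall_mean = all_ranks.mean(axis=0)            # mean rank per label, globally
    subgroup_mean = all_ranks[subgroup_mask].mean(axis=0)
    return float(np.max(np.abs(subgroup_mean - overall_mean)))

# Tiny example: 6 observations, 3 labels; the last two observations
# (the "subgroup") rank label 2 much higher than the norm.
ranks = np.array([[0, 1, 2], [0, 1, 2], [1, 0, 2], [0, 2, 1],
                  [2, 1, 0], [1, 2, 0]])
mask = np.array([False, False, False, False, True, True])
print(label_deviation_quality(ranks, mask))          # ~1.17: label 2 stands out
```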

    Exceptional Preferences Mining

    Exceptional Preferences Mining (EPM) is a crossover between two subfields of data mining: local pattern mining and preference learning. EPM can be seen as a local pattern mining task that finds subsets of observations where the preference relations between subsets of the labels significantly deviate from the norm; a variant of Subgroup Discovery, with rankings as the (complex) target concept. We employ three quality measures that highlight subgroups featuring exceptional preferences, where the focus of what constitutes 'exceptional' varies with the quality measure: the first gauges exceptional overall ranking behavior, the second indicates whether a particular label stands out from the rest, and the third highlights subgroups featuring unusual pairwise label ranking behavior. As proof of concept, we explore five datasets. The results confirm that the new task EPM can deliver interesting knowledge. The results also illustrate how the visualization of the preferences in a Preference Matrix can aid in interpreting exceptional preference subgroups.
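
    The Preference Matrix mentioned above can be sketched in a few lines. The following hypothetical snippet (not the authors' tooling) builds the matrix whose entry P[i, j] is the fraction of rankings that place label i before label j, which is the quantity one would then visualize.

```python
# Hypothetical sketch of a Preference Matrix: P[i, j] is the fraction of
# rankings in which label i is preferred to (ranked before) label j.
import numpy as np

def preference_matrix(ranks: np.ndarray) -> np.ndarray:
    n_labels = ranks.shape[1]
    P = np.zeros((n_labels, n_labels))
    for i in range(n_labels):
        for j in range(n_labels):
            if i != j:
                P[i, j] = np.mean(ranks[:, i] < ranks[:, j])
    return P

ranks = np.array([[0, 1, 2], [0, 2, 1], [1, 0, 2]])
print(preference_matrix(ranks))   # e.g. P[0, 1] = 2/3: label 0 beats label 1 in 2 of 3 rankings
```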

    Pattern Mining for Label Ranking

    Preferences have always been present in many tasks in our daily lives. Buying the right car, choosing a suitable house or even deciding on the food to eat are trivial examples of decisions that reveal information, explicitly or implicitly, about our preferences. The recent trend of collecting increasing amounts of data also holds for preference data. Extracting and modeling preferences can provide us with invaluable information about the choices of groups or individuals. In areas like e-commerce, which typically deal with decisions from thousands of users, the acquisition of preferences can be a difficult task. For these reasons, artificial intelligence (in particular, machine learning) methods have been increasingly important to the discovery and automatic learning of models about preferences. In this Ph.D. project, several approaches were analyzed and proposed to deal with the label ranking (LR) problem, most of which focused on pattern mining methods.

    Preference Learning

    This report documents the program and the outcomes of Dagstuhl Seminar 14101 “Preference Learning”. Preferences have recently received considerable attention in disciplines such as machine learning, knowledge discovery, information retrieval, statistics, social choice theory, multiple criteria decision making, decision under risk and uncertainty, operations research, and others. The motivation for this seminar was to showcase recent progress in these different areas with the goal of working towards a common basis of understanding, which should help to facilitate future synergies.

    Mixture-Based Probabilistic Graphical Models for the Label Ranking Problem

    The goal of the Label Ranking (LR) problem is to learn preference models that predict the preferred ranking of class labels for a given unlabeled instance. Different well-known machine learning algorithms have been adapted to deal with the LR problem. In particular, fine-tuned instance-based algorithms (e.g., k-nearest neighbors) and model-based algorithms (e.g., decision trees) have performed remarkably well in tackling the LR problem. Probabilistic Graphical Models (PGMs, e.g., Bayesian networks) have not been considered to deal with this problem because of the difficulty of modeling permutations in that framework. In this paper, we propose a Hidden Naive Bayes classifier (HNB) to cope with the LR problem. By introducing a hidden variable, we can design a hybrid Bayesian network in which several types of distributions can be combined: multinomial for discrete variables, Gaussian for numerical variables, and Mallows for permutations. We consider two kinds of probabilistic models: one based on a Naive Bayes graphical structure (where only univariate probability distributions are estimated for each state of the hidden variable) and another where we allow interactions among the predictive attributes (using a multivariate Gaussian distribution for the parameter estimation). The experimental evaluation shows that our proposals are competitive with the state-of-the-art algorithms in both accuracy and CPU time requirements.
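
    For readers unfamiliar with the Mallows model, the following minimal sketch shows the distribution over permutations that the paper plugs into its hybrid network; the HNB structure and parameter estimation themselves are not reproduced here, and the helper names are our own.

```python
# Minimal sketch of the Mallows model over permutations:
# P(pi | center, theta) = exp(-theta * d(pi, center)) / Z(theta),
# where d is the Kendall tau distance and theta >= 0 is a spread parameter.
import math
from itertools import combinations

def kendall_tau_distance(pi, sigma):
    """Number of label pairs ordered differently by the two rankings
    (pi[i] is the rank of label i, likewise sigma)."""
    return sum(1 for a, b in combinations(range(len(pi)), 2)
               if (pi[a] < pi[b]) != (sigma[a] < sigma[b]))

def mallows_pmf(pi, center, theta):
    n = len(pi)
    # Z(theta) factorizes because the Kendall distance decomposes into
    # independent terms V_j taking values 0..j.
    Z = math.prod(sum(math.exp(-theta * r) for r in range(j + 1))
                  for j in range(1, n))
    return math.exp(-theta * kendall_tau_distance(pi, center)) / Z

print(mallows_pmf([0, 1, 2], center=[0, 1, 2], theta=1.0))  # modal ranking, highest mass
```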

    Review of Extreme Multilabel Classification

    Extreme multilabel classification, or XML, is an active area of interest in machine learning. Compared to traditional multilabel classification, the number of labels is extremely large; hence the name. Classical one-versus-all classification won't scale in this case due to the large number of labels, and the same is true for most other classifiers. Embedding the labels, as well as the features, into a smaller space is an essential first step. Other issues include the existence of head and tail labels, where tail labels are labels that occur in relatively few samples; their existence creates difficulties during embedding. This area has invited a wide range of approaches: bit compression motivated by compressed sensing, tree-based embeddings, deep-learning-based latent space embeddings (including the use of attention weights), linear-algebra-based embeddings such as SVD, clustering, and hashing, to name a few. The community has also developed a useful set of metrics to correctly evaluate predictions for head and tail labels.
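
    As a hedged illustration of the linear-algebra family mentioned above, the sketch below compresses a sparse binary label matrix with a truncated SVD and decodes it back; a real XML system would additionally learn a mapping from features to the embedded space. All names and sizes are made up for the example.

```python
# Sketch of SVD-based label embedding for XML: compress a binary label
# matrix Y (n samples x L labels) into k dimensions and reconstruct.
import numpy as np

rng = np.random.default_rng(0)
Y = (rng.random((1000, 500)) < 0.01).astype(float)  # sparse binary labels

k = 50
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
Z = Y @ Vt[:k].T            # n x k embedded labels (the regression targets)
Y_hat = Z @ Vt[:k]          # decode back to the full label space
print(np.abs(Y - Y_hat).mean())   # reconstruction error of the embedding
```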

    Tackling the supervised label ranking problem by bagging weak learners

    Preference learning is the branch of machine learning in charge of inducing preference models from data. In this paper we focus on the task known as the label ranking problem, whose goal is to predict a ranking among the different labels the class variable can take. Our contribution is twofold: (i) taking as a basis the tree-based algorithm LRT described in [1], we design weaker tree-based models which can be learnt more efficiently; and (ii) we show that bagging these weak learners improves not only the LRT algorithm, but also the state-of-the-art one (IBLR [1]). Furthermore, the bagging algorithm which takes the weak LRT-based models as base classifiers is competitive in runtime with the LRT and IBLR methods. To check the goodness of our proposal, we conduct a broad experimental study over the standard benchmark used in the label ranking literature.
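
    The bagging scheme can be illustrated with a deliberately weak base learner. The sketch below uses a hypothetical 1-nearest-neighbor ranker in place of the paper's weakened LRT trees, and combines predictions by average rank (Borda-style), which is one common aggregation choice rather than necessarily the one used in the paper.

```python
# Illustrative bagging of weak label rankers: each bootstrap replica
# trains a ranker; predictions are combined by average rank (Borda).
import numpy as np

class OneNNRanker:
    def fit(self, X, ranks):
        self.X, self.ranks = X, ranks
        return self
    def predict(self, x):
        nearest = np.argmin(np.linalg.norm(self.X - x, axis=1))
        return self.ranks[nearest]

def bagged_ranking(X, ranks, x, n_estimators=25, seed=0):
    rng = np.random.default_rng(seed)
    rank_sum = np.zeros(ranks.shape[1])
    for _ in range(n_estimators):
        idx = rng.integers(0, len(X), size=len(X))   # bootstrap sample
        rank_sum += OneNNRanker().fit(X[idx], ranks[idx]).predict(x)
    return np.argsort(np.argsort(rank_sum))          # average rank -> final ranking

X = np.array([[0.0], [1.0], [2.0], [3.0]])
ranks = np.array([[0, 1, 2], [0, 2, 1], [1, 2, 0], [2, 1, 0]])
print(bagged_ranking(X, ranks, np.array([2.5])))
```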

    A Survey on Extreme Multi-label Learning

    Multi-label learning has attracted significant attention from both academia and industry in recent decades. Although existing multi-label learning algorithms have achieved good performance in various tasks, they implicitly assume that the size of the target label space is not huge, which can be restrictive for real-world scenarios. Moreover, it is infeasible to directly adapt them to an extremely large label space because of the compute and memory overhead. Therefore, eXtreme Multi-label Learning (XML) is becoming an important task, and many effective approaches have been proposed. To fully understand XML, we conduct a survey study in this paper. We first give a formal definition of XML from the perspective of supervised learning. Then, based on different model architectures and challenges of the problem, we provide a thorough discussion of the advantages and disadvantages of each category of methods. For the benefit of empirical studies, we collect abundant resources regarding XML, including code implementations and useful tools. Lastly, we propose possible research directions in XML, such as new evaluation metrics, the tail label problem, and weakly supervised XML.
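
    Both XML surveys above discuss evaluation metrics. As a small grounding example, here is the standard precision@k metric (our own minimal implementation): the fraction of the k highest-scored labels that are actually relevant. Plain P@k tends to be dominated by head labels, which is one motivation for the tail-aware metrics the survey calls for.

```python
# Standard XML evaluation metric precision@k: how many of the k
# highest-scored labels are actually relevant for the instance.
import numpy as np

def precision_at_k(scores: np.ndarray, relevant: np.ndarray, k: int) -> float:
    top_k = np.argsort(scores)[::-1][:k]   # indices of the k best-scored labels
    return float(relevant[top_k].mean())

scores = np.array([0.9, 0.1, 0.8, 0.3, 0.7])
relevant = np.array([1, 0, 0, 1, 1])       # ground-truth label indicators
print(precision_at_k(scores, relevant, k=3))   # 2 of top 3 are relevant -> ~0.667
```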