9 research outputs found

Subgroup Discovery with Proper Scoring Rules

Author: Flach Peter
Kalogridis Georgios
Kull Meelis
Song Hao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Crossref

Explore Bristol Research

Subgroup Discovery: Real-World Applications

Author: Carmona C. J.
Elizondo David
Publication venue: Technical Report
Publication date: 01/03/2011

Subgroup discovery is a data mining technique that extracts interesting rules with respect to a target variable. An important characteristic of this task is the combination of predictive and descriptive induction. This paper gives an overview of subgroup discovery. In addition, it presents different real-world applications solved through evolutionary algorithms, which show the suitability and potential of this type of algorithm for the development of subgroup discovery algorithms.
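As a rough illustration of what "interesting with respect to a target variable" can mean, the sketch below scores one candidate rule with weighted relative accuracy (WRAcc), a quality measure commonly used in subgroup discovery. The abstract does not name a measure, so the choice of WRAcc, the function name `wracc`, and the toy records are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def wracc(rows: List[Row],
          condition: Callable[[Row], bool],
          target: Callable[[Row], bool]) -> float:
    """Weighted relative accuracy of the rule 'condition -> target'.

    WRAcc = coverage * (P(target | condition) - P(target)).
    Assumption: this measure is chosen for illustration only.
    """
    n = len(rows)
    if n == 0:
        return 0.0
    covered = [r for r in rows if condition(r)]
    if not covered:
        return 0.0
    coverage = len(covered) / n
    p_target_in_subgroup = sum(1 for r in covered if target(r)) / len(covered)
    p_target_overall = sum(1 for r in rows if target(r)) / n
    return coverage * (p_target_in_subgroup - p_target_overall)


# Hypothetical toy data: does the subgroup age == "young" stand out for buys == True?
rows = [
    {"age": "young", "buys": True},
    {"age": "young", "buys": True},
    {"age": "old", "buys": False},
    {"age": "old", "buys": True},
]
score = wracc(rows, lambda r: r["age"] == "young", lambda r: r["buys"])
print(score)
```

A positive score means the target is more frequent inside the subgroup than in the whole data set, weighted by how many records the subgroup covers.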

De Montfort University Open Research Archive

Discovering a taste for the unusual: exceptional models for preference mining

Author: Alípio Mário Jorge
Arno Knobbe
Carlos Soares
Cláudio Rebelo de Sá
CR Sá de
E Hüllermeier
F Chiclana
F M Harper
J Chomicki
L Umek
M Leeuwen van
N Jin
N Lavrac
P Brazdil
Paulo Azevedo
PJ Azevedo
V Svendová
W Duivesteijn
WD Cook
Wouter Duivesteijn
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Exceptional preferences mining (EPM) is a crossover between two subfields of data mining: local pattern mining and preference learning. EPM can be seen as a local pattern mining task that finds subsets of observations where some preference relations between labels significantly deviate from the norm. It is a variant of subgroup discovery, with rankings of labels as the target concept. We employ several quality measures that highlight subgroups featuring exceptional preferences, where the focus of what constitutes 'exceptional' varies with the quality measure: two measures look for exceptional overall ranking behavior, one measure indicates whether a particular label stands out from the rest, and a fourth measure highlights subgroups with unusual pairwise label ranking behavior. We explore a few datasets and compare with existing techniques. The results confirm that the new task EPM can deliver interesting knowledge. This research has received funding from the ECSEL Joint Undertaking, the framework programme for research and innovation Horizon 2020 (2014-2020) under Grant Agreement Number 662189-MANTIS-2014-1.

Universidade do Minho: RepositoriUM

Repository TU/e

Crossref

Pure OAI Repository

Leiden University Scholarly Publications

Exceptional Preferences Mining

Author: AM Jorge
CR Sá de
E Hüllermeier
H Mannila
J Fürnkranz
L Umek
N Jin
R Agrawal
S Henzgen
S Vembu
T Abudawood
T Van
V Dzyuba
W Duivesteijn
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Exceptional Preferences Mining (EPM) is a crossover between two subfields of data mining: local pattern mining and preference learning. EPM can be seen as a local pattern mining task that finds subsets of observations where the preference relations between subsets of the labels significantly deviate from the norm; it is a variant of Subgroup Discovery, with rankings as the (complex) target concept. We employ three quality measures that highlight subgroups featuring exceptional preferences, where the focus of what constitutes 'exceptional' varies with the quality measure: the first gauges exceptional overall ranking behavior, the second indicates whether a particular label stands out from the rest, and the third highlights subgroups featuring unusual pairwise label ranking behavior. As proof of concept, we explore five datasets. The results confirm that the new task EPM can deliver interesting knowledge. The results also illustrate how the visualization of the preferences in a Preference Matrix can aid in interpreting exceptional preference subgroups.

Crossref

Ghent University Academic Bibliography

University of Twente Research Information

Improving Predictions of Multiple Binary Models in ILP

Author: Tarek Abudawood
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2014

Despite the success of ILP systems in learning first-order rules from a small number of examples and complexly structured data in various domains, they struggle with multiclass problems. In most cases they reduce a multiclass problem to multiple black-box binary problems, following the one-versus-one or one-versus-rest binarisation techniques, and learn a theory for each one. When evaluating the learned theories of multiclass problems, particularly in the one-versus-rest paradigm, there is a bias caused by the default rule toward the negative classes, leading to unrealistically high performance as well as a lack of prediction integrity between the theories. Here we discuss the problem of using the one-versus-rest binarisation technique when evaluating multiclass data and propose several methods to remedy this problem. We also illustrate the methods and highlight their link to binary trees and Formal Concept Analysis (FCA). Our methods allow learning of a simple, consistent, and reliable multiclass theory by combining the rules of the multiple one-versus-rest theories into one rule list or rule set theory. Empirical evaluation over a number of data sets shows that our proposed methods produce coherent and accurate rule models from the rules learned by the ILP system Aleph.
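The default-rule bias mentioned in this abstract can be illustrated with a small calculation. The sketch below is not the paper's method; it assumes a hypothetical balanced four-class data set and one-versus-rest theories that contain no rules at all, so every example falls through to the default "negative" prediction. The names `default_rule_accuracy` and `labels` are made up for this example.

```python
from typing import Dict, List, Optional


def default_rule_accuracy(labels: List[str], positive_class: str) -> float:
    """Accuracy of a one-versus-rest theory with no rules.

    Such a theory predicts 'negative' for every example, so it is
    correct exactly on the examples that are not the positive class.
    """
    negatives = sum(1 for y in labels if y != positive_class)
    return negatives / len(labels)


# Hypothetical balanced data set: four classes, ten examples each.
labels = ["a"] * 10 + ["b"] * 10 + ["c"] * 10 + ["d"] * 10

# Per-theory binary accuracy when only the default rule ever fires.
binary_accuracies: Dict[str, float] = {
    c: default_rule_accuracy(labels, c) for c in sorted(set(labels))
}

# Viewed as one multiclass predictor, no theory ever claims an example,
# so no class is predicted and nothing is classified correctly.
predictions: List[Optional[str]] = [None for _ in labels]
multiclass_accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

print(binary_accuracies)
print(multiclass_accuracy)
```

Each binary theory looks accurate on three quarters of the data while the combined multiclass prediction is useless, which is the kind of gap that evaluating the theories jointly, as one rule list or rule set, is meant to expose.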

Crossref

Directory of Open Access Journals

Diverse subgroup set discovery

Author: A Knobbe
A Mitchell-Jones
Arno Knobbe
G Garriga
G Webb
H Grosskreutz
H Heikinheimo
H Peng
J Friedman
J Han
J Vreeken
M Leeuwen van
Matthijs van Leeuwen
N Lavrač
P Clark
P Grünwald
P Kralj Novak
S Bay
S Kullback
T Cover
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref

Anytime Discovery of a Diverse Set of Patterns with Monte Carlo Tree Search

Author: Bosc Guillaume
Boulicaut Jean-François
Kaytoue Mehdi
Raïssi Chedy
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

The discovery of patterns that accurately discriminate one class label from another remains a challenging data mining task. Subgroup discovery (SD) is one of the frameworks that make it possible to elicit such interesting patterns from labeled data. A question remains fairly open: how to select an accurate heuristic search technique when exhaustive enumeration of the pattern space is infeasible? Existing approaches make use of beam search, sampling, and genetic algorithms for discovering a pattern set that is non-redundant and of high quality w.r.t. a pattern quality measure. We argue that such approaches produce pattern sets that lack diversity: only a few patterns of high quality, and different enough, are discovered. Our main contribution is then to formally define pattern mining as a game and to solve it with Monte Carlo tree search (MCTS). It can be seen as an exhaustive search guided by random simulations which can be stopped early (limited budget) by virtue of its best-first search property. We show through a comprehensive set of experiments how MCTS enables the anytime discovery of a diverse pattern set of high quality. It outperforms other approaches when dealing with a large pattern search space and for different quality measures. Thanks to its genericity, our MCTS approach can be used for SD but also for many other pattern mining tasks.

INRIA a CCSD electronic archive server

Evaluation Measures for Multi-class Subgroup Discovery

Author: B. Kijsirikul
C.W. Hsu
F. Fleuret
H. Bostrom
I.H. Witten
J. Demšar
J. Fürnkranz
J.C. Platt
J.H. Friedman
N. Lavrač
P. Clark
R.E. Schapire
W. Klösgen
X. Jin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Crossref

Explore Bristol Research