Learning Interpretable Rules for Multi-label Classification
Multi-label classification (MLC) is a supervised learning problem in which,
contrary to standard multiclass classification, an instance can be associated
with several class labels simultaneously. In this chapter, we advocate a
rule-based approach to multi-label classification. Rule learning algorithms are
often employed when one is not only interested in accurate predictions, but
also requires an interpretable theory that can be understood, analyzed, and
qualitatively evaluated by domain experts. Ideally, by revealing patterns and
regularities contained in the data, a rule-based theory yields new insights in
the application domain. Recently, several authors have started to investigate
how rule-based models can be used for modeling multi-label data. Discussing
this task in detail, we highlight some of the problems that make rule learning
considerably more challenging for MLC than for conventional classification.
While mainly focusing on our own previous work, we also provide a short
overview of related work in this area.
Comment: Preprint version. To appear in: Explainable and Interpretable Models
in Computer Vision and Machine Learning. The Springer Series on Challenges in
Machine Learning. Springer (2018). See
http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further
information.
Subgroup Discovery: Real-World Applications
Subgroup discovery is a data mining technique which extracts interesting rules with respect
to a target variable. An important characteristic of this task is the combination of predictive
and descriptive induction. In this paper, an overview of subgroup discovery is presented.
In addition, different real-world applications solved through evolutionary algorithms are
presented, showing the suitability and potential of this type of algorithm for the development
of subgroup discovery algorithms.
Multiobjective Evolutionary Induction of Subgroup Discovery Fuzzy Rules: A Case Study in Marketing
This paper presents a multiobjective genetic algorithm which obtains
fuzzy rules for subgroup discovery in disjunctive normal form. This kind of
fuzzy rules lets us represent knowledge about patterns of interest in an
explanatory and understandable form which can be used by the expert. The
evolutionary algorithm follows a multiobjective approach in order to suitably
optimize the different quality measures used in this kind of problem.
Experimental evaluation of the algorithm on a marketing problem
studied at the University of Mondragón (Spain) shows the validity of the
proposal. The application of the proposal to this problem allows us to obtain
novel and valuable knowledge for the experts.
Spanish Ministry of Science and Technology; FEDER; TIC-2005-08386-C05-01 and TIC-2005-08386-C05-03; TIN2004-20061-E and TIN2004-21343-
Discovering patterns in a survey of secondary injuries due to agricultural assistive technology
The research is motivated by the need for hazard assessment in the agricultural field. A small and highly imbalanced dataset, in which negative instances heavily outnumber positive instances, is derived from a survey of secondary injuries induced by the implementation of agricultural assistive technology, which helps farmers with injuries or disabilities to continue farm-related work. Three data mining approaches are applied to the imbalanced dataset in order to discover patterns contributing to secondary injuries.
All patterns discovered by the three approaches are compared according to three evaluation measures: support, confidence and lift, and the potentially most interesting patterns are identified. Compared to graphical exploratory analysis, which identifies causative factors by evaluating the individual effects of attributes on the occurrence of secondary injuries, the decision tree algorithm and subgroup discovery algorithms are able to find combinations of factors by evaluating the interactive effects of attributes on the occurrence of secondary injuries. Graphical exploratory analysis is able to find the patterns with the highest support, while subgroup discovery algorithms are good at finding high-lift patterns.
In addition, the experimental analysis of applying subgroup discovery to our secondary injury dataset demonstrates the subgroup discovery method's capability of dealing with imbalanced datasets. Therefore, identifying risk factors contributing to secondary injuries, as well as providing a useful alternative method (subgroup discovery) for dealing with small and highly imbalanced datasets, are important outcomes of this thesis.
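As a rough illustration of the three evaluation measures named in this abstract, the sketch below computes support, confidence and lift for a single rule "antecedent -> target". It is not taken from the thesis: the function name `rule_metrics`, the toy data, and the choice to define support as the fraction of all instances that satisfy both antecedent and target are assumptions (some authors define support on the antecedent alone).

```python
from typing import Dict, Sequence


def rule_metrics(antecedent: Sequence[bool], target: Sequence[bool]) -> Dict[str, float]:
    """Support, confidence and lift of the rule 'antecedent -> target'.

    Assumed definitions (conventions differ between authors):
      support    = P(antecedent and target)
      confidence = P(target | antecedent)
      lift       = confidence / P(target)
    """
    n = len(target)
    if n == 0 or len(antecedent) != n:
        raise ValueError("inputs must be non-empty and of equal length")

    covered = sum(antecedent)    # instances matched by the rule body
    positives = sum(target)      # instances of the target class
    both = sum(a and t for a, t in zip(antecedent, target))

    support = both / n
    confidence = both / covered if covered else 0.0
    base_rate = positives / n
    lift = confidence / base_rate if base_rate else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical imbalanced toy data: 10 instances, only 2 positive.
target = [True, True, False, False, False, False, False, False, False, False]
# Hypothetical rule body covering 3 instances, 2 of them positive.
antecedent = [True, True, True, False, False, False, False, False, False, False]

metrics = rule_metrics(antecedent, target)
print(metrics)
```

On this toy data the rule has low support (0.2) but a lift above 3, which is the kind of pattern the abstract attributes to subgroup discovery: few instances covered, yet a much higher rate of the rare class than in the dataset as a whole.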
Mining Characteristic Patterns for Comparative Music Corpus Analysis
A core issue of computational pattern mining is the identification of interesting patterns. When mining music corpora organized into classes of songs, patterns may be of interest because they are characteristic, describing prevalent properties of classes, or because they are discriminant, capturing distinctive properties of classes. Existing work in computational music corpus analysis has focused on discovering discriminant patterns. This paper studies characteristic patterns, investigating the behavior of different pattern interestingness measures in balancing coverage and discriminability of classes in top-k pattern mining and in individual top-ranked patterns. Characteristic pattern mining is applied to the collection of Native American music by Frances Densmore, and the discovered patterns are shown to be supported by Densmore’s own analyses.
SDRDPy: An application to graphically visualize the knowledge obtained with supervised descriptive rule algorithms
SDRDPy is a desktop application that gives experts an intuitive graphical and
tabular representation of the knowledge extracted by any supervised descriptive
rule discovery algorithm. The application provides an analysis of the
data, showing the relevant information about the data set and the relationship
between the rules, the data and the quality measures associated with each rule,
regardless of the tool in which the algorithm was executed. All of the
information is presented in a user-friendly application in order to facilitate
expert analysis and the export of reports in different formats.