3,794 research outputs found

    Learning Interpretable Rules for Multi-label Classification

    Full text link
    Multi-label classification (MLC) is a supervised learning problem in which, contrary to standard multiclass classification, an instance can be associated with several class labels simultaneously. In this chapter, we advocate a rule-based approach to multi-label classification. Rule learning algorithms are often employed when one is not only interested in accurate predictions, but also requires an interpretable theory that can be understood, analyzed, and qualitatively evaluated by domain experts. Ideally, by revealing patterns and regularities contained in the data, a rule-based theory yields new insights in the application domain. Recently, several authors have started to investigate how rule-based models can be used for modeling multi-label data. Discussing this task in detail, we highlight some of the problems that make rule learning considerably more challenging for MLC than for conventional classification. While mainly focusing on our own previous work, we also provide a short overview of related work in this area.Comment: Preprint version. To appear in: Explainable and Interpretable Models in Computer Vision and Machine Learning. The Springer Series on Challenges in Machine Learning. Springer (2018). See http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further informatio

    Subgroup Discovery: Real-World Applications

    Get PDF
    Subgroup discovery is a data mining technique which extracts interesting rules with respect to a target variable. An important characteristic of this task is the combination of predictive and descriptive induction. In this paper, an overview about subgroup discovery is performed. In addition, di erent real-world applications solved through evolutionary algorithms where the suitability and potential of this type of algorithms for the development of subgroup discovery algorithms are presented

    Multiobjective Evolutionary Induction of Subgroup Discovery Fuzzy Rules: A Case Study in Marketing

    Get PDF
    This paper presents a multiobjective genetic algorithm which obtains fuzzy rules for subgroup discovery in disjunctive normal form. This kind of fuzzy rules lets us represent knowledge about patterns of interest in an explanatory and understandable form which can be used by the expert. The evolutionary algorithm follows a multiobjective approach in order to optimize in a suitable way the different quality measures used in this kind of problems. Experimental evaluation of the algorithm, applying it to a market problem studied in the University of Mondragón (Spain), shows the validity of the proposal. The application of the proposal to this problem allows us to obtain novel and valuable knowledge for the experts.Spanish Ministry of Science and TechnologyFEDER TIC-2005-08386-C05-01 and TIC-2005- 08386-C05-03TIN2004-20061-E and TIN2004-21343-

    Discovering patterns in a survey of secondary injuries due to agricultural assistive technology

    Get PDF
    The research is motivated by the need for hazard assessment in agriculture field. A small and highly-imbalanced dataset, in which negative instances heavily outnumber positive instances, is derived from a survey of secondary injuries induced by implementation of agriculture assistive technology which assists farmers with injuries or disabilities to continue farm-related work. Three data mining approaches are applied to the imbalanced dataset in order to discover patterns contributing to secondary injuries. All of patterns discovered by the three approaches are compared according to three evaluation measurements: support, confidence and lift, and potentially most interesting patterns are found. Compared to graphical exploratory analysis which figures out causative factors by evaluating the single effects of attributes on the occurrence of secondary injuries, decision tree algorithm and subgroup discovery algorithms are able to find combinational factors by evaluating the interactive effects of attributes on the occurrence of secondary injuries. Graphical exploratory analysis is able to find patterns with highest support and subgroup discovery algorithms are good at finding high lift patterns. In addition, the experimental analysis of applying subgroup discovery to our secondary injury dataset demonstrates subgroup discovery method\u27s capability of dealing with imbalanced datasets. Therefore, identifying risk factors contributing to secondary injuries, as well as providing a useful alternative method (subgroup discovery) of dealing with small and highly-imbalanced datasets are important outcomes of this thesis

    Mining Characteristic Patterns for Comparative Music Corpus Analysis

    Get PDF
    A core issue of computational pattern mining is the identification of interesting patterns. When mining music corpora organized into classes of songs, patterns may be of interest because they are characteristic, describing prevalent properties of classes, or because they are discriminant, capturing distinctive properties of classes. Existing work in computational music corpus analysis has focused on discovering discriminant patterns. This paper studies characteristic patterns, investigating the behavior of different pattern interestingness measures in balancing coverage and discriminability of classes in top k pattern mining and in individual top ranked patterns. Characteristic pattern mining is applied to the collection of Native American music by Frances Densmore, and the discovered patterns are shown to be supported by Densmore’s own analyses

    SDRDPy: An application to graphically visualize the knowledge obtained with supervised descriptive rule algorithms

    Full text link
    SDRDPy is a desktop application that allows experts an intuitive graphic and tabular representation of the knowledge extracted by any supervised descriptive rule discovery algorithm. The application is able to provide an analysis of the data showing the relevant information of the data set and the relationship between the rules, data and the quality measures associated for each rule regardless of the tool where algorithm has been executed. All of the information is presented in a user-friendly application in order to facilitate expert analysis and also the exportation of reports in different formats
    corecore