Search CORE

55,707 research outputs found

Improving the Interpretability of Classification Rules Discovered by an Ant Colony Algorithm: Extended Results

Author: Freitas Alex A.
Otero Fernando E.B.
Publication venue: 'MIT Press - Journals'
Publication date: 01/09/2016
Field of study

The vast majority of Ant Colony Optimization (ACO) algorithms for inducing classification rules use an ACO-based procedure to create a rule in an one-at-a-time fashion. An improved search strategy has been proposed in the cAnt-MinerPB algorithm, where an ACO-based procedure is used to create a complete list of rules (ordered rules)-i.e., the ACO search is guided by the quality of a list of rules, instead of an individual rule. In this paper we propose an extension of the cAnt-MinerPB algorithm to discover a set of rules (unordered rules). The main motivations for this work are to improve the interpretation of individual rules by discovering a set of rules and to evaluate the impact on the predictive accuracy of the algorithm. We also propose a new measure to evaluate the interpretability of the discovered rules to mitigate the fact that the commonly-used model size measure ignores how the rules are used to make a class prediction. Comparisons with state-of-the-art rule induction algorithms, support vector machines and the cAnt-MinerPB producing ordered rules are also presented

Crossref

Kent Academic Repository

Recommended from our members

Improving music genre classification using automatically induced harmony rules

Author: Anglade A.
Benetos E.
Dixon S.
Mauch M.
Publication venue: 'Informa UK Limited'
Publication date: 01/12/2010
Field of study

We present a new genre classification framework using both low-level signal-based features and high-level harmony features. A state-of-the-art statistical genre classifier based on timbral features is extended using a first-order random forest containing for each genre rules derived from harmony or chord sequences. This random forest has been automatically induced, using the first-order logic induction algorithm TILDE, from a dataset, in which for each chord the degree and chord category are identified, and covering classical, jazz and pop genre classes. The audio descriptor-based genre classifier contains 206 features, covering spectral, temporal, energy, and pitch characteristics of the audio signal. The fusion of the harmony-based classifier with the extracted feature vectors is tested on three-genre subsets of the GTZAN and ISMIR04 datasets, which contain 300 and 448 recordings, respectively. Machine learning classifiers were tested using 5 × 5-fold cross-validation and feature selection. Results indicate that the proposed harmony-based rules combined with the timbral descriptor-based genre classification system lead to improved genre classification rates

City Research Online

Crossref

Recommended from our members

Improving music genre classification using automatically induced harmony rules

Author: Amélie Anglade
Aucouturier J.-J.
Cataltepe Z.
Emmanouil Benetos
Fukunaga K.
Lawson C. L.
Matthias Mauch
Piston W.
Pérez-Sancho C.
Quinlan J. R.
Schölkopf B.
Simon Dixon
Tzanetakis G.
van der Hedjen F.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2009
Field of study

City Research Online

Crossref

Ghent University Academic Bibliography

University of Miami: Scholarship Miami

The University of Manchester - Institutional Repository

Radboud Repository

QCBA: Postoptimization of Quantitative Attributes in Classifiers based on Association Rules

Author: Kliegr Tomas
Publication venue
Publication date: 18/10/2019
Field of study

The need to prediscretize numeric attributes before they can be used in association rule learning is a source of inefficiencies in the resulting classifier. This paper describes several new rule tuning steps aiming to recover information lost in the discretization of numeric (quantitative) attributes, and a new rule pruning strategy, which further reduces the size of the classification models. We demonstrate the effectiveness of the proposed methods on postoptimization of models generated by three state-of-the-art association rule classification algorithms: Classification based on Associations (Liu, 1998), Interpretable Decision Sets (Lakkaraju et al, 2016), and Scalable Bayesian Rule Lists (Yang, 2017). Benchmarks on 22 datasets from the UCI repository show that the postoptimized models are consistently smaller -- typically by about 50% -- and have better classification performance on most datasets

arXiv.org e-Print Archive

Highly Relevant Routing Recommendation Systems for Handling Few Data Using MDL Principle and Embedded Relevance Boosting Factors

Author: Prasetya I. S. W. B.
Puspitaningrum Diyah
Wicaksono P. A.
Publication venue
Publication date: 18/04/2018
Field of study

A route recommendation system can provide better recommendation if it also takes collected user reviews into account, e.g. places that generally get positive reviews may be preferred. However, to classify sentiment, many classification algorithms existing today suffer in handling small data items such as short written reviews. In this paper we propose a model for a strongly relevant route recommendation system that is based on an MDL-based (Minimum Description Length) sentiment classification and show that such a system is capable of handling small data items (short user reviews). Another highlight of the model is the inclusion of a set of boosting factors in the relevance calculation to improve the relevance in any recommendation system that implements the model.Comment: ACM SIGIR 2018 Workshop on Learning from Limited or Noisy Data for Information Retrieval (LND4IR'18), July 12, 2018, Ann Arbor, Michigan, USA, 8 pages, 9 figure

arXiv.org e-Print Archive

Utrecht University Repository

A Shapelet Transform for Time Series Classification

Author: Bagnall A
Davis L
Hills J
Lines J
Publication venue
Publication date: 14/08/2012
Field of study

University of East Anglia digital repository

GENESIM : genetic extraction of a single, interpretable model

Author: De Turck Filip
Janssens Olivier
Ongenae Femke
Van Hoecke Sofie
Vandewiele Gilles
Publication venue
Publication date: 01/01/2016
Field of study

Models obtained by decision tree induction techniques excel in being interpretable.However, they can be prone to overfitting, which results in a low predictive performance. Ensemble techniques are able to achieve a higher accuracy. However, this comes at a cost of losing interpretability of the resulting model. This makes ensemble techniques impractical in applications where decision support, instead of decision making, is crucial. To bridge this gap, we present the GENESIM algorithm that transforms an ensemble of decision trees to a single decision tree with an enhanced predictive performance by using a genetic algorithm. We compared GENESIM to prevalent decision tree induction and ensemble techniques using twelve publicly available data sets. The results show that GENESIM achieves a better predictive performance on most of these data sets than decision tree induction techniques and a predictive performance in the same order of magnitude as the ensemble techniques. Moreover, the resulting model of GENESIM has a very low complexity, making it very interpretable, in contrast to ensemble techniques.Comment: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex System

arXiv.org e-Print Archive

Ghent University Academic Bibliography

Archivsystem Ask23

An Experimental Evaluation of Nearest Neighbour Time Series Classification

Author: Bagnall Anthony
Lines Jason
Publication venue
Publication date: 01/01/2014
Field of study

Data mining research into time series classification (TSC) has focussed on alternative distance measures for nearest neighbour classifiers. It is standard practice to use 1-NN with Euclidean or dynamic time warping (DTW) distance as a straw man for comparison. As part of a wider investigation into elastic distance measures for TSC~\cite{lines14elastic}, we perform a series of experiments to test whether this standard practice is valid. Specifically, we compare 1-NN classifiers with Euclidean and DTW distance to standard classifiers, examine whether the performance of 1-NN Euclidean approaches that of 1-NN DTW as the number of cases increases, assess whether there is any benefit of setting

k

for

k

-NN through cross validation whether it is worth setting the warping path for DTW through cross validation and finally is it better to use a window or weighting for DTW. Based on experiments on 77 problems, we conclude that 1-NN with Euclidean distance is fairly easy to beat but 1-NN with DTW is not, if window size is set through cross validation

arXiv.org e-Print Archive

CiteSeerX

University of East Anglia digital repository