4 research outputs found

    Preceding rule induction with instance reduction methods

    A new prepruning technique for rule induction is presented which applies instance reduction before rule induction. An empirical evaluation records the predictive accuracy and size of rule sets generated from 24 datasets from the UCI Machine Learning Repository. Three instance reduction algorithms (Edited Nearest Neighbour (ENN), AllKnn and DROP5) are compared. Each one is used to reduce the size of the training set before a set of rules is induced using Clark and Boswell's modification of CN2. A hybrid instance reduction algorithm (combining AllKnn and DROP5) is also tested. For most of the datasets, pruning the training set using ENN, AllKnn or the hybrid significantly reduces the number of rules generated by CN2 without adversely affecting predictive performance. The hybrid achieves the highest average predictive accuracy.
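
    To illustrate the instance reduction step described in this abstract, here is a minimal sketch of Edited Nearest Neighbour in its usual textbook form: an instance is dropped when its label disagrees with the majority label of its k nearest neighbours. The function names, the Euclidean distance, k=3 and the toy data are assumptions made for illustration; this is not the authors' implementation, and the CN2 rule induction step is not shown.

```python
from collections import Counter
import math

def k_nearest_labels(X, y, i, k):
    """Labels of the k points closest to X[i], excluding X[i] itself."""
    dists = sorted(
        (math.dist(X[i], X[j]), j) for j in range(len(X)) if j != i
    )
    return [y[j] for _, j in dists[:k]]

def edited_nearest_neighbour(X, y, k=3):
    """Keep only instances whose label matches the majority of their k neighbours.

    All decisions are made against the original, unedited training set.
    An odd k avoids voting ties when there are two classes.
    """
    keep = []
    for i in range(len(X)):
        votes = Counter(k_nearest_labels(X, y, i, k))
        majority_label, _ = votes.most_common(1)[0]
        if majority_label == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]

# Hypothetical toy data: two clusters plus one point labelled "b"
# sitting in the middle of the "a" cluster (a likely noisy instance).
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
     (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1)]
y = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

X_reduced, y_reduced = edited_nearest_neighbour(X, y, k=3)
print(len(X), "->", len(X_reduced), "instances")
```

    In a pipeline like the one the abstract describes, the reduced set would then be passed to the rule learner in place of the full training set.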

    Pruning methods for rule induction

    Machine learning is a research area within computer science that is mainly concerned with discovering regularities in data. Rule induction is a powerful technique used in machine learning wherein the target concept is represented as a set of rules. The attraction of rule induction is that rules are more transparent and easier to understand than the output of other induction methods (e.g., regression methods or neural networks). Rule induction has been shown to outperform other learners on many problems. However, it is poorly suited to handling exceptions and noisy data in training sets, a problem that pruning can address. This thesis investigates whether preceding rule induction with instance reduction techniques can reduce the complexity of rule sets by reducing the number of rules generated, without adversely affecting predictive accuracy. An empirical study is undertaken to investigate the application of three different rule classifiers to datasets that were previously reduced with promising instance reduction methods. Furthermore, we propose a new instance reduction method based on Ant Colony Optimization (ACO). We evaluate the effectiveness of this instance reduction method for k nearest neighbour algorithms in terms of predictive accuracy and amount of reduction, and then compare it with other instance reduction methods. We show that pruning classification rules with instance reduction methods leads to a statistically significant decrease in the number of generated rules, without adversely affecting performance. In addition, applying instance reduction methods enhances the predictive accuracy on some datasets. Moreover, the results provide evidence that: (1) our proposed instance reduction method based on ACO is competitive with the well-known k-NN algorithm; (2) the reduced sets computed by our method offer better classification accuracy than those obtained by the compared algorithms.
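
    The two evaluation criteria named in this abstract, amount of reduction and predictive accuracy of a k nearest neighbour classifier, can be sketched as follows. This is only an illustrative sketch: the function names, the Euclidean distance and the toy data are assumptions, and the reduced set here is picked by hand rather than produced by the ACO-based method, which is not reproduced.

```python
from collections import Counter
import math

def reduction_rate(n_original, n_reduced):
    """Fraction of training instances removed by the reduction method."""
    return 1.0 - n_reduced / n_original

def knn_predict(train_X, train_y, x, k=1):
    """Majority label among the k training points closest to x."""
    nearest = sorted(range(len(train_X)),
                     key=lambda j: math.dist(train_X[j], x))[:k]
    return Counter(train_y[j] for j in nearest).most_common(1)[0][0]

def knn_accuracy(train_X, train_y, test_X, test_y, k=1):
    """Share of test points that k-NN labels correctly."""
    hits = sum(knn_predict(train_X, train_y, x, k) == t
               for x, t in zip(test_X, test_y))
    return hits / len(test_X)

# Hypothetical data: a full training set, a hand-picked reduced subset
# standing in for the output of some reduction method, and a test set.
full_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
full_y = ["a", "a", "a", "b", "b", "b"]
reduced_X = [(0, 0), (5, 5)]
reduced_y = ["a", "b"]
test_X = [(0.5, 0.5), (5.5, 5.5)]
test_y = ["a", "b"]

rate = reduction_rate(len(full_X), len(reduced_X))
acc_full = knn_accuracy(full_X, full_y, test_X, test_y, k=1)
acc_reduced = knn_accuracy(reduced_X, reduced_y, test_X, test_y, k=1)
print(f"reduction: {rate:.2f}, accuracy full: {acc_full}, reduced: {acc_reduced}")
```

    Reporting both numbers side by side shows the trade-off the abstract refers to: how much of the training set is discarded and how much accuracy is retained.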

    Pruning Classification Rules with Reference Vector Selection Methods

    Attempts to extract logical rules from data often lead to large sets of classification rules that need to be pruned. Training two classifiers, the C4.5 decision tree and the Non-Nested Generalized Exemplars (NNGE) covering algorithm, on datasets that have first been reduced with the EkP instance compressor leads to a statistically significant reduction in the number of derived rules, with no significant degradation of results. Similar results have been observed with other popular instance filters used for data pruning. The numerical experiments presented here illustrate that it is possible to extract simpler and more interesting sets of rules from filtered datasets. This enables a better understanding of knowledge structures when data is explored using algorithms that tend to induce a large number of classification rules.