137 research outputs found

    Preceding rule induction with instance reduction methods

    Get PDF
    A new prepruning technique for rule induction is presented which applies instance reduction before rule induction. An empirical evaluation records the predictive accuracy and size of rule-sets generated from 24 datasets from the UCI Machine Learning Repository. Three instance reduction algorithms (Edited Nearest Neighbour, AllKnn and DROP5) are compared. Each one is used to reduce the size of the training set, prior to inducing a set of rules using Clark and Boswell's modification of CN2. A hybrid instance reduction algorithm (comprised of AllKnn and DROP5) is also tested. For most of the datasets, pruning the training set using ENN, AllKnn or the hybrid significantly reduces the number of rules generated by CN2, without adversely affecting the predictive performance. The hybrid achieves the highest average predictive accuracy

    Pruning classification rules with instance reduction methods

    Get PDF
    Generating classification rules from data often leads to large sets of rules that need to be pruned. A new pre-pruning technique for rule induction is presented which applies instance reduction before rule induction. Training three rule classifiers on datasets that have been reduced earlier with instance reduction methods leads to a statistically significant lower number of generated rules, without adversely affecting the predictive performance. The search strategies used by the three algorithms vary in terms of both type (depth-first or beam search) and direction (general-to-specific or specific-to-general)

    Empirical Validation of Objective Functions in Feature Selection Based on Acceleration Motion Segmentation Data

    Get PDF
    Recent change in evaluation criteria from accuracy alone to trade-off with time delay has inspired multivariate energy-based approaches in motion segmentation using acceleration. The essence of multivariate approaches lies in the construction of highly dimensional energy and requires feature subset selection in machine learning. Due to fast process, filter methods are preferred; however, their poorer estimate is of the main concerns. This paper aims at empirical validation of three objective functions for filter approaches, Fisher discriminant ratio, multiple correlation (MC), and mutual information (MI), through two subsequent experiments. With respect to 63 possible subsets out of 6 variables for acceleration motion segmentation, three functions in addition to a theoretical measure are compared with two wrappers, k-nearest neighbor and Bayes classifiers in general statistics and strongly relevant variable identification by social network analysis. Then four kinds of new proposed multivariate energy are compared with a conventional univariate approach in terms of accuracy and time delay. Finally it appears that MC and MI are acceptable enough to match the estimate of two wrappers, and multivariate approaches are justified with our analytic procedures
    • …
    corecore