29 research outputs found

    Briefly Noted


    Estimating Performance Gains for Voted Decision Trees

    Decision tree induction is a prominent learning method, typically yielding quick results with competitive predictive performance. However, it is not unusual to find other automated learning methods that exceed the predictive performance of a decision tree on the same application. To achieve near-optimal classification results, resampling techniques can be employed to generate multiple decision-tree solutions. These decision trees are individually applied and their answers voted. The potential for exceptionally strong performance is counterbalanced by the substantial increase in computing time to induce many decision trees. We describe estimators of predictive performance for voted decision trees induced from bootstrap (bagged) or adaptive (boosted) resampling. The estimates are found by examining the performance of a single tree and its pruned subtrees over a single training set and a large test set. Using publicly available collections of data, we show that these estimate…
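The core procedure the abstract describes, bootstrap resampling followed by majority voting, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it uses one-level decision stumps on a scalar feature instead of full pruned trees, and all function names are hypothetical.

```python
import random
from collections import Counter

def fit_stump(data):
    # data: list of (x, y) pairs with scalar feature x and class label y.
    # Fit a one-level "decision tree": the threshold on x that minimizes
    # misclassifications when each side predicts its majority class.
    best = None
    xs = sorted({x for x, _ in data})
    for i in range(len(xs) - 1):
        t = (xs[i] + xs[i + 1]) / 2
        left = [y for x, y in data if x <= t]
        right = [y for x, y in data if x > t]
        ly = Counter(left).most_common(1)[0][0]
        ry = Counter(right).most_common(1)[0][0]
        err = sum(y != ly for y in left) + sum(y != ry for y in right)
        if best is None or err < best[0]:
            best = (err, t, ly, ry)
    if best is None:
        # Degenerate bootstrap sample (one distinct x): predict majority class.
        y = Counter(y for _, y in data).most_common(1)[0][0]
        return lambda x: y
    _, t, ly, ry = best
    return lambda x: ly if x <= t else ry

def bagged_vote(data, n_trees=25, seed=0):
    # Bagging: fit one stump per bootstrap sample (drawn with replacement),
    # then classify new points by majority vote over all stumps.
    rng = random.Random(seed)
    stumps = [fit_stump([rng.choice(data) for _ in data])
              for _ in range(n_trees)]
    def predict(x):
        votes = Counter(s(x) for s in stumps)
        return votes.most_common(1)[0][0]
    return predict
```

The computational cost the abstract mentions is visible here: inducing `n_trees` learners multiplies training time roughly by `n_trees`, which is what motivates estimating the voted ensemble's performance from a single tree instead.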

    Solving Regression Problems with Rule-based Ensemble Classifiers

    We describe a lightweight learning method that induces an ensemble of decision-rule solutions for regression problems. Instead of direct prediction of a continuous output variable, the method discretizes the variable by k-means clustering and solves the resultant classification problem. Predictions on new examples are made by averaging the mean values of classes with votes that are close in number to the most likely class. We provide experimental evidence that this indirect approach can often yield strong results for many applications, generally outperforming direct approaches such as regression trees and rivaling bagged regression trees.
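The two steps specific to this abstract, discretizing a continuous target by k-means and then blending the mean values of the highest-voted classes, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the paper's code; the closeness rule (`tol`) and all names are hypothetical.

```python
import random

def kmeans_1d(values, k=3, iters=20, seed=0):
    # Simple 1-D k-means over the continuous target values.
    # Returns the cluster centers (each class's mean value) and a
    # labeling function mapping a continuous y to its class id.
    rng = random.Random(seed)
    centers = sorted(rng.sample(values, k))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            i = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[i].append(v)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    label = lambda v: min(range(k), key=lambda i: abs(v - centers[i]))
    return centers, label

def blend_prediction(centers, votes, tol=0.9):
    # votes: dict mapping class id -> number of ensemble votes.
    # Average the mean values of classes whose vote counts are close
    # (here: within tol of the top count) to the most likely class.
    top = max(votes.values())
    close = [c for c, n in votes.items() if n >= tol * top]
    return sum(centers[c] for c in close) / len(close)
```

For example, with class means `[1.0, 5.0, 9.0]` and votes `{0: 10, 1: 9, 2: 1}`, classes 0 and 1 are close in votes, so the prediction is their blended mean rather than class 0's mean alone, recovering some of the resolution lost to discretization.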

    An empirical comparison of pattern recognition, neural nets, and machine learning classification methods

    Classification methods from statistical pattern recognition, neural nets, and machine learning were applied to four real-world data sets. Each of these data sets has been previously analyzed and reported in the statistical, medical, or machine learning literature. The data sets are characterized by statistical uncertainty; there is no completely accurate solution to these problems. Training and testing or resampling techniques are used to estimate the true error rates of the classification methods. Detailed attention is given to the analysis of performance of the neural nets using back propagation. For these problems, which have relatively few hypotheses and features, the machine learning procedures for rule induction or tree induction clearly performed best.

    Reduced complexity rule induction

    We present an architecture for rule induction that emphasizes compact, reduced-complexity rules. A new heuristic technique for finding a covering rule set of sample data is described. This technique refines a set of production rules by iteratively replacing a component of a rule with its single best replacement. A method for rule induction has been developed that combines this covering and refinement scheme with other techniques known to help reduce the complexity of rule sets, such as weakest-link pruning, resampling, and the judicious use of linear discriminants. Published results on several real-world datasets are reviewed where decision trees have performed relatively poorly. It is shown that far simpler decision rules can be found with predictive performance that exceeds those previously reported for various learning models, including neural nets and decision trees.
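The refinement step the abstract names, iteratively replacing a single component of a rule with its best replacement, can be sketched as below. This is a toy reconstruction assuming rules are conjunctions of `(feature, threshold)` conditions; the representation, candidate pool, and function names are all hypothetical, not from the paper.

```python
def rule_errors(rule, data, target):
    # rule: list of (feature_index, threshold) conditions; an example
    # (x, y) is covered when every indexed feature value is <= its threshold.
    covered = [(x, y) for x, y in data
               if all(x[f] <= t for f, t in rule)]
    # Errors = covered examples not of the target class (false positives)
    # plus target-class examples the rule fails to cover (false negatives).
    fp = sum(y != target for _, y in covered)
    fn = (sum(y == target for _, y in data)
          - sum(y == target for _, y in covered))
    return fp + fn

def refine(rule, data, target, candidates):
    # Iteratively replace one condition at a time with its single best
    # candidate replacement, as long as that lowers the error count.
    improved = True
    while improved:
        improved = False
        for i, _ in enumerate(rule):
            best = min(candidates,
                       key=lambda c: rule_errors(
                           rule[:i] + [c] + rule[i + 1:], data, target))
            trial = rule[:i] + [best] + rule[i + 1:]
            if rule_errors(trial, data, target) < rule_errors(rule, data, target):
                rule, improved = trial, True
    return rule
```

Because each accepted replacement strictly lowers the error count, the loop terminates; in a full induction scheme this local refinement would be combined with covering, weakest-link pruning, and resampling as the abstract describes.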

    Predictive data mining: a practical guide
