40,135 research outputs found

    A generic optimising feature extraction method using multiobjective genetic programming

    Get PDF
    In this paper, we present a generic, optimising feature extraction method using multiobjective genetic programming. We re-examine the feature extraction problem and show that effective feature extraction can significantly enhance the performance of pattern recognition systems with simple classifiers. A framework is presented to evolve optimised feature extractors that transform an input pattern space into a decision space in which maximal class separability is obtained. We have applied this method to real world datasets from the UCI Machine Learning and StatLog databases to verify our approach and compare our proposed method with other reported results. We conclude that our algorithm is able to produce classifiers of superior (or equivalent) performance to the conventional classifiers examined, suggesting removal of the need to exhaustively evaluate a large family of conventional classifiers on any new problem. (C) 2010 Elsevier B.V. All rights reserved

    Random Prism: An Alternative to Random Forests.

    Get PDF
    Ensemble learning techniques generate multiple classifiers, so called base classifiers, whose combined classification results are used in order to increase the overall classification accuracy. In most ensemble classifiers the base classifiers are based on the Top Down Induction of Decision Trees (TDIDT) approach. However, an alternative approach for the induction of rule based classifiers is the Prism family of algorithms. Prism algorithms produce modular classification rules that do not necessarily fit into a decision tree structure. Prism classification rulesets achieve a comparable and sometimes higher classification accuracy compared with decision tree classifiers, if the data is noisy and large. Yet Prism still suffers from overfitting on noisy and large datasets. In practice ensemble techniques tend to reduce the overfitting, however there exists no ensemble learner for modular classification rule inducers such as the Prism family of algorithms. This article describes the first development of an ensemble learner based on the Prism family of algorithms in order to enhance Prism’s classification accuracy by reducing overfitting

    Human gesture classification by brute-force machine learning for exergaming in physiotherapy

    Get PDF
    In this paper, a novel approach for human gesture classification on skeletal data is proposed for the application of exergaming in physiotherapy. Unlike existing methods, we propose to use a general classifier like Random Forests to recognize dynamic gestures. The temporal dimension is handled afterwards by majority voting in a sliding window over the consecutive predictions of the classifier. The gestures can have partially similar postures, such that the classifier will decide on the dissimilar postures. This brute-force classification strategy is permitted, because dynamic human gestures show sufficient dissimilar postures. Online continuous human gesture recognition can classify dynamic gestures in an early stage, which is a crucial advantage when controlling a game by automatic gesture recognition. Also, ground truth can be easily obtained, since all postures in a gesture get the same label, without any discretization into consecutive postures. This way, new gestures can be easily added, which is advantageous in adaptive game development. We evaluate our strategy by a leave-one-subject-out cross-validation on a self-captured stealth game gesture dataset and the publicly available Microsoft Research Cambridge-12 Kinect (MSRC-12) dataset. On the first dataset we achieve an excellent accuracy rate of 96.72%. Furthermore, we show that Random Forests perform better than Support Vector Machines. On the second dataset we achieve an accuracy rate of 98.37%, which is on average 3.57% better then existing methods

    A Decision tree-based attribute weighting filter for naive Bayes

    Get PDF
    The naive Bayes classifier continues to be a popular learning algorithm for data mining applications due to its simplicity and linear run-time. Many enhancements to the basic algorithm have been proposed to help mitigate its primary weakness--the assumption that attributes are independent given the class. All of them improve the performance of naïve Bayes at the expense (to a greater or lesser degree) of execution time and/or simplicity of the final model. In this paper we present a simple filter method for setting attribute weights for use with naive Bayes. Experimental results show that naive Bayes with attribute weights rarely degrades the quality of the model compared to standard naive Bayes and, in many cases, improves it dramatically. The main advantages of this method compared to other approaches for improving naive Bayes is its run-time complexity and the fact that it maintains the simplicity of the final model

    On The Stability of Interpretable Models

    Full text link
    Interpretable classification models are built with the purpose of providing a comprehensible description of the decision logic to an external oversight agent. When considered in isolation, a decision tree, a set of classification rules, or a linear model, are widely recognized as human-interpretable. However, such models are generated as part of a larger analytical process. Bias in data collection and preparation, or in model's construction may severely affect the accountability of the design process. We conduct an experimental study of the stability of interpretable models with respect to feature selection, instance selection, and model selection. Our conclusions should raise awareness and attention of the scientific community on the need of a stability impact assessment of interpretable models

    Separation of pulsar signals from noise with supervised machine learning algorithms

    Full text link
    We evaluate the performance of four different machine learning (ML) algorithms: an Artificial Neural Network Multi-Layer Perceptron (ANN MLP ), Adaboost, Gradient Boosting Classifier (GBC), XGBoost, for the separation of pulsars from radio frequency interference (RFI) and other sources of noise, using a dataset obtained from the post-processing of a pulsar search pi peline. This dataset was previously used for cross-validation of the SPINN-based machine learning engine, used for the reprocessing of HTRU-S survey data arXiv:1406.3627. We have used Synthetic Minority Over-sampling Technique (SMOTE) to deal with high class imbalance in the dataset. We report a variety of quality scores from all four of these algorithms on both the non-SMOTE and SMOTE datasets. For all the above ML methods, we report high accuracy and G-mean in both the non-SMOTE and SMOTE cases. We study the feature importances using Adaboost, GBC, and XGBoost and also from the minimum Redundancy Maximum Relevance approach to report algorithm-agnostic feature ranking. From these methods, we find that the signal to noise of the folded profile to be the best feature. We find that all the ML algorithms report FPRs about an order of magnitude lower than the corresponding FPRs obtained in arXiv:1406.3627, for the same recall value.Comment: 14 pages, 2 figures. Accepted for publication in Astronomy and Computin
    corecore