
    A generic optimising feature extraction method using multiobjective genetic programming

    In this paper, we present a generic, optimising feature extraction method using multiobjective genetic programming. We re-examine the feature extraction problem and show that effective feature extraction can significantly enhance the performance of pattern recognition systems with simple classifiers. A framework is presented to evolve optimised feature extractors that transform an input pattern space into a decision space in which maximal class separability is obtained. We have applied this method to real world datasets from the UCI Machine Learning and StatLog databases to verify our approach and compare our proposed method with other reported results. We conclude that our algorithm is able to produce classifiers of superior (or equivalent) performance to the conventional classifiers examined, suggesting removal of the need to exhaustively evaluate a large family of conventional classifiers on any new problem. © 2010 Elsevier B.V. All rights reserved.

    Supervised learning with hybrid global optimisation methods


    PMLB: A Large Benchmark Suite for Machine Learning Evaluation and Comparison

    The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists. The present study introduces an accessible, curated, and developing public benchmark resource to facilitate identification of the strengths and weaknesses of different machine learning methodologies. We compare meta-features among the current set of benchmark datasets in this resource to characterize the diversity of available data. Finally, we apply a number of established machine learning methods to the entire benchmark suite and analyze how datasets and algorithms cluster in terms of performance. This work is an important first step towards understanding the limitations of popular benchmarking suites and developing a resource that connects existing benchmarking standards to more diverse and efficient standards in the future. Comment: 14 pages, 5 figures, submitted for review to JML

    Identification of cellular automata based on incomplete observations with bounded time gaps

    In this paper, the problem of identifying cellular automata (CAs) is considered. We frame and solve this problem in the context of incomplete observations, i.e., prerecorded, incomplete configurations of the system at certain but unknown time stamps. We consider only 1-D, deterministic, two-state CAs. An identification method based on a genetic algorithm with individuals of variable length is proposed. The experimental results show that the proposed method is highly effective. In addition, connections between the dynamical properties of CAs (Lyapunov exponents and behavioral classes) and the performance of the identification algorithm are established and analyzed.
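    To make the identification setting concrete, the following is a minimal sketch and not the paper's method. It assumes radius-1 (elementary) CAs with periodic boundaries and fully observed configurations, and it replaces the genetic algorithm with a brute-force scan of all 256 rules. The names `step`, `gap_explaining`, and `consistent_rules` are hypothetical. A rule is kept if every consecutive pair of observations can be connected by some unknown number of steps between 1 and `max_gap`.

```python
from typing import List, Optional


def step(cells: List[int], rule: int) -> List[int]:
    """One synchronous update of an elementary CA (Wolfram rule number),
    assuming periodic boundary conditions."""
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        idx = (left << 2) | (centre << 1) | right  # neighbourhood as a 3-bit index
        out.append((rule >> idx) & 1)              # look up that bit of the rule number
    return out


def gap_explaining(before: List[int], after: List[int],
                   rule: int, max_gap: int) -> Optional[int]:
    """Smallest g in 1..max_gap such that g steps of `rule` map `before`
    to `after`, or None if no such g exists."""
    state = before
    for g in range(1, max_gap + 1):
        state = step(state, rule)
        if state == after:
            return g
    return None


def consistent_rules(observations: List[List[int]], max_gap: int) -> List[int]:
    """All elementary rules that explain every consecutive observation pair
    with a time gap of at most max_gap (brute force, assumed here in place
    of a genetic search)."""
    pairs = list(zip(observations, observations[1:]))
    return [
        rule for rule in range(256)
        if all(gap_explaining(a, b, rule, max_gap) is not None for a, b in pairs)
    ]


if __name__ == "__main__":
    # Generate a trajectory with rule 110, then keep only some time stamps.
    state = [0, 0, 0, 1, 0, 0, 1, 1]
    trajectory = [state]
    for _ in range(5):
        trajectory.append(step(trajectory[-1], 110))
    observed = [trajectory[t] for t in (0, 2, 3, 5)]  # gaps of 2, 1, 2
    print(110 in consistent_rules(observed, max_gap=3))
```

    Several rules may remain consistent with a short observation sequence, so the returned list is a candidate set rather than a unique answer.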

    Extracting Boolean rules from CA patterns

    A multiobjective genetic algorithm (GA) is introduced to identify both the neighborhood and the rule set in the form of a parsimonious Boolean expression for both one- and two-dimensional cellular automata (CA). Simulation results illustrate that the new algorithm performs well even when the patterns are corrupted by static and dynamic noise.
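    As a rough illustration of what identifying "the neighborhood and the rule set" as a Boolean expression can mean, the sketch below works on a known, noise-free, radius-1 rule table. It is not the multiobjective GA from the abstract. The helpers `rule_table`, `relevant_inputs`, and `expression_over_support` are hypothetical names. The code finds which neighbours the output actually depends on, then writes a sum-of-products expression over only those neighbours.

```python
from itertools import product
from typing import Dict, List, Tuple

NAMES = ("left", "centre", "right")
Table = Dict[Tuple[int, int, int], int]


def rule_table(rule: int) -> Table:
    """Truth table of an elementary CA rule (Wolfram numbering)."""
    return {
        (l, c, r): (rule >> ((l << 2) | (c << 1) | r)) & 1
        for l, c, r in product((0, 1), repeat=3)
    }


def relevant_inputs(table: Table) -> List[str]:
    """Neighbours whose value changes the output for at least one input."""
    relevant = []
    for pos, name in enumerate(NAMES):
        for key, out in table.items():
            flipped = list(key)
            flipped[pos] ^= 1  # toggle only this neighbour
            if table[tuple(flipped)] != out:
                relevant.append(name)
                break
    return relevant


def expression_over_support(table: Table) -> str:
    """Sum-of-products expression restricted to the relevant neighbours.
    This drops irrelevant variables but does no further minimisation."""
    support = relevant_inputs(table)
    keep = [i for i, name in enumerate(NAMES) if name in support]
    if not keep:  # constant rule: output ignores every neighbour
        return "1" if any(table.values()) else "0"
    terms: List[str] = []
    for key in sorted(table):
        if table[key] == 1:
            term = " & ".join(
                NAMES[i] if key[i] else "~" + NAMES[i] for i in keep
            )
            if term not in terms:  # duplicates arise from dropped variables
                terms.append(term)
    return " | ".join(f"({t})" for t in terms)


if __name__ == "__main__":
    table = rule_table(90)
    print(relevant_inputs(table))          # ['left', 'right']
    print(expression_over_support(table))  # (~left & right) | (left & ~right)
```

    Rule 90 is the exclusive-or of the two outer neighbours, so the centre cell drops out of both the neighbourhood and the expression.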