1,967 research outputs found
Using attribute construction to improve the predictability of a GP financial forecasting algorithm
Financial forecasting is an important area in computational finance. EDDIE 8 is an established Genetic Programming financial forecasting algorithm, which has successfully been applied to a number of international datasets. The purpose of this paper is to further increase the algorithm’s predictive performance, by improving its data space representation. In order to achieve this, we use attribute construction to create new (high-level) attributes from the original (low-level) attributes. To examine the effectiveness of the above method, we test the extended EDDIE’s predictive performance across 25 datasets and compare it to the performance of two previous EDDIE algorithms. Results show that the introduction of attribute construction benefits the algorithm, allowing EDDIE to explore the use of new attributes to improve its predictive accuracy.
A generic optimising feature extraction method using multiobjective genetic programming
In this paper, we present a generic, optimising feature extraction method using multiobjective genetic programming. We re-examine the feature extraction problem and show that effective feature extraction can significantly enhance the performance of pattern recognition systems with simple classifiers. A framework is presented to evolve optimised feature extractors that transform an input pattern space into a decision space in which maximal class separability is obtained. We have applied this method to real world datasets from the UCI Machine Learning and StatLog databases to verify our approach and compare our proposed method with other reported results. We conclude that our algorithm is able to produce classifiers of superior (or equivalent) performance to the conventional classifiers examined, suggesting removal of the need to exhaustively evaluate a large family of conventional classifiers on any new problem. © 2010 Elsevier B.V. All rights reserved.
Semantic variation operators for multidimensional genetic programming
Multidimensional genetic programming represents candidate solutions as sets
of programs, and thereby provides an interesting framework for exploiting
building block identification. Towards this goal, we investigate the use of
machine learning as a way to bias which components of programs are promoted,
and propose two semantic operators to choose where useful building blocks are
placed during crossover. A forward stagewise crossover operator we propose
leads to significant improvements on a set of regression problems, and produces
state-of-the-art results in a large benchmark study. We discuss this
architecture and others in terms of their propensity for allowing heuristic
search to utilize information during the evolutionary process. Finally, we look
at the collinearity and complexity of the data representations that result from
these architectures, with a view towards disentangling factors of variation in
application.
Comment: 9 pages, 8 figures, GECCO 201
Consistent Feature Construction with Constrained Genetic Programming for Experimental Physics
A good feature representation is a determining factor in achieving high
classification performance with many machine learning algorithms.
This is especially true for techniques that do not build complex internal
representations of data (e.g. decision trees, in contrast to deep neural
networks). To transform the feature space, feature construction techniques
build new high-level features from the original ones. Among these techniques,
Genetic Programming is a good candidate to provide interpretable features
required for data analysis in high energy physics. Classically, original
features or higher-level features based on physics first principles are used as
inputs for training. However, physicists would benefit from an automatic and
interpretable feature construction for the classification of particle collision
events.
Our main contribution consists in combining different aspects of Genetic
Programming and applying them to feature construction for experimental physics.
In particular, to be applicable to physics, dimensional consistency is enforced
using grammars.
Results of experiments on three physics datasets show that the constructed
features can bring a significant gain to the classification accuracy. To the
best of our knowledge, it is the first time a method is proposed for
interpretable feature construction with units of measurement, and that experts
in high-energy physics validate the overall approach as well as the
interpretability of the built features.
Comment: Accepted in this version to CEC 201
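The dimensional-consistency constraint described in the abstract above can be illustrated with a small sketch. The abstract states that consistency is enforced with grammars; the code below instead uses a simpler runtime unit check, which is an assumption made purely for illustration. The names `Quantity`, `add`, `mul` and `div`, and the two-slot unit tuple (energy, length), are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A feature expression tagged with unit exponents (energy, length).

    Hypothetical representation: (1, 0) is an energy, (0, 1) a length,
    (1, -1) an energy per unit length.
    """
    name: str
    units: tuple


def add(a, b):
    # Addition is only meaningful when both operands carry the same units.
    if a.units != b.units:
        raise ValueError(f"cannot add {a.name} and {b.name}: units differ")
    return Quantity(f"({a.name} + {b.name})", a.units)


def mul(a, b):
    # Multiplication adds the unit exponents.
    units = tuple(x + y for x, y in zip(a.units, b.units))
    return Quantity(f"({a.name} * {b.name})", units)


def div(a, b):
    # Division subtracts the unit exponents.
    units = tuple(x - y for x, y in zip(a.units, b.units))
    return Quantity(f"({a.name} / {b.name})", units)


e1 = Quantity("E1", (1, 0))
e2 = Quantity("E2", (1, 0))
d = Quantity("d", (0, 1))

feature = div(add(e1, e2), d)
print(feature.name, feature.units)
```

A grammar-based approach would rule out an expression such as `add(e1, d)` before it is ever generated, whereas this sketch only rejects it when the expression is built.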
Review of Feature Selection Methods Used in Melanoma Diagnostics (PRZEGLĄD METOD SELEKCJI CECH UŻYWANYCH W DIAGNOSTYCE CZERNIAKA)
Many feature selection (FS) methods are currently in use, and they are attracting growing interest among researchers. Some methods are, of course, used more frequently than others. The article describes the basic operation of selection-based algorithms. FS methods fall into three categories: filter methods, wrapper methods, and embedded methods. Particular attention was paid to finding examples of applications of the described methods in the diagnosis of skin melanoma.
Proceedings of ECAI International Workshop on Neural-Symbolic Learning and reasoning NeSy 2006
Feature-based time-series analysis
This work presents an introduction to feature-based time-series analysis. The
time series as a data type is first described, along with an overview of the
interdisciplinary time-series analysis literature. I then summarize the range
of feature-based representations for time series that have been developed to
aid interpretable insights into time-series structure. Particular emphasis is
given to emerging research that facilitates wide comparison of feature-based
representations that allow us to understand the properties of a time-series
dataset that make it suited to a particular feature-based representation or
analysis algorithm. The future of time-series analysis is likely to embrace
approaches that exploit machine learning methods to partially automate human
learning to aid understanding of the complex dynamical patterns in the time
series we measure from the world.
Comment: 28 pages, 9 figures
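As a rough illustration of what a feature-based representation of a time series means, the sketch below maps each series to a small dictionary of summary features. The choice of features (mean, standard deviation, lag-1 autocorrelation) and the function names are assumptions made for this example; they are not taken from the work summarized above.

```python
from statistics import fmean, pstdev


def lag1_autocorr(xs):
    """Lag-1 autocorrelation; returns 0.0 for a constant series."""
    m = fmean(xs)
    denom = sum((x - m) ** 2 for x in xs)
    if denom == 0:
        return 0.0
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    return num / denom


def extract_features(xs):
    """Map a time series to a fixed-length, interpretable feature set."""
    return {
        "mean": fmean(xs),
        "std": pstdev(xs),
        "lag1_autocorr": lag1_autocorr(xs),
    }


# Two series with identical values in a different order.
smooth = [0, 1, 2, 3, 4, 5, 6, 7]
jagged = [0, 7, 1, 6, 2, 5, 3, 4]

smooth_features = extract_features(smooth)
jagged_features = extract_features(jagged)
print(smooth_features)
print(jagged_features)
```

The two example series share the same mean and standard deviation, so only the autocorrelation feature separates them. This is the kind of structural property a feature-based representation is meant to expose.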
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers, because
the class boundary must be defined from knowledge of the positive class alone. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.
Comment: 24 pages + 11 pages of references, 8 figures
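The one-class setting described above, where only positive examples are available for training, can be sketched with a deliberately simple model: a centroid plus a distance threshold. This is an illustrative assumption and not one of the techniques reviewed in the paper; the names `fit_one_class` and `is_positive` and the `quantile` parameter are hypothetical.

```python
from math import dist


def fit_one_class(positives, quantile=0.95):
    """Fit a centroid-and-radius model using positive examples only."""
    dim = len(positives[0])
    n = len(positives)
    centroid = tuple(sum(p[i] for p in positives) / n for i in range(dim))
    # The radius is the chosen quantile of training distances to the centroid.
    distances = sorted(dist(p, centroid) for p in positives)
    idx = min(n - 1, int(quantile * n))
    return centroid, distances[idx]


def is_positive(model, x):
    """Accept x if it lies within the learned radius of the centroid."""
    centroid, radius = model
    return dist(x, centroid) <= radius


positives = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (0.9, 0.8)]
model = fit_one_class(positives)

print(is_positive(model, (1.0, 1.05)))  # close to the training data
print(is_positive(model, (3.0, 3.0)))   # far from the training data
```

No negative examples are used at any point: the decision boundary is derived entirely from the spread of the positive class.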