PhysicsGP: A Genetic Programming Approach to Event Selection
We present a novel multivariate classification technique based on Genetic
Programming. The technique is distinct from Genetic Algorithms and offers
several advantages compared to Neural Networks and Support Vector Machines. The
technique optimizes a set of human-readable classifiers with respect to some
user-defined performance measure. We calculate the Vapnik-Chervonenkis
dimension of this class of learning machines and consider a practical example:
the search for the Standard Model Higgs Boson at the LHC. The resulting
classifier is very fast to evaluate, human-readable, and easily portable. The
software may be downloaded at: http://cern.ch/~cranmer/PhysicsGP.html
Comment: 16 pages, 9 figures, 1 table. Submitted to Comput. Phys. Commu
CSM-464: On the Limiting Distribution of Program Sizes in Tree-based Genetic Programming
We provide strong theoretical and experimental evidence that standard sub-tree crossover with uniform selection of crossover points pushes a population of a-ary GP trees towards a distribution of tree sizes of the form: [see document for formula] where n is the number of internal nodes in a tree and pa is a constant. This result generalises the result previously reported in [7, 10, 8, 9] for the case a = 1
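As a rough illustration of the operator this abstract refers to, the following is a minimal sketch of sub-tree crossover with uniform selection of crossover points. The nested-list tree encoding and the helper names (`nodes`, `get`, `replace`, `internal_nodes`, `crossover`) are assumptions made for this example and are not taken from the report. The sketch does not reproduce the limiting distribution itself, whose formula is not given here.

```python
import copy
import random

# Assumed encoding: a tree is a nested list [label, child_1, ..., child_a];
# a leaf is a one-element list [label].

def nodes(tree, path=()):
    """Yield the path (tuple of list indices) to every node in the tree."""
    yield path
    for i, child in enumerate(tree[1:], start=1):
        yield from nodes(child, path + (i,))

def get(tree, path):
    """Return the subtree found by following `path` from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of `tree` with the subtree at `path` replaced by `new`."""
    if not path:
        return copy.deepcopy(new)
    out = copy.deepcopy(tree)
    parent = get(out, path[:-1])
    parent[path[-1]] = copy.deepcopy(new)
    return out

def internal_nodes(tree):
    """Count nodes that have at least one child (n in the abstract)."""
    return sum(1 for p in nodes(tree) if len(get(tree, p)) > 1)

def crossover(p1, p2, rng):
    """Swap one uniformly chosen subtree of p1 with one of p2."""
    c1 = rng.choice(list(nodes(p1)))  # every node is equally likely
    c2 = rng.choice(list(nodes(p2)))
    s1, s2 = get(p1, c1), get(p2, c2)
    return replace(p1, c1, s2), replace(p2, c2, s1)
```

Because the operator only exchanges subtrees, the total number of internal nodes across the two offspring equals the total across the two parents; only how that size is split between individuals changes.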
Modelling Medical Time Series Using Grammar-Guided Genetic Programming
The analysis of time series is extremely important in the field of medicine, because this is the format of many medical data types. Most of the approaches that address this problem are based on numerical algorithms that calculate distances, clusters, reference models, etc. However, a symbolic rather than numerical analysis is sometimes needed to search for the characteristics of time series. Symbolic information helps users to efficiently analyse and compare time series in the same or in a similar way as a domain expert would. This paper describes the definition of the symbolic domain, the process of converting numerical into symbolic time series and a distance for comparing symbolic temporal sequences. Then, the paper focuses on a method to create the symbolic reference model for a certain population using grammar-guided genetic programming. The work is applied to the isokinetics domain within an application called I4
Online Diversity Control in Symbolic Regression via a Fast Hash-based Tree Similarity Measure
Diversity represents an important aspect of genetic programming, being
directly correlated with search performance. When considered at the genotype
level, diversity often requires expensive tree distance measures which have a
negative impact on the algorithm's runtime performance. In this work we
introduce a fast, hash-based tree distance measure to massively speed up the
calculation of population diversity during the algorithmic run. We combine this
measure with the standard GA and the NSGA-II genetic algorithms to steer the
search towards higher diversity. We validate the approach on a collection of
benchmark problems for symbolic regression where our method consistently
outperforms the standard GA as well as NSGA-II configurations with different
secondary objectives.
Comment: 8 pages, conference, submitted to congress on evolutionary
computatio
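To make the idea of a hash-based tree distance concrete, here is a minimal sketch. It is not the measure defined in the paper: it assumes trees encoded as nested tuples, hashes every subtree bottom-up, and uses a Sørensen-Dice style distance over the two sets of subtree hashes. The names `subtree_hashes` and `hash_distance` are hypothetical.

```python
from typing import Set, Union

# Assumed encoding: a leaf is a string; an internal node is a tuple
# (symbol, child_1, child_2, ...).
Tree = Union[str, tuple]

def subtree_hashes(tree: Tree, out: Set[int]) -> int:
    """Hash every subtree bottom-up and collect the hashes in `out`."""
    if isinstance(tree, str):
        h = hash(("leaf", tree))
    else:
        symbol, *children = tree
        # A node's hash depends on its symbol and its children's hashes,
        # so identical subtrees always receive identical hashes.
        h = hash((symbol, tuple(subtree_hashes(c, out) for c in children)))
    out.add(h)
    return h

def hash_distance(a: Tree, b: Tree) -> float:
    """Dice-style distance: 0.0 for identical trees, 1.0 for no shared subtree."""
    ha: Set[int] = set()
    hb: Set[int] = set()
    subtree_hashes(a, ha)
    subtree_hashes(b, hb)
    shared = len(ha & hb)
    return 1.0 - 2.0 * shared / (len(ha) + len(hb))
```

Each tree is traversed once, so comparing two trees costs time roughly linear in their sizes, in contrast to edit-distance style measures. Two simplifications apply to this sketch: repeated subtrees within one tree are counted once because sets are used, and hash collisions, although unlikely, would make distinct subtrees look identical.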
Semantic Building Blocks in Genetic Programming
In this paper we present a new mechanism for studying the impact of subtree crossover in terms of semantic building blocks. This approach allows us to completely and compactly describe the semantic action of crossover, and provide insight into what does (or doesn’t) make crossover effective. Our results make it clear that a very high proportion of crossover events (typically over 75% in our experiments) are guaranteed to perform no immediately useful search in the semantic space. Our findings also indicate a strong correlation between lack of progress and high proportions of fixed contexts. These results then suggest several new, theoretically grounded, research areas