Repeated sequences in linear genetic programming genomes
Biological chromosomes are replete with repetitive sequences
(microsatellites, SSR tracts, ALU, etc.) in their DNA base sequences. We
started looking for similar phenomena in evolutionary computation.
First studies find copious repeated sequences, which can be hierarchically
decomposed into shorter sequences, in programs evolved using
both homologous and two-point crossover, but not with headless chicken
crossover or other mutations. In bloated programs the small number
of effective or expressed instructions appear in both repeated and
non-repeated code, hinting that building blocks or code reuse may evolve
in unplanned ways.
Mackey-Glass chaotic time series prediction and eukaryotic protein
localisation (both previously used as artificial intelligence machine
learning benchmarks) demonstrate evolution of Shannon information
(entropy) and lead to models capable of lossy Kolmogorov compression.
Our findings with diverse benchmarks and GP systems suggest
this emergent phenomenon may be widespread in genetic systems.
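The idea of repeated sequences that decompose into shorter repeated sequences can be illustrated with a small counting sketch. It is not the authors' analysis method: the genome, the instruction names and the helper `repeated_ngrams` are hypothetical, and overlapping occurrences are counted.

```python
from collections import Counter


def repeated_ngrams(genome, n):
    """Return {subsequence: count} for each length-n window seen more than once."""
    windows = Counter(tuple(genome[i:i + n]) for i in range(len(genome) - n + 1))
    return {seq: count for seq, count in windows.items() if count > 1}


# Hypothetical linear GP genome: each string stands for one instruction.
genome = ["ADD", "MUL", "SUB", "ADD", "MUL", "SUB", "DIV", "ADD", "MUL"]

pairs = repeated_ngrams(genome, 2)    # repeated length-2 sequences
triples = repeated_ngrams(genome, 3)  # repeated length-3 sequences
```

In this toy genome the only repeated triple is built from the two repeated pairs, which is the kind of hierarchical decomposition the abstract describes.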
A Field Guide to Genetic Programming
xiv, 233 p. : ill. ; 23 cm. Electronic book.
A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The authors
Introduction --
Representation, initialisation and operators in Tree-based GP --
Getting ready to run genetic programming --
Example genetic programming run --
Alternative initialisations and operators in Tree-based GP --
Modular, grammatical and developmental Tree-based GP --
Linear and graph genetic programming --
Probabilistic genetic programming --
Multi-objective genetic programming --
Fast and distributed genetic programming --
GP theory and its applications --
Applications --
Troubleshooting GP --
Conclusions.
Contents
1 Introduction
1.1 Genetic Programming in a Nutshell
1.2 Getting Started
1.3 Prerequisites
1.4 Overview of this Field Guide
I Basics
2 Representation, Initialisation and Operators in Tree-based GP
2.1 Representation
2.2 Initialising the Population
2.3 Selection
2.4 Recombination and Mutation
3 Getting Ready to Run Genetic Programming
3.1 Step 1: Terminal Set
3.2 Step 2: Function Set
3.2.1 Closure
3.2.2 Sufficiency
3.2.3 Evolving Structures other than Programs
3.3 Step 3: Fitness Function
3.4 Step 4: GP Parameters
3.5 Step 5: Termination and solution designation
4 Example Genetic Programming Run
4.1 Preparatory Steps
4.2 Step-by-Step Sample Run
4.2.1 Initialisation
4.2.2 Fitness Evaluation
4.2.3 Selection, Crossover and Mutation
4.2.4 Termination and Solution Designation
II Advanced Genetic Programming
5 Alternative Initialisations and Operators in Tree-based GP
5.1 Constructing the Initial Population
5.1.1 Uniform Initialisation
5.1.2 Initialisation may Affect Bloat
5.1.3 Seeding
5.2 GP Mutation
5.2.1 Is Mutation Necessary?
5.2.2 Mutation Cookbook
5.3 GP Crossover
5.4 Other Techniques
6 Modular, Grammatical and Developmental Tree-based GP
6.1 Evolving Modular and Hierarchical Structures
6.1.1 Automatically Defined Functions
6.1.2 Program Architecture and Architecture-Altering
6.2 Constraining Structures
6.2.1 Enforcing Particular Structures
6.2.2 Strongly Typed GP
6.2.3 Grammar-based Constraints
6.2.4 Constraints and Bias
6.3 Developmental Genetic Programming
6.4 Strongly Typed Autoconstructive GP with PushGP
7 Linear and Graph Genetic Programming
7.1 Linear Genetic Programming
7.1.1 Motivations
7.1.2 Linear GP Representations
7.1.3 Linear GP Operators
7.2 Graph-Based Genetic Programming
7.2.1 Parallel Distributed GP (PDGP)
7.2.2 PADO
7.2.3 Cartesian GP
7.2.4 Evolving Parallel Programs using Indirect Encodings
8 Probabilistic Genetic Programming
8.1 Estimation of Distribution Algorithms
8.2 Pure EDA GP
8.3 Mixing Grammars and Probabilities
9 Multi-objective Genetic Programming
9.1 Combining Multiple Objectives into a Scalar Fitness Function
9.2 Keeping the Objectives Separate
9.2.1 Multi-objective Bloat and Complexity Control
9.2.2 Other Objectives
9.2.3 Non-Pareto Criteria
9.3 Multiple Objectives via Dynamic and Staged Fitness Functions
9.4 Multi-objective Optimisation via Operator Bias
10 Fast and Distributed Genetic Programming
10.1 Reducing Fitness Evaluations/Increasing their Effectiveness
10.2 Reducing Cost of Fitness with Caches
10.3 Parallel and Distributed GP are Not Equivalent
10.4 Running GP on Parallel Hardware
10.4.1 Master-slave GP
10.4.2 GP Running on GPUs
10.4.3 GP on FPGAs
10.4.4 Sub-machine-code GP
10.5 Geographically Distributed GP
11 GP Theory and its Applications
11.1 Mathematical Models
11.2 Search Spaces
11.3 Bloat
11.3.1 Bloat in Theory
11.3.2 Bloat Control in Practice
III Practical Genetic Programming
12 Applications
12.1 Where GP has Done Well
12.2 Curve Fitting, Data Modelling and Symbolic Regression
12.3 Human Competitive Results - the Humies
12.4 Image and Signal Processing
12.5 Financial Trading, Time Series, and Economic Modelling
12.6 Industrial Process Control
12.7 Medicine, Biology and Bioinformatics
12.8 GP to Create Searchers and Solvers - Hyper-heuristics
12.9 Entertainment and Computer Games
12.10 The Arts
12.11 Compression
13 Troubleshooting GP
13.1 Is there a Bug in the Code?
13.2 Can you Trust your Results?
13.3 There are No Silver Bullets
13.4 Small Changes can have Big Effects
13.5 Big Changes can have No Effect
13.6 Study your Populations
13.7 Encourage Diversity
13.8 Embrace Approximation
13.9 Control Bloat
13.10 Checkpoint Results
13.11 Report Well
13.12 Convince your Customers
14 Conclusions
IV Tricks of the Trade
A Resources
A.1 Key Books
A.2 Key Journals
A.3 Key International Meetings
A.4 GP Implementations
A.5 On-Line Resources
B TinyGP
B.1 Overview of TinyGP
B.2 Input Data Files for TinyGP
B.3 Source Code
B.4 Compiling and Running TinyGP
Bibliography
Index
An Overview of Schema Theory
The purpose of this paper is to give an introduction to the field of Schema
Theory written by a mathematician and for mathematicians. In particular, we
endeavor to highlight areas of the field which might be of interest to a
mathematician, to point out some related open problems, and to suggest some
large-scale projects. Schema theory seeks to give a theoretical justification
for the efficacy of the field of genetic algorithms, so readers who have
studied genetic algorithms stand to gain the most from this paper. However,
nothing beyond basic probability theory is assumed of the reader, and for this
reason we write in a fairly informal style.
Because the mathematics behind the theorems in schema theory is relatively
elementary, we focus more on the motivation and philosophy. Many of these
results have been proven elsewhere, so this paper is designed to serve a
primarily expository role. We attempt to cast known results in a new light,
which makes the suggested future directions natural. This involves devoting a
substantial amount of time to the history of the field.
We hope that this exposition will entice some mathematicians to do research
in this area, that it will serve as a road map for researchers new to the
field, and that it will help explain how schema theory developed. Furthermore,
we hope that the results collected in this document will serve as a useful
reference. Finally, as far as the author knows, the questions raised in the
final section are new.
Comment: 27 pages. This paper was originally written in 2009 and hosted on my
website; I've decided to put it on the arXiv as a more permanent home. The
paper is primarily expository, so I don't really know where to submit it, but
perhaps one day I will find an appropriate journal.
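For readers new to the vocabulary, a schema in the genetic algorithms literature is a template over fixed-length strings in which some positions are fixed and the rest are wildcards. The sketch below is not taken from the paper: the `*` wildcard notation is the conventional one, and the population and function names are hypothetical.

```python
def matches(schema, string):
    """True if string agrees with schema at every non-wildcard position."""
    return len(schema) == len(string) and all(
        s == "*" or s == c for s, c in zip(schema, string)
    )


def order(schema):
    """Number of fixed (non-wildcard) positions."""
    return sum(1 for s in schema if s != "*")


def defining_length(schema):
    """Distance between the first and last fixed positions."""
    fixed = [i for i, s in enumerate(schema) if s != "*"]
    return fixed[-1] - fixed[0] if fixed else 0


# Hypothetical population of bit strings and a schema to count in it.
population = ["10110", "11100", "00111", "10010"]
schema = "1**1*"
count = sum(matches(schema, s) for s in population)
```

Schema theorems are statements about how counts like `count` are expected to change from one generation to the next, usually in terms of the schema's order and defining length.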
A Neat Approach To Genetic Programming
The evolution of explicitly represented topologies such as graphs involves devising methods for mutating, comparing and combining structures in meaningful ways, and identifying and maintaining the necessary topological diversity. Research has been conducted on the evolution of trees in genetic programming and on the evolution of neural networks, and some of these problems have been addressed independently by the different research communities. In the domain of neural networks, NEAT (Neuroevolution of Augmenting Topologies) has been shown to be a successful method for evolving increasingly complex networks. This system's success is based on three interrelated elements: speciation, marking of historical information in topologies, and initializing the search in a space of small structures. This provides the dynamics necessary for the exploration of diverse solution spaces at once and a way to discriminate between different structures. Although different representations have emerged in the area of genetic programming, the study of the tree representation has remained of interest, in great part because of its mapping to programming languages and also because of the observed phenomenon of unnecessary code growth, or bloat, which hinders performance. The structural similarity between trees and neural networks poses an interesting question: is it possible to apply the techniques from NEAT to the evolution of trees and, if so, how does it affect performance and the dynamics of code growth? In this work we address these questions and present techniques for genetic programming analogous to those in NEAT.
Network intrusion detection using genetic programming.
Masters Degree. University of KwaZulu-Natal, Pietermaritzburg.
Network intrusion detection is a real-world problem that involves detecting intrusions on a computer network. Detecting whether a network connection is intrusive or non-intrusive is essentially a binary classification problem. However, intrusive connections can be categorised into a number of network attack classes, and the task of associating an intrusion with a particular attack class is multiclass classification.
A number of artificial intelligence techniques have been used for network intrusion detection, including Evolutionary Algorithms (EAs). This thesis investigates the application of evolutionary algorithms, namely Genetic Programming (GP), Grammatical Evolution (GE) and Multi-Expression Programming (MEP), in the network intrusion detection domain. Grammatical evolution and multi-expression programming are considered to be variants of GP. In this thesis, a comparison of the effectiveness of classifiers evolved by the three EAs within the network intrusion detection domain is performed. The comparison is performed on the publicly available KDD99 dataset. Furthermore, the effectiveness of a number of fitness functions is evaluated.
From the results obtained, standard genetic programming performs better than grammatical evolution and multi-expression programming. The findings indicate that binary classifiers evolved using standard genetic programming outperformed classifiers evolved using grammatical evolution and multi-expression programming. For multiclass classifiers, the different fitness functions produced classifiers with different characteristics, with some classifiers achieving higher detection rates for specific network intrusion attacks than for others. The findings indicate that classifiers evolved using multi-expression programming and genetic programming achieved higher detection rates than classifiers evolved using grammatical evolution.
XML-based genetic rules for scene boundary detection in a parallel processing environment
Genetic programming is based on Darwinian evolutionary theory, which suggests that the best solution for a problem can be evolved by natural selection of the fittest organisms in a population. These principles are translated into genetic programming by populating the solution space with an initial number of computer programs that can possibly solve the problem and then evolving the programs by means of mutation, reproduction and crossover until a candidate solution is found that is close to, or is, the optimal solution for the problem. The computer programs are not fully formed source code but rather a derivative that is represented as a parse tree. The initial solutions are randomly generated and set to a population size that the system can compute efficiently. Research has shown that better solutions can be obtained if 1) the population size is increased and 2) multiple runs are performed of each experiment. If multiple runs are initiated on many machines, the probability of finding an optimal solution is increased exponentially and the solution is computed more efficiently. With the proliferation of the web and high-speed bandwidth connections, genetic programming can take advantage of grid computing to increase both the population size and the number of runs by utilising machines connected to the web. Using XML-Schema as a global referencing mechanism for defining the parameters and syntax of the evolvable computer programs, all machines can synchronise ad hoc to the ever-changing environment of the solution space. Another advantage of using XML is that rules are constructed that can be transformed by XSLT or DOM tree viewers so they can be understood by the GP programmer. This allows the programmer to experiment by manipulating rules to increase the fitness of a rule and to evaluate the selection of parameters used to define a solution.
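The generic loop described above (random parse trees, then reproduction, crossover and mutation) can be sketched in a few lines. This is a minimal single-machine illustration, not the XML-based or grid-distributed system of the abstract. Every name and parameter is hypothetical, and the symbolic-regression target, the tournament selection and the depth cap are assumptions added for the example.

```python
import operator
import random

# All names, parameters and the regression target are illustrative assumptions.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0, 2.0]
MAX_DEPTH = 5  # assumed cap on tree depth so programs stay small


def random_tree(rng, max_depth):
    """Grow a random parse tree as nested tuples (op, left, right)."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(list(FUNCTIONS))
    return (op, random_tree(rng, max_depth - 1), random_tree(rng, max_depth - 1))


def evaluate(tree, x):
    """Evaluate a parse tree for a given value of the variable x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def tree_depth(tree):
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(tree_depth(tree[1]), tree_depth(tree[2]))


def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i in (1, 2):
            yield from paths(tree[i], prefix + (i,))


def subtree_at(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace_at(tree, path, subtree):
    """Return a copy of tree with the node at path replaced by subtree."""
    if not path:
        return subtree
    node = list(tree)
    node[path[0]] = replace_at(tree[path[0]], path[1:], subtree)
    return tuple(node)


def crossover(rng, a, b):
    """Subtree crossover; falls back to parent a if the child is too deep."""
    donor = subtree_at(b, rng.choice(list(paths(b))))
    child = replace_at(a, rng.choice(list(paths(a))), donor)
    return child if tree_depth(child) <= MAX_DEPTH else a


def mutate(rng, tree):
    """Subtree mutation; falls back to the original if the result is too deep."""
    child = replace_at(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))
    return child if tree_depth(child) <= MAX_DEPTH else tree


def fitness(tree, cases):
    """Sum of absolute errors over the test cases; lower is better."""
    return sum(abs(evaluate(tree, x) - y) for x, y in cases)


def tournament(rng, population, cases, k=3):
    return min(rng.sample(population, k), key=lambda t: fitness(t, cases))


def evolve(seed=0, pop_size=40, generations=10):
    rng = random.Random(seed)
    cases = [(float(x), x * x + x + 1.0) for x in range(-5, 6)]  # assumed target
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []  # best error seen in each generation
    for _ in range(generations):
        best = min(population, key=lambda t: fitness(t, cases))
        history.append(fitness(best, cases))
        children = [best]  # reproduction: the best program is copied unchanged
        while len(children) < pop_size:
            child = crossover(rng, tournament(rng, population, cases),
                              tournament(rng, population, cases))
            if rng.random() < 0.1:
                child = mutate(rng, child)
            children.append(child)
        population = children
    best = min(population, key=lambda t: fitness(t, cases))
    history.append(fitness(best, cases))
    return best, history
```

Calling `evolve(seed=1)` returns the best tree found and the best error per generation. Because the best program is always copied into the next generation, that error never increases. Independent runs with different seeds are what the abstract proposes to spread across many machines.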