    Bounding Bloat in Genetic Programming

    While many optimization problems work with a fixed number of decision variables and thus a fixed-length representation of possible solutions, genetic programming (GP) works on variable-length representations. A naturally occurring problem is that of bloat (unnecessary growth of solutions) slowing down optimization. Theoretical analyses could so far not bound bloat and required explicit assumptions on the magnitude of bloat. In this paper we analyze bloat in mutation-based genetic programming for the two test functions ORDER and MAJORITY. We overcome previous assumptions on the magnitude of bloat and give matching or close-to-matching upper and lower bounds for the expected optimization time. In particular, we show that the (1+1) GP takes (i) $\Theta(T_{init} + n \log n)$ iterations with bloat control on ORDER as well as MAJORITY; and (ii) $O(T_{init} \log T_{init} + n (\log n)^3)$ and $\Omega(T_{init} + n \log n)$ (and $\Omega(T_{init} \log T_{init})$ for $n=1$) iterations without bloat control on MAJORITY. Comment: An extended abstract has been published at GECCO 201
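
    The setting of this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the solution is flattened to a list of literals (the tree structure is ignored, which is only reasonable for MAJORITY), a variable counts as expressed when it occurs at least once and at least as often as its negation, each iteration applies one insert, delete or substitute edit, and bloat control is modelled as rejecting an equally fit but larger offspring. The names `majority`, `mutate` and `one_plus_one_gp` are hypothetical.

```python
import random
from collections import Counter


def majority(literals, n):
    """Count variables i in range(n) that are 'expressed' (assumed definition):
    x_i occurs at least once and at least as often as its negation."""
    pos = Counter(i for i, positive in literals if positive)
    neg = Counter(i for i, positive in literals if not positive)
    return sum(1 for i in range(n) if pos[i] >= 1 and pos[i] >= neg[i])


def mutate(literals, n, rng):
    """Apply one elementary edit: insert, delete, or substitute a literal."""
    child = list(literals)
    op = rng.choice(("insert", "delete", "substitute"))
    new_literal = (rng.randrange(n), rng.random() < 0.5)
    if op == "insert" or not child:
        # An empty solution can only grow.
        child.insert(rng.randrange(len(child) + 1), new_literal)
    elif op == "delete":
        child.pop(rng.randrange(len(child)))
    else:
        child[rng.randrange(len(child))] = new_literal
    return child


def one_plus_one_gp(initial, n, bloat_control, max_iters, seed=0):
    """Simplified (1+1) GP on MAJORITY; returns (solution, iterations used)."""
    rng = random.Random(seed)
    parent = list(initial)
    for iteration in range(max_iters):
        if majority(parent, n) == n:
            return parent, iteration
        child = mutate(parent, n, rng)
        f_child, f_parent = majority(child, n), majority(parent, n)
        if f_child > f_parent:
            parent = child
        elif f_child == f_parent:
            # Assumed bloat control: on ties, never accept a larger offspring.
            if not bloat_control or len(child) <= len(parent):
                parent = child
    return parent, max_iters


if __name__ == "__main__":
    for control in (True, False):
        best, iters = one_plus_one_gp([], n=5, bloat_control=control, max_iters=20000)
        print(control, majority(best, 5), len(best), iters)
```

    Comparing the final solution sizes for the two settings gives a rough feel for what bloat control changes; a toy run of this kind says nothing about the asymptotic bounds stated in the abstract.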

    A Field Guide to Genetic Programming

    xiv, 233 p. : ill. ; 23 cm. E-book.

    A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge, all without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions.

    Contents:
    1 Introduction: 1.1 Genetic Programming in a Nutshell; 1.2 Getting Started; 1.3 Prerequisites; 1.4 Overview of this Field Guide
    Part I: Basics
    2 Representation, Initialisation and Operators in Tree-based GP: 2.1 Representation; 2.2 Initialising the Population; 2.3 Selection; 2.4 Recombination and Mutation
    3 Getting Ready to Run Genetic Programming: 3.1 Step 1: Terminal Set; 3.2 Step 2: Function Set (3.2.1 Closure; 3.2.2 Sufficiency; 3.2.3 Evolving Structures other than Programs); 3.3 Step 3: Fitness Function; 3.4 Step 4: GP Parameters; 3.5 Step 5: Termination and solution designation
    4 Example Genetic Programming Run: 4.1 Preparatory Steps; 4.2 Step-by-Step Sample Run (4.2.1 Initialisation; 4.2.2 Fitness Evaluation; 4.2.3 Selection, Crossover and Mutation; 4.2.4 Termination and Solution Designation)
    Part II: Advanced Genetic Programming
    5 Alternative Initialisations and Operators in Tree-based GP: 5.1 Constructing the Initial Population (5.1.1 Uniform Initialisation; 5.1.2 Initialisation may Affect Bloat; 5.1.3 Seeding); 5.2 GP Mutation (5.2.1 Is Mutation Necessary?; 5.2.2 Mutation Cookbook); 5.3 GP Crossover; 5.4 Other Techniques
    6 Modular, Grammatical and Developmental Tree-based GP: 6.1 Evolving Modular and Hierarchical Structures (6.1.1 Automatically Defined Functions; 6.1.2 Program Architecture and Architecture-Altering); 6.2 Constraining Structures (6.2.1 Enforcing Particular Structures; 6.2.2 Strongly Typed GP; 6.2.3 Grammar-based Constraints; 6.2.4 Constraints and Bias); 6.3 Developmental Genetic Programming; 6.4 Strongly Typed Autoconstructive GP with PushGP
    7 Linear and Graph Genetic Programming: 7.1 Linear Genetic Programming (7.1.1 Motivations; 7.1.2 Linear GP Representations; 7.1.3 Linear GP Operators); 7.2 Graph-Based Genetic Programming (7.2.1 Parallel Distributed GP (PDGP); 7.2.2 PADO; 7.2.3 Cartesian GP; 7.2.4 Evolving Parallel Programs using Indirect Encodings)
    8 Probabilistic Genetic Programming: 8.1 Estimation of Distribution Algorithms; 8.2 Pure EDA GP; 8.3 Mixing Grammars and Probabilities
    9 Multi-objective Genetic Programming: 9.1 Combining Multiple Objectives into a Scalar Fitness Function; 9.2 Keeping the Objectives Separate (9.2.1 Multi-objective Bloat and Complexity Control; 9.2.2 Other Objectives; 9.2.3 Non-Pareto Criteria); 9.3 Multiple Objectives via Dynamic and Staged Fitness Functions; 9.4 Multi-objective Optimisation via Operator Bias
    10 Fast and Distributed Genetic Programming: 10.1 Reducing Fitness Evaluations/Increasing their Effectiveness; 10.2 Reducing Cost of Fitness with Caches; 10.3 Parallel and Distributed GP are Not Equivalent; 10.4 Running GP on Parallel Hardware (10.4.1 Master–slave GP; 10.4.2 GP Running on GPUs; 10.4.3 GP on FPGAs; 10.4.4 Sub-machine-code GP); 10.5 Geographically Distributed GP
    11 GP Theory and its Applications: 11.1 Mathematical Models; 11.2 Search Spaces; 11.3 Bloat (11.3.1 Bloat in Theory; 11.3.2 Bloat Control in Practice)
    Part III: Practical Genetic Programming
    12 Applications: 12.1 Where GP has Done Well; 12.2 Curve Fitting, Data Modelling and Symbolic Regression; 12.3 Human Competitive Results – the Humies; 12.4 Image and Signal Processing; 12.5 Financial Trading, Time Series, and Economic Modelling; 12.6 Industrial Process Control; 12.7 Medicine, Biology and Bioinformatics; 12.8 GP to Create Searchers and Solvers – Hyper-heuristics; 12.9 Entertainment and Computer Games; 12.10 The Arts; 12.11 Compression
    13 Troubleshooting GP: 13.1 Is there a Bug in the Code?; 13.2 Can you Trust your Results?; 13.3 There are No Silver Bullets; 13.4 Small Changes can have Big Effects; 13.5 Big Changes can have No Effect; 13.6 Study your Populations; 13.7 Encourage Diversity; 13.8 Embrace Approximation; 13.9 Control Bloat; 13.10 Checkpoint Results; 13.11 Report Well; 13.12 Convince your Customers
    14 Conclusions
    Part IV: Tricks of the Trade
    A Resources: A.1 Key Books; A.2 Key Journals; A.3 Key International Meetings; A.4 GP Implementations; A.5 On-Line Resources
    B TinyGP: B.1 Overview of TinyGP; B.2 Input Data Files for TinyGP; B.3 Source Code; B.4 Compiling and Running TinyGP
    Bibliography
    Inde

    Simplification of genetic programs: a literature survey

    Genetic programming (GP), a widely used evolutionary computing technique, suffers from bloat: the problem of excessive growth in individuals' sizes. As a result, its ability to efficiently explore complex search spaces is reduced, and the resulting solutions are less robust and generalisable. Moreover, models that contain bloat are difficult to understand and explain. This phenomenon is well researched, primarily from the angle of controlling bloat. Our focus in this paper is instead to review the literature from an explainability point of view, by looking at how simplification can make GP models more explainable by reducing their sizes. Simplification is a code editing technique whose primary purpose is to make GP models more explainable. However, it can offer bloat control as an additional benefit when implemented and applied with caution. Researchers have proposed several simplification techniques and adopted various strategies to implement them. We organise the literature along multiple axes to identify the relative strengths and weaknesses of simplification techniques and to identify emerging trends and areas for future exploration. We highlight design and integration challenges and propose several avenues for research. One of them is to consider simplification as a standalone operator, rather than an extension of the standard crossover or mutation operators. Its role is then more clearly complementary to other GP operators, and it can be integrated as an optional feature into an existing GP setup. Another proposed avenue is to explore the lack of utilisation of complexity measures in simplification. So far, size is the most discussed measure, with only two pieces of prior work pointing out the benefits of using time as a measure when controlling bloat.
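
    As a rough illustration of simplification as a standalone code editing step, the sketch below applies a few algebraic identities to an expression tree and compares sizes before and after. It is not a technique taken from the survey: the nested-tuple representation `(op, left, right)`, the chosen rewrite rules, and the names `simplify` and `size` are all assumptions made for this example.

```python
def simplify(expr):
    """Recursively apply a few algebraic identities to a nested-tuple
    expression of the form (op, left, right), with op in {'+', '-', '*'}.
    Terminals are variable names (str) or numeric constants."""
    if not isinstance(expr, tuple):
        return expr  # terminal: nothing to simplify
    op, left, right = expr
    left, right = simplify(left), simplify(right)

    # Constant folding: both children are numbers.
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        if op == "+":
            return left + right
        if op == "-":
            return left - right
        if op == "*":
            return left * right

    if op == "+":
        if left == 0:
            return right
        if right == 0:
            return left
    elif op == "-":
        if right == 0:
            return left
        if left == right:
            return 0  # e - e = 0 for structurally identical subtrees
    elif op == "*":
        if left == 0 or right == 0:
            return 0
        if left == 1:
            return right
        if right == 1:
            return left
    return (op, left, right)


def size(expr):
    """Number of nodes in the expression tree."""
    if not isinstance(expr, tuple):
        return 1
    return 1 + size(expr[1]) + size(expr[2])


if __name__ == "__main__":
    bloated = ("+", ("*", "x", 1), ("-", ("*", "y", ("+", 2, 3)), ("*", 0, "z")))
    print(size(bloated), simplify(bloated), size(simplify(bloated)))
```

    Because `simplify` takes a tree and returns a tree without touching selection, crossover or mutation, it could in principle be switched on or off independently of the rest of a GP setup, which is the kind of optional integration the abstract argues for.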

    Application of multiobjective genetic programming to the design of robot failure recognition systems

    We present an evolutionary approach using multiobjective genetic programming (MOGP) to derive optimal feature extraction preprocessing stages for robot failure detection. This data-driven machine learning method is compared both with conventional (nonevolutionary) classifiers and a set of domain-dependent feature extraction methods. We conclude MOGP is an effective and practical design method for failure recognition systems with enhanced recognition accuracy over conventional classifiers, independent of domain knowledge

    It is Time for New Perspectives on How to Fight Bloat in GP

    The present and future of evolutionary algorithms depend on the proper use of modern parallel and distributed computing infrastructures. Although sequential approaches still dominate the landscape, available multi-core, many-core and distributed systems will lead users and researchers to deploy parallel versions of the algorithms more frequently. In such a scenario, new possibilities arise regarding the time saved when individuals are evaluated in parallel. This time saving is particularly relevant in Genetic Programming. This paper studies how evaluation time not only influences time to solution in parallel/distributed systems, but may also affect the size evolution of individuals in the population and eventually reduce the bloat phenomenon that GP exhibits. The paper considers time and space as two sides of a single coin when devising a more natural method for fighting bloat. This new perspective shows that new methods for bloat control can be derived, and the first such method is described and tested. Experimental data confirm the strength of the approach: using computing time as a measure of individuals' complexity makes it possible to control the growth in size of genetic programming individuals.

    On-the-fly simplification of genetic programming models

    The last decade has seen amazing performance improvements in deep learning. However, the black-box nature of this approach makes it difficult to provide explanations of the generated models. In some fields, such as psychology and neuroscience, this limitation in explainability and interpretability is an important issue. Approaches such as genetic programming are well positioned to take the lead in these fields because of their inherent white-box nature. Genetic programming, inspired by the Darwinian theory of evolution, is a population-based search technique capable of exploring a high-dimensional search space intelligently and discovering multiple solutions. However, it is prone to generating very large solutions, a phenomenon often called "bloat". Bloated solutions are not easily understandable. In this paper, we propose two techniques for simplifying the generated models. Both techniques are tested by generating models for a well-known psychology experiment. The validity of these techniques is further tested by applying them to a symbolic regression problem. Several population dynamics are studied to make sure that these techniques do not compromise diversity, an important measure for finding better solutions. The results indicate that the two techniques can be applied both independently and simultaneously, and that they are capable of finding solutions on par with those generated by the standard GP algorithm, but with significantly reduced program size. There was no loss in diversity and no reduction in overall fitness. In fact, in some experiments, the two techniques even improved fitness.

    Automatic synthesis of sorting algorithms by gene expression programming + (geometric) semantic gene expression programming + encouraging phenotype variation with a new semantic operator: semantic conditional crossover

    Gene Expression Programming (GEP) is an alternative to Genetic Programming (GP). Given its characteristics compared to GP, we ask whether GEP should be the standard choice for evolutionary program synthesis, both as a base for research and for practical application. We also ask whether such a shift could increase the rate of investigation, the applicability and the quality of results obtained from evolutionary techniques for code optimization. We present three distinct and unprecedented studies using GEP in an attempt to develop understanding, investigate its potential and advance the field. Each study makes an individual contribution involving GEP. As a whole, the three studies investigate different aspects that might be critical to answering these questions. In the first contribution, we investigate GEP's applicability to automatically synthesizing sorting algorithms. To the best of our knowledge, this is the first time this problem has been addressed with GEP. Performance is compared against GP under similar experimental conditions, so that the characteristics of each algorithm are the distinguishing factor. GEP is shown to be capable of producing sorting algorithms and outperforms GP in doing so. In the second contribution, we enhance GEP's evolutionary process with semantic awareness of candidate programs, giving rise to Semantic Gene Expression Programming (SGEP), similarly to how Semantic Genetic Programming (SGP) builds on GP. Geometric semantic concepts are then introduced to SGEP, forming Geometric Semantic Gene Expression Programming (GSGEP). A comparative experiment between GP, GEP, SGP and SGEP is performed using different problems and setup combinations. Results were mixed when comparing SGEP and SGP, suggesting performance is significantly related to the problem addressed. By outperforming the alternatives in many of the benchmarks, SGEP demonstrates practical potential. The results are analyzed from different perspectives, also providing insight into the potential of different crossover variations when applied with GP/GEP. GEP is shown to be compatible with innovations developed for GP without extensive adaptation. Considerations for the integration of SGEP are discussed. In the last contribution, a new semantic operator is proposed, Semantic Conditional Crossover (SCC), which applies crossover only when the two candidate elements are semantically different enough; otherwise one of them is mutated and both are reintroduced into the population. The strategy attempts to encourage semantic diversity and widen the portion of the semantic solution space searched. A practical experiment compared otherwise similar evolutionary processes with and without SCC. When using the operator, the quality of obtained solutions alternated between slight improvements and declines. The results show no relevant indication of an advantage from its use and do not confirm what was expected in theory. We discuss ways in which further work might investigate this concept and assess whether it has practical potential under different circumstances. With regard to the central question of this investigation, the development and testing of SCC was performed entirely on a GEP/SGEP base, and the theory behind SCC is compatible with GP, suggesting that GEP can be used as the base for future research on evolutionary program synthesis.
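
    The conditional operator described in this abstract (crossover when two candidates are semantically far apart, mutation otherwise) can be sketched as follows. This is an assumption-laden toy, not the thesis's implementation: programs are reduced to coefficient pairs `(a, b)` standing for `a*x + b`, semantic distance is the summed absolute difference of outputs on a fixed input set, and `threshold`, `swap_crossover` and `jitter_mutation` are hypothetical names and choices.

```python
import random


def semantics(program, inputs):
    """Output vector of a toy program (a, b), interpreted as a*x + b."""
    a, b = program
    return [a * x + b for x in inputs]


def semantic_distance(sem_a, sem_b):
    """Assumed distance: sum of absolute output differences."""
    return sum(abs(u - v) for u, v in zip(sem_a, sem_b))


def swap_crossover(parent_a, parent_b, rng):
    """Toy crossover: exchange the second coefficient (rng unused here)."""
    return (parent_a[0], parent_b[1]), (parent_b[0], parent_a[1])


def jitter_mutation(program, rng):
    """Toy mutation: perturb one randomly chosen coefficient."""
    coeffs = list(program)
    index = rng.randrange(len(coeffs))
    coeffs[index] += rng.uniform(-1.0, 1.0)
    return tuple(coeffs)


def semantic_conditional_crossover(parent_a, parent_b, inputs, threshold, rng):
    """Cross over only if the parents are semantically different enough;
    otherwise mutate one parent and return both to the population."""
    distance = semantic_distance(semantics(parent_a, inputs),
                                 semantics(parent_b, inputs))
    if distance >= threshold:
        return swap_crossover(parent_a, parent_b, rng), "crossover"
    return (jitter_mutation(parent_a, rng), parent_b), "mutation"


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [0, 1, 2]
    print(semantic_conditional_crossover((1.0, 0.0), (3.0, 2.0), xs, 1.0, rng))
    print(semantic_conditional_crossover((1.0, 0.0), (1.0, 0.0), xs, 1.0, rng))
```

    The threshold is the main design knob in such a scheme: set too high, the operator degenerates into mutation only, and set too low it behaves like plain crossover.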