8 research outputs found

    Simplification of genetic programs: a literature survey

    Genetic programming (GP), a widely used evolutionary computing technique, suffers from bloat: the problem of excessive growth in the size of individuals. As a result, its ability to explore complex search spaces efficiently is reduced, and the resulting solutions are less robust and generalisable. Moreover, it is difficult to understand and explain models which contain bloat. This phenomenon is well researched, primarily from the angle of controlling bloat. Our focus in this paper is instead to review the literature from an explainability point of view, by looking at how simplification can make GP models more explainable by reducing their sizes. Simplification is a code editing technique whose primary purpose is to make GP models more explainable. However, it can offer bloat control as an additional benefit when implemented and applied with caution. Researchers have proposed several simplification techniques and adopted various strategies to implement them. We organise the literature along multiple axes to identify the relative strengths and weaknesses of simplification techniques and to identify emerging trends and areas for future exploration. We highlight design and integration challenges and propose several avenues for research. One of them is to consider simplification as a standalone operator, rather than an extension of the standard crossover or mutation operators. Its role is then more clearly complementary to other GP operators, and it can be integrated as an optional feature into an existing GP setup. Another proposed avenue is to explore the lack of utilisation of complexity measures in simplification. So far, size is the most discussed measure, with only two pieces of prior work pointing out the benefits of using time as a measure when controlling bloat.
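To make the idea of simplification as a standalone code-editing step concrete, here is a minimal sketch of algebraic simplification on a tree-based GP individual. It is not taken from the survey: the `Node` class, the `size` and `simplify` functions, the restriction to `+` and `*`, and the particular rewrite rules (constant folding, `x + 0`, `x * 1`, `x * 0`) are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Node:
    """Hypothetical binary expression node; op is "+" or "*"."""
    op: str
    left: "Tree"
    right: "Tree"


# A tree is an internal node, a variable name (str), or a constant (float).
Tree = Union[Node, str, float]


def size(tree: Tree) -> int:
    """Count nodes, the size measure most often discussed for bloat."""
    if isinstance(tree, Node):
        return 1 + size(tree.left) + size(tree.right)
    return 1


def simplify(tree: Tree) -> Tree:
    """Apply a few algebraic rewrite rules bottom-up (assumed rule set)."""
    if not isinstance(tree, Node):
        return tree
    left, right = simplify(tree.left), simplify(tree.right)
    # Constant folding: both children are numeric constants.
    if isinstance(left, float) and isinstance(right, float):
        return left + right if tree.op == "+" else left * right
    if tree.op == "+":
        if left == 0.0:   # 0 + x -> x
            return right
        if right == 0.0:  # x + 0 -> x
            return left
    if tree.op == "*":
        # x * 0 -> 0 (assumes x evaluates to a finite number)
        if left == 0.0 or right == 0.0:
            return 0.0
        if left == 1.0:   # 1 * x -> x
            return right
        if right == 1.0:  # x * 1 -> x
            return left
    return Node(tree.op, left, right)


# (x * 1) + ((y + 2) * 0) behaves exactly like x
bloated = Node("+", Node("*", "x", 1.0), Node("*", Node("+", "y", 2.0), 0.0))
simplified = simplify(bloated)
print(size(bloated), "->", size(simplified))  # prints: 9 -> 1
```

Because `simplify` only reads and returns trees, it could be called on an individual independently of crossover and mutation, which is the sense in which the abstract describes simplification as a separate, optional operator.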

    A Field Guide to Genetic Programming

    xiv, 233 p. : ill. ; 23 cm. Electronic book.
    A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The authors ...
    Contents:
    1 Introduction -- 1.1 Genetic Programming in a Nutshell; 1.2 Getting Started; 1.3 Prerequisites; 1.4 Overview of this Field Guide
    I Basics
    2 Representation, Initialisation and Operators in Tree-based GP -- 2.1 Representation; 2.2 Initialising the Population; 2.3 Selection; 2.4 Recombination and Mutation
    3 Getting Ready to Run Genetic Programming -- 3.1 Step 1: Terminal Set; 3.2 Step 2: Function Set (3.2.1 Closure; 3.2.2 Sufficiency; 3.2.3 Evolving Structures other than Programs); 3.3 Step 3: Fitness Function; 3.4 Step 4: GP Parameters; 3.5 Step 5: Termination and solution designation
    4 Example Genetic Programming Run -- 4.1 Preparatory Steps; 4.2 Step-by-Step Sample Run (4.2.1 Initialisation; 4.2.2 Fitness Evaluation; 4.2.3 Selection, Crossover and Mutation; 4.2.4 Termination and Solution Designation)
    II Advanced Genetic Programming
    5 Alternative Initialisations and Operators in Tree-based GP -- 5.1 Constructing the Initial Population (5.1.1 Uniform Initialisation; 5.1.2 Initialisation may Affect Bloat; 5.1.3 Seeding); 5.2 GP Mutation (5.2.1 Is Mutation Necessary?; 5.2.2 Mutation Cookbook); 5.3 GP Crossover; 5.4 Other Techniques
    6 Modular, Grammatical and Developmental Tree-based GP -- 6.1 Evolving Modular and Hierarchical Structures (6.1.1 Automatically Defined Functions; 6.1.2 Program Architecture and Architecture-Altering Operations); 6.2 Constraining Structures (6.2.1 Enforcing Particular Structures; 6.2.2 Strongly Typed GP; 6.2.3 Grammar-based Constraints; 6.2.4 Constraints and Bias); 6.3 Developmental Genetic Programming; 6.4 Strongly Typed Autoconstructive GP with PushGP
    7 Linear and Graph Genetic Programming -- 7.1 Linear Genetic Programming (7.1.1 Motivations; 7.1.2 Linear GP Representations; 7.1.3 Linear GP Operators); 7.2 Graph-Based Genetic Programming (7.2.1 Parallel Distributed GP (PDGP); 7.2.2 PADO; 7.2.3 Cartesian GP; 7.2.4 Evolving Parallel Programs using Indirect Encodings)
    8 Probabilistic Genetic Programming -- 8.1 Estimation of Distribution Algorithms; 8.2 Pure EDA GP; 8.3 Mixing Grammars and Probabilities
    9 Multi-objective Genetic Programming -- 9.1 Combining Multiple Objectives into a Scalar Fitness Function; 9.2 Keeping the Objectives Separate (9.2.1 Multi-objective Bloat and Complexity Control; 9.2.2 Other Objectives; 9.2.3 Non-Pareto Criteria); 9.3 Multiple Objectives via Dynamic and Staged Fitness Functions; 9.4 Multi-objective Optimisation via Operator Bias
    10 Fast and Distributed Genetic Programming -- 10.1 Reducing Fitness Evaluations/Increasing their Effectiveness; 10.2 Reducing Cost of Fitness with Caches; 10.3 Parallel and Distributed GP are Not Equivalent; 10.4 Running GP on Parallel Hardware (10.4.1 Master–slave GP; 10.4.2 GP Running on GPUs; 10.4.3 GP on FPGAs; 10.4.4 Sub-machine-code GP); 10.5 Geographically Distributed GP
    11 GP Theory and its Applications -- 11.1 Mathematical Models; 11.2 Search Spaces; 11.3 Bloat (11.3.1 Bloat in Theory; 11.3.2 Bloat Control in Practice)
    III Practical Genetic Programming
    12 Applications -- 12.1 Where GP has Done Well; 12.2 Curve Fitting, Data Modelling and Symbolic Regression; 12.3 Human Competitive Results – the Humies; 12.4 Image and Signal Processing; 12.5 Financial Trading, Time Series, and Economic Modelling; 12.6 Industrial Process Control; 12.7 Medicine, Biology and Bioinformatics; 12.8 GP to Create Searchers and Solvers – Hyper-heuristics; 12.9 Entertainment and Computer Games; 12.10 The Arts; 12.11 Compression
    13 Troubleshooting GP -- 13.1 Is there a Bug in the Code?; 13.2 Can you Trust your Results?; 13.3 There are No Silver Bullets; 13.4 Small Changes can have Big Effects; 13.5 Big Changes can have No Effect; 13.6 Study your Populations; 13.7 Encourage Diversity; 13.8 Embrace Approximation; 13.9 Control Bloat; 13.10 Checkpoint Results; 13.11 Report Well; 13.12 Convince your Customers
    14 Conclusions
    IV Tricks of the Trade
    A Resources -- A.1 Key Books; A.2 Key Journals; A.3 Key International Meetings; A.4 GP Implementations; A.5 On-Line Resources
    B TinyGP -- B.1 Overview of TinyGP; B.2 Input Data Files for TinyGP; B.3 Source Code; B.4 Compiling and Running TinyGP
    Bibliography
    Index

    Programação genética aplicada à identificação de acidentes de uma usina nuclear PWR [Genetic programming applied to the identification of accidents at a PWR nuclear power plant]

    This work presents the results of a study that evaluated the efficiency of the evolutionary computation algorithm genetic programming as a technique for optimization and feature generation in a pattern recognition system for diagnosing accidents in a pressurized water reactor nuclear power plant. The foundations of a typical pattern recognition system, the state of the art of genetic programming, and similar accident/transient diagnosis systems for nuclear power plants are also presented. Given the time evolution of seventeen operational variables for the three accident scenarios considered, plus the normal condition, the task of genetic programming was to evolve non-linear regressors combining those variables that would provide the most discriminatory information for each event. After exhaustive tests with many variable associations, genetic programming proved capable of attaining success rates of, or very close to, 100%, with quite simple parametrization of the algorithm and in very reasonable training time. Its performance is similar or even superior to that of other systems available in the scientific literature, with the additional advantage of requiring very little pretreatment of the data (sometimes none at all).

    Field Guide to Genetic Programming


    Marcação das partes do discurso usando computação evolucionária [Part-of-speech tagging using evolutionary computation]

    Part-of-speech tagging is a task of considerable importance in the field of natural language processing. Its purpose is to automatically tag the words of a text with labels that designate the appropriate parts of speech. The approach proposed in this thesis divides the problem into two tasks: a learning task and an optimization task. Algorithms from the field of evolutionary computing were adopted to tackle each of those tasks. We emphasize the use of swarm intelligence, not only for the good results achieved, but also because it is one of the first applications of such algorithms to this problem. This approach was designed with the aim of being easily extended to other natural language processing tasks that share characteristics with the part-of-speech tagging problem. The results obtained on English and Portuguese language corpora are among the best published.

    Genetic Programming for Natural Language Parsing

    No full text

    Multiobjective genetic programming for natural language parsing and tagging

    No full text
    Parsing and tagging are very important tasks in Natural Language Processing. Parsing amounts to searching for the correct combination of grammatical rules among those compatible with a given sentence. Tagging amounts to labeling each word in a sentence with its lexical category and, because many words belong to more than one lexical class, it turns out to be a disambiguation task. Because parsing and tagging are related tasks, solving them simultaneously can improve the results of both. This work aims to develop a multiobjective genetic program that performs statistical parsing and tagging simultaneously. It combines the statistical data about grammar rules and about tag sequences to guide the search for the best structure. Results show that each of the implemented multiobjective optimization models improves on the results obtained when each problem is solved separately.
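As a rough illustration of keeping two objectives separate during search, the sketch below filters candidate analyses by Pareto dominance over a parsing score and a tagging score. The abstract does not say which multiobjective models were implemented, so Pareto filtering is an assumption here, and the names `dominates`, `pareto_front`, and the example scores are hypothetical.

```python
from typing import List, Tuple

# (parsing score, tagging score); higher is better for both (assumed convention).
Scores = Tuple[float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: List[Scores]) -> List[Scores]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up scores for four candidate analyses of one sentence.
candidates = [(0.9, 0.6), (0.7, 0.8), (0.6, 0.5), (0.9, 0.5)]
print(pareto_front(candidates))  # prints: [(0.9, 0.6), (0.7, 0.8)]
```

The two surviving candidates trade parsing quality against tagging quality; neither is better on both counts, so a multiobjective search would keep both rather than collapse them into a single score.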