812 research outputs found

    Controlling individuals growth in semantic genetic programming through elitist replacement

    Get PDF
    Castelli, M., Vanneschi, L., & PopoviÄŤ, A. (2016). Controlling individuals growth in semantic genetic programming through elitist replacement. Computational Intelligence And Neuroscience, 2016, [8326760]. https://doi.org/10.1155/2016/8326760In 2012, Moraglio and coauthors introduced new genetic operators for Genetic Programming, called geometric semantic genetic operators. They have the very interesting advantage of inducing a unimodal error surface for any supervised learning problem. At the same time, they have the important drawback of generating very large data models that are usually very hard to understand and interpret. The objective of this work is to alleviate this drawback, still maintaining the advantage. More in particular, we propose an elitist version of geometric semantic operators, in which offspring are accepted in the new population only if they have better fitness than their parents. We present experimental evidence, on five complex real-life test problems, that this simple idea allows us to obtain results of a comparable quality (in terms of fitness), but with much smaller data models, compared to the standard geometric semantic operators. In the final part of the paper, we also explain the reason why we consider this a significant improvement, showing that the proposed elitist operators generate manageable models, while the models generated by the standard operators are so large in size that they can be considered unmanageable.publishersversionpublishe

    Investigating the Use of Geometric Semantic Operators in Vectorial Genetic Programming

    Get PDF
    Azzali, I., Vanneschi, L., & Giacobini, M. (2020). Investigating the Use of Geometric Semantic Operators in Vectorial Genetic Programming. In T. Hu, N. Lourenço, E. Medvet, & F. Divina (Eds.), Genetic Programming - 23rd European Conference, EuroGP 2020, Held as Part of EvoStar 2020, Proceedings (pp. 52-67). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12101 LNCS). Springer. https://doi.org/10.1007/978-3-030-44094-7_4 ------- This work was partially supported by FCT, Portugal through funding of LASIGE Research Unit (UID/CEC/00408/2019), and projects PREDICT (PTDC/CCI-IF/29877/2017), BINDER (PTDC/CCI-INF/29168/2017), GADgET (DSAIPA/DS/0022/2018) and AICE (DSAIPA/DS/0113/2019).Vectorial Genetic Programming (VE_GP) is a new GP approach for panel data forecasting. Besides permitting the use of vectors as terminal symbols to represent time series and including aggregation functions to extract time series features, it introduces the possibility of evolving the window of aggregation. The local aggregation of data allows the identification of meaningful patterns overcoming the drawback of considering always the previous history of a series of data. In this work, we investigate the use of geometric semantic operators (GSOs) in VE_GP, comparing its performance with traditional GP with GSOs. Experiments are conducted on two real panel data forecasting problems, one allowing the aggregation on moving windows, one not. Results show that classical VE_GP is the best approach in both cases in terms of predictive accuracy, suggesting that GSOs are not able to evolve efficiently individuals when time series are involved. We discuss the possible reasons of this behaviour, to understand how we could design valuable GSOs for time series in the future.authorsversionpublishe

    A multi-population hybrid Genetic Programming System

    Get PDF
    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced AnalyticsIn the last few years, geometric semantic genetic programming has incremented its popularity, obtaining interesting results on several real life applications. Nevertheless, the large size of the solutions generated by geometric semantic genetic programming is still an issue, in particular for those applications in which reading and interpreting the final solution is desirable. In this thesis, a new parallel and distributed genetic programming system is introduced with the objective of mitigating this drawback. The proposed system (called MPHGP, which stands for Multi-Population Hybrid Genetic Programming) is composed by two types of subpopulations, one of which runs geometric semantic genetic programming, while the other runs a standard multi-objective genetic programming algorithm that optimizes, at the same time, fitness and size of solutions. The two subpopulations evolve independently and in parallel, exchanging individuals at prefixed synchronization instants. The presented experimental results, obtained on five real-life symbolic regression applications, suggest that MPHGP is able to find solutions that are comparable, or even better, than the ones found by geometric semantic genetic programming, both on training and on unseen testing data. At the same time, MPHGP is also able to find solutions that are significantly smaller than the ones found by geometric semantic genetic programming

    An Analysis of Diversity in Genetic Programming

    Get PDF
    Genetic programming is a metaheuristic search method that uses a population of variable-length computer programs and a search strategy based on biological evolution. The idea of automatic programming has long been a goal of artificial intelligence, and genetic programming presents an intuitive method for automatically evolving programs. However, this method is not without some potential drawbacks. Search using procedural representations can be complex and inefficient. In addition, variable sized solutions can become unnecessarily large and difficult to interpret. The goal of this thesis is to understand the dynamics of genetic programming that encourages efficient and effective search. Toward this goal, the research focuses on an important property of genetic programming search: the population. The population is related to many key aspects of the genetic programming algorithm. In this programme of research, diversity is used to describe and analyse populations and their effect on search. A series of empirical investigations are carried out to better understand the genetic programming algorithm. The research begins by studying the relationship between diversity and search. The effect of increased population diversity and a metaphor of search are then examined. This is followed by an investigation into the phenomenon of increased solution size and problem difficulty. The research concludes by examining the role of diverse individuals, particularly the ability of diverse individuals to affect the search process and ways of improving the genetic programming algorithm. This thesis makes the following contributions: (1) An analysis shows the complexity of the issues of diversity and the relationship between diversity and fitness, (2) The genetic programming search process is characterised by using the concept of genetic lineages and the sampling of structures and behaviours, (3) A causal model of the varied rates of solution size increase is presented, (4) A new, tunable problem demonstrates the contribution of different population members during search, and (5) An island model is proposed to improve the search by speciating dissimilar individuals into better-suited environments. Currently, genetic programming is applied to a wide range of problems under many varied contexts. From artificial intelligence to operations research, the results presented in this thesis will benefit population-based search methods, methods based on the concepts of evolution and search methods using variable-length representations

    Nonlinear Dynamic System Identification and Model Predictive Control Using Genetic Programming

    Get PDF
    During the last century, a lot of developments have been made in research of complex nonlinear process control. As a powerful control methodology, model predictive control (MPC) has been extensively applied to chemical industrial applications. Core to MPC is a predictive model of the dynamics of the system being controlled. Most practical systems exhibit complex nonlinear dynamics, which imposes big challenges in system modelling. Being able to automatically evolve both model structure and numeric parameters, Genetic Programming (GP) shows great potential in identifying nonlinear dynamic systems. This thesis is devoted to GP based system identification and model-based control of nonlinear systems. To improve the generalization ability of GP models, a series of experiments that use semantic-based local search within a multiobjective GP framework are reported. The influence of various ways of selecting target subtrees for local search as well as different methods for performing that search were investigated; a comparison with the Random Desired Operator (RDO) of Pawlak et al. was made by statistical hypothesis testing. Compared with the corresponding baseline GP algorithms, models produced by a standard steady state or generational GP followed by a carefully-designed single-objective GP implementing semantic-based local search are statistically more accurate and with smaller (or equal) tree size, compared with the RDO-based GP algorithms. Considering the practical application, how to correctly and efficiently apply an evolved GP model to other larger systems is a critical research concern. Currently, the replication of GP models is normally done by repeating other’s work given the necessary algorithm parameters. However, due to the empirical and stochastic nature of GP, it is difficult to completely reproduce research findings. An XML-based standard file format, named Genetic Programming Markup Language (GPML), is proposed for the interchange of GP trees. A formal definition of this standard and details of implementation are described. GPML provides convenience and modularity for further applications based on GP models. The large-scale adoption of MPC in buildings is not economically viable due to the time and cost involved in designing and adjusting predictive models by expert control engineers. A GP-based control framework is proposed for automatically evolving dynamic nonlinear models for the MPC of buildings. An open-loop system identification was conducted using the data generated by a building simulator, and the obtained GP model was then employed to construct the predictive model for the MPC. The experimental result shows GP is able to produce models that allow the MPC of building to achieve the desired temperature band in a single zone space

    Evolving Graphs by Graph Programming

    Get PDF
    Graphs are a ubiquitous data structure in computer science and can be used to represent solutions to difficult problems in many distinct domains. This motivates the use of Evolutionary Algorithms to search over graphs and efficiently find approximate solutions. However, existing techniques often represent and manipulate graphs in an ad-hoc manner. In contrast, rule-based graph programming offers a formal mechanism for describing relations over graphs. This thesis proposes the use of rule-based graph programming for representing and implementing genetic operators over graphs. We present the Evolutionary Algorithm Evolving Graphs by Graph Programming and a number of its extensions which are capable of learning stateful and stateless digital circuits, symbolic expressions and Artificial Neural Networks. We demonstrate that rule-based graph programming may be used to implement new and effective constraint-respecting mutation operators and show that these operators may strictly generalise others found in the literature. Through our proposal of Semantic Neutral Drift, we accelerate the search process by building plateaus into the fitness landscape using domain knowledge of equivalence. We also present Horizontal Gene Transfer, a mechanism whereby graphs may be passively recombined without disrupting their fitness. Through rigorous evaluation and analysis of over 20,000 independent executions of Evolutionary Algorithms, we establish numerous benefits of our approach. We find that on many problems, Evolving Graphs by Graph Programming and its variants may significantly outperform other approaches from the literature. Additionally, our empirical results provide further evidence that neutral drift aids the efficiency of evolutionary search

    From examples to knowledge in model-driven engineering : a holistic and pragmatic approach

    Full text link
    Le Model-Driven Engineering (MDE) est une approche de développement logiciel qui propose d’élever le niveau d’abstraction des langages afin de déplacer l’effort de conception et de compréhension depuis le point de vue des programmeurs vers celui des décideurs du logiciel. Cependant, la manipulation de ces représentations abstraites, ou modèles, est devenue tellement complexe que les moyens traditionnels ne suffisent plus à automatiser les différentes tâches. De son côté, le Search-Based Software Engineering (SBSE) propose de reformuler l’automatisation des tâches du MDE comme des problèmes d’optimisation. Une fois reformulé, la résolution du problème sera effectuée par des algorithmes métaheuristiques. Face à la pléthore d’études sur le sujet, le pouvoir d’automatisation du SBSE n’est plus à démontrer. C’est en s’appuyant sur ce constat que la communauté du Example-Based MDE (EBMDE) a commencé à utiliser des exemples d’application pour alimenter la reformulation SBSE du problème d’apprentissage de tâche MDE. Dans ce contexte, la concordance de la sortie des solutions avec les exemples devient un baromètre efficace pour évaluer l’aptitude d’une solution à résoudre une tâche. Cette mesure a prouvé être un objectif sémantique de choix pour guider la recherche métaheuristique de solutions. Cependant, s’il est communément admis que la représentativité des exemples a un impact sur la généralisabilité des solutions, l'étude de cet impact souffre d’un manque de considération flagrant. Dans cette thèse, nous proposons une formulation globale du processus d'apprentissage dans un contexte MDE incluant une méthodologie complète pour caractériser et évaluer la relation qui existe entre la généralisabilité des solutions et deux propriétés importantes des exemples, leur taille et leur couverture. Nous effectuons l’analyse empirique de ces deux propriétés et nous proposons un plan détaillé pour une analyse plus approfondie du concept de représentativité, ou d’autres représentativités.Model-Driven Engineering (MDE) is a software development approach that proposes to raise the level of abstraction of languages in order to shift the design and understanding effort from a programmer point of view to the one of decision makers. However, the manipulation of these abstract representations, or models, has become so complex that traditional techniques are not enough to automate its inherent tasks. For its part, the Search-Based Software Engineering (SBSE) proposes to reformulate the automation of MDE tasks as optimization problems. Once reformulated, the problem will be solved by metaheuristic algorithms. With a plethora of studies on the subject, the power of automation of SBSE has been well established. Based on this observation, the Example-Based MDE community (EB-MDE) started using application examples to feed the reformulation into SBSE of the MDE task learning problem. In this context, the concordance of the output of the solutions with the examples becomes an effective barometer for evaluating the ability of a solution to solve a task. This measure has proved to be a semantic goal of choice to guide the metaheuristic search for solutions. However, while it is commonly accepted that the representativeness of the examples has an impact on the generalizability of the solutions, the study of this impact suffers from a flagrant lack of consideration. In this thesis, we propose a thorough formulation of the learning process in an MDE context including a complete methodology to characterize and evaluate the relation that exists between two important properties of the examples, their size and coverage, and the generalizability of the solutions. We perform an empirical analysis, and propose a detailed plan for further investigation of the concept of representativeness, or of other representativities

    Towards identifying salient patterns in genetic programming individuals

    Get PDF
    This thesis addresses the problem of offline identification of salient patterns in genetic programming individuals. It discusses the main issues related to automatic pattern identification systems, namely that these (a) should help in understanding the final solutions of the evolutionary run, (b) should give insight into the course of evolution and (c) should be helpful in optimizing future runs. Moreover, it proposes an algorithm, Extended Pattern Growing Algorithm ([E]PGA) to extract, filter and sort the identified patterns so that these fulfill as many as possible of the following criteria: (a) they are representative for the evolutionary run and/or search space, (b) they are human-friendly and (c) their numbers are within reasonable limits. The results are demonstrated on six problems from different domains

    A Field Guide to Genetic Programming

    Get PDF
    xiv, 233 p. : il. ; 23 cm.Libro ElectrónicoA Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The authorsIntroduction -- Representation, initialisation and operators in Tree-based GP -- Getting ready to run genetic programming -- Example genetic programming run -- Alternative initialisations and operators in Tree-based GP -- Modular, grammatical and developmental Tree-based GP -- Linear and graph genetic programming -- Probalistic genetic programming -- Multi-objective genetic programming -- Fast and distributed genetic programming -- GP theory and its applications -- Applications -- Troubleshooting GP -- Conclusions.Contents xi 1 Introduction 1.1 Genetic Programming in a Nutshell 1.2 Getting Started 1.3 Prerequisites 1.4 Overview of this Field Guide I Basics 2 Representation, Initialisation and GP 2.1 Representation 2.2 Initialising the Population 2.3 Selection 2.4 Recombination and Mutation Operators in Tree-based 3 Getting Ready to Run Genetic Programming 19 3.1 Step 1: Terminal Set 19 3.2 Step 2: Function Set 20 3.2.1 Closure 21 3.2.2 Sufficiency 23 3.2.3 Evolving Structures other than Programs 23 3.3 Step 3: Fitness Function 24 3.4 Step 4: GP Parameters 26 3.5 Step 5: Termination and solution designation 27 4 Example Genetic Programming Run 4.1 Preparatory Steps 29 4.2 Step-by-Step Sample Run 31 4.2.1 Initialisation 31 4.2.2 Fitness Evaluation Selection, Crossover and Mutation Termination and Solution Designation Advanced Genetic Programming 5 Alternative Initialisations and Operators in 5.1 Constructing the Initial Population 5.1.1 Uniform Initialisation 5.1.2 Initialisation may Affect Bloat 5.1.3 Seeding 5.2 GP Mutation 5.2.1 Is Mutation Necessary? 5.2.2 Mutation Cookbook 5.3 GP Crossover 5.4 Other Techniques 32 5.5 Tree-based GP 39 6 Modular, Grammatical and Developmental Tree-based GP 47 6.1 Evolving Modular and Hierarchical Structures 47 6.1.1 Automatically Defined Functions 48 6.1.2 Program Architecture and Architecture-Altering 50 6.2 Constraining Structures 51 6.2.1 Enforcing Particular Structures 52 6.2.2 Strongly Typed GP 52 6.2.3 Grammar-based Constraints 53 6.2.4 Constraints and Bias 55 6.3 Developmental Genetic Programming 57 6.4 Strongly Typed Autoconstructive GP with PushGP 59 7 Linear and Graph Genetic Programming 61 7.1 Linear Genetic Programming 61 7.1.1 Motivations 61 7.1.2 Linear GP Representations 62 7.1.3 Linear GP Operators 64 7.2 Graph-Based Genetic Programming 65 7.2.1 Parallel Distributed GP (PDGP) 65 7.2.2 PADO 67 7.2.3 Cartesian GP 67 7.2.4 Evolving Parallel Programs using Indirect Encodings 68 8 Probabilistic Genetic Programming 8.1 Estimation of Distribution Algorithms 69 8.2 Pure EDA GP 71 8.3 Mixing Grammars and Probabilities 74 9 Multi-objective Genetic Programming 75 9.1 Combining Multiple Objectives into a Scalar Fitness Function 75 9.2 Keeping the Objectives Separate 76 9.2.1 Multi-objective Bloat and Complexity Control 77 9.2.2 Other Objectives 78 9.2.3 Non-Pareto Criteria 80 9.3 Multiple Objectives via Dynamic and Staged Fitness Functions 80 9.4 Multi-objective Optimisation via Operator Bias 81 10 Fast and Distributed Genetic Programming 83 10.1 Reducing Fitness Evaluations/Increasing their Effectiveness 83 10.2 Reducing Cost of Fitness with Caches 86 10.3 Parallel and Distributed GP are Not Equivalent 88 10.4 Running GP on Parallel Hardware 89 10.4.1 Master–slave GP 89 10.4.2 GP Running on GPUs 90 10.4.3 GP on FPGAs 92 10.4.4 Sub-machine-code GP 93 10.5 Geographically Distributed GP 93 11 GP Theory and its Applications 97 11.1 Mathematical Models 98 11.2 Search Spaces 99 11.3 Bloat 101 11.3.1 Bloat in Theory 101 11.3.2 Bloat Control in Practice 104 III Practical Genetic Programming 12 Applications 12.1 Where GP has Done Well 12.2 Curve Fitting, Data Modelling and Symbolic Regression 12.3 Human Competitive Results – the Humies 12.4 Image and Signal Processing 12.5 Financial Trading, Time Series, and Economic Modelling 12.6 Industrial Process Control 12.7 Medicine, Biology and Bioinformatics 12.8 GP to Create Searchers and Solvers – Hyper-heuristics xiii 12.9 Entertainment and Computer Games 127 12.10The Arts 127 12.11Compression 128 13 Troubleshooting GP 13.1 Is there a Bug in the Code? 13.2 Can you Trust your Results? 13.3 There are No Silver Bullets 13.4 Small Changes can have Big Effects 13.5 Big Changes can have No Effect 13.6 Study your Populations 13.7 Encourage Diversity 13.8 Embrace Approximation 13.9 Control Bloat 13.10 Checkpoint Results 13.11 Report Well 13.12 Convince your Customers 14 Conclusions Tricks of the Trade A Resources A.1 Key Books A.2 Key Journals A.3 Key International Meetings A.4 GP Implementations A.5 On-Line Resources 145 B TinyGP 151 B.1 Overview of TinyGP 151 B.2 Input Data Files for TinyGP 153 B.3 Source Code 154 B.4 Compiling and Running TinyGP 162 Bibliography 167 Inde
    • …
    corecore