4,077 research outputs found

A Probabilistic Linear Genetic Programming with Stochastic Context-Free Grammar for solving Symbolic Regression problems

Author: Bosman P. A. N.
Poli R.
Shan Y.
Wong P. K.
Yanai K.
Publication date: 03/04/2017

Traditional Linear Genetic Programming (LGP) algorithms rely solely on the selection mechanism to guide the search. Genetic operators combine or mutate random portions of the individuals without knowing whether the result will lead to a fitter individual. Probabilistic Model Building Genetic Programming (PMB-GP) methods were proposed to overcome this issue through a probability model that captures the structure of the fit individuals and is used to sample new individuals. This work proposes the use of LGP with a Stochastic Context-Free Grammar (SCFG) whose probability distribution is updated according to the selected individuals. We propose a method for adapting the grammar to the linear representation of LGP. Tests performed with the proposed probabilistic method, and with two hybrid approaches, on several symbolic regression benchmark problems show that the results are statistically better than those obtained by traditional LGP. Comment: Genetic and Evolutionary Computation Conference (GECCO) 2017, Berlin, Germany
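
The idea of a stochastic grammar whose production probabilities are shifted toward the choices made by selected individuals can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy grammar, the function names, the depth limit and the blended update rule with learning rate `lr` are hypothetical and are not taken from the paper, which adapts the grammar to the linear LGP representation rather than to the derivation trees used here.

```python
import random
from collections import Counter

# Hypothetical toy grammar: each nonterminal maps to a list of productions.
GRAMMAR = {
    "EXPR": [("EXPR", "OP", "EXPR"), ("VAR",), ("CONST",)],
    "OP": [("+",), ("-",), ("*",)],
    "VAR": [("x",)],
    "CONST": [("1",), ("2",)],
}


def uniform_probs(grammar):
    """Start with a uniform distribution over the productions of each nonterminal."""
    return {nt: [1.0 / len(prods)] * len(prods) for nt, prods in grammar.items()}


def sample(symbol, probs, rng, depth=0, max_depth=4, used=None):
    """Expand `symbol` into terminal tokens; record (nonterminal, index) choices in `used`."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol
    prods = GRAMMAR[symbol]
    weights = list(probs[symbol])
    if symbol == "EXPR" and depth >= max_depth:
        weights[0] = 0.0  # forbid the recursive production at the depth limit
    idx = rng.choices(range(len(prods)), weights=weights)[0]
    if used is not None:
        used.append((symbol, idx))
    tokens = []
    for child in prods[idx]:
        tokens.extend(sample(child, probs, rng, depth + 1, max_depth, used))
    return tokens


def update(probs, selected_choices, lr=0.5):
    """Blend old probabilities with production frequencies among selected individuals."""
    counts = Counter(c for choices in selected_choices for c in choices)
    new_probs = {}
    for nt, old in probs.items():
        total = sum(counts[(nt, i)] for i in range(len(old)))
        if total == 0:
            new_probs[nt] = list(old)  # nonterminal unused by the selected set
            continue
        new_probs[nt] = [
            (1 - lr) * p + lr * counts[(nt, i)] / total for i, p in enumerate(old)
        ]
    return new_probs
```

In a full loop, one would sample a population, evaluate fitness, pass the recorded choices of the selected individuals to `update`, and resample from the new distribution.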

arXiv.org e-Print Archive

Crossref

Integrating Evolutionary Computation with Neural Networks

Author: Hibbs R.A.
Jain L.C.
Veelenturf L.P.J.
Vonk E.
Publication venue: IEEE
Publication date: 01/01/1995

There is tremendous interest in the development of evolutionary computation techniques, as they are well suited to optimizing functions of a large number of variables. This paper presents a brief review of evolutionary computing techniques. It also briefly discusses the hybridization of evolutionary computation and neural networks, and presents a solution to a classical problem using neural computing and evolutionary computing techniques.

CiteSeerX

University of Twente Research Information

A Field Guide to Genetic Programming

Author: Langdon William B.
McPhee Nicholas F.
Poli Ricardo
Publication venue: [S.L.] : Lulu Press (lulu.com), 2008.
Publication date: 01/01/2008

xiv, 233 p. : ill. ; 23 cm. Electronic book.
A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The authors …
Chapters: Introduction -- Representation, initialisation and operators in Tree-based GP -- Getting ready to run genetic programming -- Example genetic programming run -- Alternative initialisations and operators in Tree-based GP -- Modular, grammatical and developmental Tree-based GP -- Linear and graph genetic programming -- Probabilistic genetic programming -- Multi-objective genetic programming -- Fast and distributed genetic programming -- GP theory and its applications -- Applications -- Troubleshooting GP -- Conclusions.
Contents
1 Introduction -- 1.1 Genetic Programming in a Nutshell -- 1.2 Getting Started -- 1.3 Prerequisites -- 1.4 Overview of this Field Guide
I Basics
2 Representation, Initialisation and Operators in Tree-based GP -- 2.1 Representation -- 2.2 Initialising the Population -- 2.3 Selection -- 2.4 Recombination and Mutation
3 Getting Ready to Run Genetic Programming -- 3.1 Step 1: Terminal Set -- 3.2 Step 2: Function Set -- 3.2.1 Closure -- 3.2.2 Sufficiency -- 3.2.3 Evolving Structures other than Programs -- 3.3 Step 3: Fitness Function -- 3.4 Step 4: GP Parameters -- 3.5 Step 5: Termination and solution designation
4 Example Genetic Programming Run -- 4.1 Preparatory Steps -- 4.2 Step-by-Step Sample Run -- 4.2.1 Initialisation -- 4.2.2 Fitness Evaluation -- Selection, Crossover and Mutation -- Termination and Solution Designation
II Advanced Genetic Programming
5 Alternative Initialisations and Operators in Tree-based GP -- 5.1 Constructing the Initial Population -- 5.1.1 Uniform Initialisation -- 5.1.2 Initialisation may Affect Bloat -- 5.1.3 Seeding -- 5.2 GP Mutation -- 5.2.1 Is Mutation Necessary? -- 5.2.2 Mutation Cookbook -- 5.3 GP Crossover -- 5.4 Other Techniques
6 Modular, Grammatical and Developmental Tree-based GP -- 6.1 Evolving Modular and Hierarchical Structures -- 6.1.1 Automatically Defined Functions -- 6.1.2 Program Architecture and Architecture-Altering -- 6.2 Constraining Structures -- 6.2.1 Enforcing Particular Structures -- 6.2.2 Strongly Typed GP -- 6.2.3 Grammar-based Constraints -- 6.2.4 Constraints and Bias -- 6.3 Developmental Genetic Programming -- 6.4 Strongly Typed Autoconstructive GP with PushGP
7 Linear and Graph Genetic Programming -- 7.1 Linear Genetic Programming -- 7.1.1 Motivations -- 7.1.2 Linear GP Representations -- 7.1.3 Linear GP Operators -- 7.2 Graph-Based Genetic Programming -- 7.2.1 Parallel Distributed GP (PDGP) -- 7.2.2 PADO -- 7.2.3 Cartesian GP -- 7.2.4 Evolving Parallel Programs using Indirect Encodings
8 Probabilistic Genetic Programming -- 8.1 Estimation of Distribution Algorithms -- 8.2 Pure EDA GP -- 8.3 Mixing Grammars and Probabilities
9 Multi-objective Genetic Programming -- 9.1 Combining Multiple Objectives into a Scalar Fitness Function -- 9.2 Keeping the Objectives Separate -- 9.2.1 Multi-objective Bloat and Complexity Control -- 9.2.2 Other Objectives -- 9.2.3 Non-Pareto Criteria -- 9.3 Multiple Objectives via Dynamic and Staged Fitness Functions -- 9.4 Multi-objective Optimisation via Operator Bias
10 Fast and Distributed Genetic Programming -- 10.1 Reducing Fitness Evaluations/Increasing their Effectiveness -- 10.2 Reducing Cost of Fitness with Caches -- 10.3 Parallel and Distributed GP are Not Equivalent -- 10.4 Running GP on Parallel Hardware -- 10.4.1 Master–slave GP -- 10.4.2 GP Running on GPUs -- 10.4.3 GP on FPGAs -- 10.4.4 Sub-machine-code GP -- 10.5 Geographically Distributed GP
11 GP Theory and its Applications -- 11.1 Mathematical Models -- 11.2 Search Spaces -- 11.3 Bloat -- 11.3.1 Bloat in Theory -- 11.3.2 Bloat Control in Practice
III Practical Genetic Programming
12 Applications -- 12.1 Where GP has Done Well -- 12.2 Curve Fitting, Data Modelling and Symbolic Regression -- 12.3 Human Competitive Results – the Humies -- 12.4 Image and Signal Processing -- 12.5 Financial Trading, Time Series, and Economic Modelling -- 12.6 Industrial Process Control -- 12.7 Medicine, Biology and Bioinformatics -- 12.8 GP to Create Searchers and Solvers – Hyper-heuristics -- 12.9 Entertainment and Computer Games -- 12.10 The Arts -- 12.11 Compression
13 Troubleshooting GP -- 13.1 Is there a Bug in the Code? -- 13.2 Can you Trust your Results? -- 13.3 There are No Silver Bullets -- 13.4 Small Changes can have Big Effects -- 13.5 Big Changes can have No Effect -- 13.6 Study your Populations -- 13.7 Encourage Diversity -- 13.8 Embrace Approximation -- 13.9 Control Bloat -- 13.10 Checkpoint Results -- 13.11 Report Well -- 13.12 Convince your Customers
14 Conclusions
IV Tricks of the Trade
A Resources -- A.1 Key Books -- A.2 Key Journals -- A.3 Key International Meetings -- A.4 GP Implementations -- A.5 On-Line Resources
B TinyGP -- B.1 Overview of TinyGP -- B.2 Input Data Files for TinyGP -- B.3 Source Code -- B.4 Compiling and Running TinyGP
Bibliography
Index

Metabiblioteca-Biblioteca Digital Libros Abiertos

Automatic programming methodologies for electronic hardware fault monitoring

Author: Abraham A
Grosan C
Publication date: 01/01/2006

This paper presents three variants of Genetic Programming (GP) approaches for intelligent online performance monitoring of electronic circuits and systems. Reliability modeling of electronic circuits can best be performed with the stressor-susceptibility interaction model. A circuit or a system is considered to have failed once the stressor has exceeded the susceptibility limits. For online prediction, validated stressor vectors may be obtained by direct measurements or sensors, which after pre-processing and standardization are fed into the GP models. Empirical results are compared with artificial neural networks trained using the backpropagation algorithm and with classification and regression trees. The performance of the proposed method is evaluated by comparing the experimental results with the actual failure model values. The developed model reveals that GP could play an important role in future fault monitoring systems. This research was supported by the International Joint Research Grant of the IITA (Institute of Information Technology Assessment) foreign professor invitation program of the MIC (Ministry of Information and Communication), Korea.

ZENODO

ARPHA OAI-PMH Endpoint

ARPHA Preprints

Brunel University Research Archive

Parallel Implementation of Efficient Search Schemes for the Inference of Cancer Progression Models

Author: Antoniotti Marco
Cazzaniga Paolo
Mauri Giancarlo
Nobile Marco S.
Ramazzotti Daniele
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2016

The emergence and development of cancer is a consequence of the accumulation over time of genomic mutations involving a specific set of genes, which provides the cancer clones with a functional selective advantage. In this work, we model the order of accumulation of such mutations during the progression, which eventually leads to the disease, by means of probabilistic graphical models, i.e., Bayesian Networks (BNs). We investigate how to learn the structure of such BNs from experimental evidence by adopting a global optimization metaheuristic. In particular, we rely on Genetic Algorithms, and to strongly reduce the execution time of the inference (which can also involve multiple repetitions to collect statistically significant assessments of the data) we distribute the calculations using both multi-threading and a multi-node architecture. The results show that our approach is characterized by good accuracy and specificity; we also demonstrate its feasibility, thanks to an 84x reduction of the overall execution time with respect to a traditional sequential implementation.

arXiv.org e-Print Archive

Repository TU/e

A Probabilistic One-Step Approach to the Optimal Product Line Design Problem Using Conjoint and Cost Data

Author: Harald Hruschka
Winfried Steiner

Designing and pricing new products is one of the most critical activities for a firm, and it is well-known that taking into account consumer preferences for design decisions is essential for products later to be successful in a competitive environment (e.g., Urban and Hauser 1993). Consequently, measuring consumer preferences among multiattribute alternatives has been a primary concern in marketing research as well, and among many methodologies developed, conjoint analysis (Green and Rao 1971) has turned out to be one of the most widely used preference-based techniques for identifying and evaluating new product concepts. Moreover, a number of conjoint-based models with special focus on mathematical programming techniques for optimal product (line) design have been proposed (e.g., Zufryden 1977, 1982, Green and Krieger 1985, 1987b, 1992, Kohli and Krishnamurti 1987, Kohli and Sukumar 1990, Dobson and Kalish 1988, 1993, Balakrishnan and Jacob 1996, Chen and Hausman 2000). These models are directed at determining optimal product concepts using consumers' idiosyncratic or segment level part-worth preference functions estimated previously within a conjoint framework. Recently, Balakrishnan and Jacob (1996) have proposed the use of Genetic Algorithms (GA) to solve the problem of identifying a share maximizing single product design using conjoint data. In this paper, we follow Balakrishnan and Jacob's idea and employ and evaluate the GA approach with regard to the problem of optimal product line design. Similar to the approaches of Kohli and Sukumar (1990) and Nair et al. (1995), product lines are constructed directly from part-worths data obtained by conjoint analysis, which can be characterized as a one-step approach to product line design. 
In contrast, a two-step approach would start by first reducing the total set of feasible product profiles to a smaller set of promising items (reference set of candidate items) from which the products that constitute a product line are selected in a second step. Two-step approaches or partial models for either the first or second stage in this context have been proposed by Green and Krieger (1985, 1987a, 1987b, 1989), McBride and Zufryden (1988), Dobson and Kalish (1988, 1993) and, more recently, by Chen and Hausman (2000). Heretofore, with the sole exception of Chen and Hausman's (2000) probabilistic model, all contributors to the literature on conjoint-based product line design have employed a deterministic, first-choice model of idiosyncratic preferences. Accordingly, a consumer is assumed to choose from her/his choice set the product with maximum perceived utility with certainty. However, the first-choice rule seems too rigid an assumption for many product categories and individual choice situations, as the analyst often won't be in a position to control for all relevant variables influencing consumer behavior (e.g., situational factors). Therefore, in agreement with Chen and Hausman (2000), we incorporate a probabilistic choice rule to provide a more flexible representation of the consumer decision-making process and start from segment-specific conjoint models of the conditional multinomial logit type. Favoring the multinomial logit (MNL) model doesn't imply rejection of the widespread max-utility rule, as the MNL includes the option of mimicking this first-choice rule. We further consider profit as a firm's economic criterion to evaluate decisions and introduce fixed and variable costs for each product profile. However, the proposed methodology is flexible enough to accommodate other goals such as market share (as well as any other probabilistic choice rule). 
This model flexibility is provided by the implemented Genetic Algorithm as the underlying solver for the resulting nonlinear integer programming problem. Genetic Algorithms merely use objective function information (in the present context on expected profits of feasible product line solutions) and are easily adjustable to different objectives without the need for major algorithmic modifications. To assess the performance of the GA methodology for the product line design problem, we employ sensitivity analysis and Monte Carlo simulation. Sensitivity analysis is carried out to study the performance of the Genetic Algorithm w.r.t. varying GA parameter values (population size, crossover probability, mutation rate) and to fine-tune these values in order to provide near-optimal solutions. Based on more than 1500 sensitivity runs applied to different problem sizes ranging from 12,650 to 10,586,800 feasible product line candidate solutions, we can recommend: (a) as expected, that a larger problem size be accompanied by a larger population size, with a minimum popsize of 130 for small problems and a minimum popsize of 250 for large problems, (b) a crossover probability of at least 0.9 and (c) an unexpectedly high mutation rate of 0.05 for small/medium-sized problems and a mutation rate in the order of 0.01 for large problem sizes. Following the results of the sensitivity analysis, we evaluated the GA performance for a large set of systematically varying market scenarios and associated problem sizes. We generated problems using a 4-factorial experimental design which varied the number of attributes, the number of levels in each attribute, the number of items to be introduced by a new seller and the number of competing firms other than the new seller. The results of the Monte Carlo study, in which a total of 276 data sets were analyzed, show that the GA performs well in terms of both solution quality (near-optimal product line solutions) and CPU time. 
In particular, (a) the worst-case performance ratio of the GA observed in a single run was 96.66%, indicating that the profit of the best product line solution found by the GA was never less than 96.66% of the profit of the optimal product line, (b) the hit ratio of identifying the optimal solution was 84.78% (234 out of 276 cases) and (c) it took at most 30 seconds for the GA to converge. The option of repeating GA runs with (slightly) changed parameter settings and/or different initial populations (as opposed to many other heuristics) further improves the chances of finding the optimal solution.
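
The probabilistic choice rule and the reported hit ratio can be made concrete with a short sketch. It assumes the standard multinomial logit form, in which the choice probability of a product is proportional to the exponential of its utility; the `scale` parameter, the function name and the example utilities are hypothetical and are not taken from the paper. A large scale factor shows how the MNL can mimic the deterministic first-choice rule mentioned in the abstract.

```python
import math


def mnl_shares(utilities, scale=1.0):
    """Multinomial logit choice probabilities for one consumer segment.

    `scale` is a hypothetical sharpness parameter: the larger it is, the closer
    the rule gets to the deterministic first-choice (max-utility) rule.
    """
    scaled = [scale * u for u in utilities]
    shift = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative utilities for three competing products (made-up values).
soft = mnl_shares([1.0, 2.0, 3.0])
sharp = mnl_shares([1.0, 2.0, 3.0], scale=50.0)

# Hit ratio reported in the abstract: 234 optimal solutions out of 276 data sets.
hit_ratio_percent = round(100 * 234 / 276, 2)
```

With `scale=1.0` every product keeps a nonzero share, whereas with `scale=50.0` almost all probability mass falls on the product with the highest utility.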

Research Papers in Economics

Learning Computer Programs with the Bayesian Optimization Algorithm

Author: Looks Moshe
Loui R. P.
Publication venue: Washington University Open Scholarship
Publication date: 06/04/2005

The hierarchical Bayesian Optimization Algorithm (hBOA) [24, 25] learns bit-strings by constructing explicit centralized models of a population and using them to generate new instances. This thesis is concerned with extending hBOA to learning open-ended program trees. The new system, BOA programming (BOAP), improves on previous probabilistic model building GP systems (PMBGPs) in terms of the expressiveness and open-ended flexibility of the models learned, and hence control over the distribution of individuals generated. BOAP is studied empirically on a toy problem (learning linear functions) in various configurations, and further experimental results are presented for two real-world problems: prediction of sunspot time series, and human gene function inference.

Washington University St. Louis: Open Scholarship

Combinatorial optimization and metaheuristics

Author: Consoli S
Darby-Dowman K
Publication venue: Brunel University
Publication date: 01/01/2006

Today, combinatorial optimization is one of the youngest and most active areas of discrete mathematics. It is a branch of optimization in applied mathematics and computer science, related to operational research, algorithm theory and computational complexity theory. It sits at the intersection of several fields, including artificial intelligence, mathematics and software engineering. The growing interest in it arises from the fact that a large number of scientific and industrial problems can be formulated as abstract combinatorial optimization problems, through graphs and/or (integer) linear programs. Some of these problems have polynomial-time (“efficient”) algorithms, while most of them are NP-hard, i.e. no polynomial-time algorithm is known for them. In practice, this means that an exact solution cannot be guaranteed within reasonable time, and one has to settle for an approximate solution with known performance guarantees. Indeed, the goal of approximate methods is to find “quickly” (reasonable run-times), with “high” probability, provably “good” solutions (low error from the real optimal solution). In the last 20 years, a new kind of algorithm, commonly called metaheuristics, has emerged in this class; these algorithms combine heuristics in high-level frameworks aimed at exploring the search space efficiently and effectively. This report briefly outlines the components, concepts, advantages and disadvantages of different metaheuristic approaches from a conceptual point of view, in order to analyze their similarities and differences. The two significant forces of intensification and diversification, which largely determine the behavior of a metaheuristic, are pointed out. The report concludes by exploring the importance of hybridization and integration methods.
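
The interplay of intensification and diversification can be sketched with iterated local search, one simple metaheuristic among many. This is an illustrative toy under stated assumptions, not an algorithm taken from the report: the deceptive "trap" objective, the function names and all parameter values are hypothetical. The local search step intensifies around the current solution, and the random perturbation diversifies by jumping to a different region of the search space.

```python
import random


def trap_score(bits, block=4):
    """Deceptive toy objective: a block of all ones scores highest, but every
    other block scores better the fewer ones it contains."""
    total = 0
    for i in range(0, len(bits), block):
        ones = sum(bits[i:i + block])
        total += block if ones == block else block - 1 - ones
    return total


def local_search(bits, score):
    """Intensification: accept single-bit flips while they improve the score."""
    bits = list(bits)
    current = score(bits)
    improved = True
    while improved:
        improved = False
        for i in range(len(bits)):
            bits[i] ^= 1
            candidate = score(bits)
            if candidate > current:
                current = candidate
                improved = True
            else:
                bits[i] ^= 1  # undo a flip that did not help
    return bits, current


def perturb(bits, rng, strength):
    """Diversification: flip `strength` randomly chosen bits."""
    bits = list(bits)
    for i in rng.sample(range(len(bits)), strength):
        bits[i] ^= 1
    return bits


def iterated_local_search(n_bits=16, iterations=200, strength=4, seed=0):
    rng = random.Random(seed)
    start = [rng.randint(0, 1) for _ in range(n_bits)]
    best, best_score = local_search(start, trap_score)
    for _ in range(iterations):
        cand, cand_score = local_search(perturb(best, rng, strength), trap_score)
        if cand_score >= best_score:  # never accept a worse local optimum
            best, best_score = cand, cand_score
    return best, best_score
```

Raising `strength` shifts the balance toward diversification, while a small value keeps the search close to the incumbent solution.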

Brunel University Research Archive

Automatic Generation of Cognitive Theories using Genetic Programming

Author: Frias-Martinez E
Gobet F
Publication venue: Springer Verlag
Publication date: 05/09/2007

Cognitive neuroscience is the branch of neuroscience that studies the neural mechanisms underpinning cognition and develops theories explaining them. Within cognitive neuroscience, computational neuroscience focuses on modeling behavior, using theories expressed as computer programs. Up to now, computational theories have been formulated by neuroscientists. In this paper, we present a new approach to theory development in neuroscience: the automatic generation and testing of cognitive theories using genetic programming. Starting from experimental data, our approach evolves cognitive theories that explain “the mental program” that subjects use to solve a specific task. As an example, we have focused on a typical neuroscience experiment, the delayed-match-to-sample (DMTS) task. The main goal of our approach is to develop a tool that neuroscientists can use to develop better cognitive theories.

Brunel University Research Archive