4,077 research outputs found
A Probabilistic Linear Genetic Programming with Stochastic Context-Free Grammar for solving Symbolic Regression problems
Traditional Linear Genetic Programming (LGP) algorithms are based only on the
selection mechanism to guide the search. Genetic operators combine or mutate
random portions of the individuals, without knowing if the result will lead to
a fitter individual. Probabilistic Model Building Genetic Programming (PMB-GP)
methods were proposed to overcome this issue through a probability model that
captures the structure of the fit individuals and use it to sample new
individuals. This work proposes the use of LGP with a Stochastic Context-Free
Grammar (SCFG), that has a probability distribution that is updated according
to selected individuals. We proposed a method for adapting the grammar into the
linear representation of LGP. Tests performed with the proposed probabilistic
method, and with two hybrid approaches, on several symbolic regression
benchmark problems show that the results are statistically better than the
obtained by the traditional LGP.Comment: Genetic and Evolutionary Computation Conference (GECCO) 2017, Berlin,
German
Integrating Evolutionary Computation with Neural Networks
There is a tremendous interest in the development of the evolutionary computation techniques as they are well suited to deal with optimization of functions containing a large number of variables. This paper presents a brief review of evolutionary computing techniques. It also discusses briefly the hybridization of evolutionary computation and neural networks and presents a solution of a classical problem using neural computing and evolutionary computing technique
A Field Guide to Genetic Programming
xiv, 233 p. : il. ; 23 cm.Libro ElectrĂłnicoA Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The authorsIntroduction --
Representation, initialisation and operators in Tree-based GP --
Getting ready to run genetic programming --
Example genetic programming run --
Alternative initialisations and operators in Tree-based GP --
Modular, grammatical and developmental Tree-based GP --
Linear and graph genetic programming --
Probalistic genetic programming --
Multi-objective genetic programming --
Fast and distributed genetic programming --
GP theory and its applications --
Applications --
Troubleshooting GP --
Conclusions.Contents
xi
1 Introduction
1.1 Genetic Programming in a Nutshell
1.2 Getting Started
1.3 Prerequisites
1.4 Overview of this Field Guide I
Basics
2 Representation, Initialisation and GP
2.1 Representation
2.2 Initialising the Population
2.3 Selection
2.4 Recombination and Mutation Operators in Tree-based
3 Getting Ready to Run Genetic Programming 19
3.1 Step 1: Terminal Set 19
3.2 Step 2: Function Set 20
3.2.1 Closure 21
3.2.2 Sufficiency 23
3.2.3 Evolving Structures other than Programs 23
3.3 Step 3: Fitness Function 24
3.4 Step 4: GP Parameters 26
3.5 Step 5: Termination and solution designation 27
4 Example Genetic Programming Run
4.1 Preparatory Steps 29
4.2 Step-by-Step Sample Run 31
4.2.1 Initialisation 31
4.2.2 Fitness Evaluation Selection, Crossover and Mutation Termination and Solution Designation Advanced Genetic Programming
5 Alternative Initialisations and Operators in
5.1 Constructing the Initial Population
5.1.1 Uniform Initialisation
5.1.2 Initialisation may Affect Bloat
5.1.3 Seeding
5.2 GP Mutation
5.2.1 Is Mutation Necessary?
5.2.2 Mutation Cookbook
5.3 GP Crossover
5.4 Other Techniques 32
5.5 Tree-based GP 39
6 Modular, Grammatical and Developmental Tree-based GP 47
6.1 Evolving Modular and Hierarchical Structures 47
6.1.1 Automatically Defined Functions 48
6.1.2 Program Architecture and Architecture-Altering 50
6.2 Constraining Structures 51
6.2.1 Enforcing Particular Structures 52
6.2.2 Strongly Typed GP 52
6.2.3 Grammar-based Constraints 53
6.2.4 Constraints and Bias 55
6.3 Developmental Genetic Programming 57
6.4 Strongly Typed Autoconstructive GP with PushGP 59
7 Linear and Graph Genetic Programming 61
7.1 Linear Genetic Programming 61
7.1.1 Motivations 61
7.1.2 Linear GP Representations 62
7.1.3 Linear GP Operators 64
7.2 Graph-Based Genetic Programming 65
7.2.1 Parallel Distributed GP (PDGP) 65
7.2.2 PADO 67
7.2.3 Cartesian GP 67
7.2.4 Evolving Parallel Programs using Indirect Encodings 68
8 Probabilistic Genetic Programming
8.1 Estimation of Distribution Algorithms 69
8.2 Pure EDA GP 71
8.3 Mixing Grammars and Probabilities 74
9 Multi-objective Genetic Programming 75
9.1 Combining Multiple Objectives into a Scalar Fitness Function 75
9.2 Keeping the Objectives Separate 76
9.2.1 Multi-objective Bloat and Complexity Control 77
9.2.2 Other Objectives 78
9.2.3 Non-Pareto Criteria 80
9.3 Multiple Objectives via Dynamic and Staged Fitness Functions 80
9.4 Multi-objective Optimisation via Operator Bias 81
10 Fast and Distributed Genetic Programming 83
10.1 Reducing Fitness Evaluations/Increasing their Effectiveness 83
10.2 Reducing Cost of Fitness with Caches 86
10.3 Parallel and Distributed GP are Not Equivalent 88
10.4 Running GP on Parallel Hardware 89
10.4.1 Masterâslave GP 89
10.4.2 GP Running on GPUs 90
10.4.3 GP on FPGAs 92
10.4.4 Sub-machine-code GP 93
10.5 Geographically Distributed GP 93
11 GP Theory and its Applications 97
11.1 Mathematical Models 98
11.2 Search Spaces 99
11.3 Bloat 101
11.3.1 Bloat in Theory 101
11.3.2 Bloat Control in Practice 104
III
Practical Genetic Programming
12 Applications
12.1 Where GP has Done Well
12.2 Curve Fitting, Data Modelling and Symbolic Regression
12.3 Human Competitive Results â the Humies
12.4 Image and Signal Processing
12.5 Financial Trading, Time Series, and Economic Modelling
12.6 Industrial Process Control
12.7 Medicine, Biology and Bioinformatics
12.8 GP to Create Searchers and Solvers â Hyper-heuristics xiii
12.9 Entertainment and Computer Games 127
12.10The Arts 127
12.11Compression 128
13 Troubleshooting GP
13.1 Is there a Bug in the Code?
13.2 Can you Trust your Results?
13.3 There are No Silver Bullets
13.4 Small Changes can have Big Effects
13.5 Big Changes can have No Effect
13.6 Study your Populations
13.7 Encourage Diversity
13.8 Embrace Approximation
13.9 Control Bloat
13.10 Checkpoint Results
13.11 Report Well
13.12 Convince your Customers
14 Conclusions
Tricks of the Trade
A Resources
A.1 Key Books
A.2 Key Journals
A.3 Key International Meetings
A.4 GP Implementations
A.5 On-Line Resources 145
B TinyGP 151
B.1 Overview of TinyGP 151
B.2 Input Data Files for TinyGP 153
B.3 Source Code 154
B.4 Compiling and Running TinyGP 162
Bibliography 167
Inde
Automatic programming methodologies for electronic hardware fault monitoring
This paper presents three variants of Genetic Programming (GP) approaches for intelligent online performance monitoring of electronic circuits and systems. Reliability modeling of electronic circuits can be best performed by the Stressor - susceptibility interaction model. A circuit or a system is considered to be failed once the stressor has exceeded the susceptibility limits. For on-line prediction, validated stressor vectors may be obtained by direct measurements or sensors, which after pre-processing and standardization are fed into the GP models. Empirical results are compared with artificial neural networks trained using backpropagation algorithm and classification and regression trees. The performance of the proposed method is evaluated by comparing the experiment results with the actual failure model values. The developed model reveals that GP could play an important role for future fault monitoring systems.This research was supported by the International Joint Research Grant of the IITA (Institute of Information Technology Assessment) foreign professor invitation program of the MIC (Ministry of Information and Communication), Korea
Parallel Implementation of Efficient Search Schemes for the Inference of Cancer Progression Models
The emergence and development of cancer is a consequence of the accumulation
over time of genomic mutations involving a specific set of genes, which
provides the cancer clones with a functional selective advantage. In this work,
we model the order of accumulation of such mutations during the progression,
which eventually leads to the disease, by means of probabilistic graphic
models, i.e., Bayesian Networks (BNs). We investigate how to perform the task
of learning the structure of such BNs, according to experimental evidence,
adopting a global optimization meta-heuristics. In particular, in this work we
rely on Genetic Algorithms, and to strongly reduce the execution time of the
inference -- which can also involve multiple repetitions to collect
statistically significant assessments of the data -- we distribute the
calculations using both multi-threading and a multi-node architecture. The
results show that our approach is characterized by good accuracy and
specificity; we also demonstrate its feasibility, thanks to a 84x reduction of
the overall execution time with respect to a traditional sequential
implementation
A Probabilistic One-Step Approach to the Optimal Product Line Design Problem Using Conjoint and Cost Data
Designing and pricing new products is one of the most critical activities for a firm, and it is well-known that taking into account consumer preferences for design decisions is essential for products later to be successful in a competitive environment (e.g., Urban and Hauser 1993). Consequently, measuring consumer preferences among multiattribute alternatives has been a primary concern in marketing research as well, and among many methodologies developed, conjoint analysis (Green and Rao 1971) has turned out to be one of the most widely used preference-based techniques for identifying and evaluating new product concepts. Moreover, a number of conjoint-based models with special focus on mathematical programming techniques for optimal product (line) design have been proposed (e.g., Zufryden 1977, 1982, Green and Krieger 1985, 1987b, 1992, Kohli and Krishnamurti 1987, Kohli and Sukumar 1990, Dobson and Kalish 1988, 1993, Balakrishnan and Jacob 1996, Chen and Hausman 2000). These models are directed at determining optimal product concepts using consumers' idiosyncratic or segment level part-worth preference functions estimated previously within a conjoint framework. Recently, Balakrishnan and Jacob (1996) have proposed the use of Genetic Algorithms (GA) to solve the problem of identifying a share maximizing single product design using conjoint data. In this paper, we follow Balakrishnan and Jacob's idea and employ and evaluate the GA approach with regard to the problem of optimal product line design. Similar to the approaches of Kohli and Sukumar (1990) and Nair et al. (1995), product lines are constructed directly from part-worths data obtained by conjoint analysis, which can be characterized as a one-step approach to product line design. In contrast, a two-step approach would start by first reducing the total set of feasible product profiles to a smaller set of promising items (reference set of candidate items) from which the products that constitute a product line are selected in a second step. Two-step approaches or partial models for either the first or second stage in this context have been proposed by Green and Krieger (1985, 1987a, 1987b, 1989), McBride and Zufryden (1988), Dobson and Kalish (1988, 1993) and, more recently, by Chen and Hausman (2000). Heretofore, with the only exception of Chen and Hausman's (2000) probabilistic model, all contributors to the literature on conjoint-based product line design have employed a deterministic, first-choice model of idiosyncratic preferences. Accordingly, a consumer is assumed to choose from her/his choice set the product with maximum perceived utility with certainty. However, the first choice rule seems to be an assumption too rigid for many product categories and individual choice situations, as the analyst often won't be in a position to control for all relevant variables influencing consumer behavior (e.g., situational factors). Therefore, in agreement with Chen and Hausman (2000), we incorporate a probabilistic choice rule to provide a more flexible representation of the consumer decision making process and start from segment-specific conjoint models of the conditional multinomial logit type. Favoring the multinomial logit model doesn't imply rejection of the widespread max-utility rule, as the MNL includes the option of mimicking this first choice rule. We further consider profit as a firm's economic criterion to evaluate decisions and introduce fixed and variable costs for each product profile. However, the proposed methodology is flexible enough to accomodate for other goals like market share (as well as for any other probabilistic choice rule). This model flexibility is provided by the implemented Genetic Algorithm as the underlying solver for the resulting nonlinear integer programming problem. Genetic Algorithms merely use objective function information (in the present context on expected profits of feasible product line solutions) and are easily adjustable to different objectives without the need for major algorithmic modifications. To assess the performance of the GA methodology for the product line design problem, we employ sensitivity analysis and Monte Carlo simulation. Sensitivity analysis is carried out to study the performance of the Genetic Algorithm w.r.t. varying GA parameter values (population size, crossover probability, mutation rate) and to finetune these values in order to provide near optimal solutions. Based on more than 1500 sensitivity runs applied to different problem sizes ranging from 12.650 to 10.586.800 feasible product line candidate solutions, we can recommend: (a) as expected, that a larger problem size be accompanied by a larger population size, with a minimum popsize of 130 for small problems and a minimum popsize of 250 for large problems, (b) a crossover probability of at least 0.9 and (c) an unexpectedly high mutation rate of 0.05 for small/medium-sized problems and a mutation rate in the order of 0.01 for large problem sizes. Following the results of the sensitivity analysis, we evaluated the GA performance for a large set of systematically varying market scenarios and associated problem sizes. We generated problems using a 4-factorial experimental design which varied by the number of attributes, number of levels in each attribute, number of items to be introduced by a new seller and number of competing firms except the new seller. The results of the Monte Carlo study with a total of 276 data sets that were analyzed show that the GA works efficiently in both providing near optimal product line solutions and CPU time. Particularly, (a) the worst-case performance ratio of the GA observed in a single run was 96.66%, indicating that the profit of the best product line solution found by the GA was never less than 96.66% of the profit of the optimal product line, (b) the hit ratio of identifying the optimal solution was 84.78% (234 out of 276 cases) and (c) it tooks at most 30 seconds for the GA to converge. Considering the option of Genetic Algorithms for repeated runs with (slightly) changed parameter settings and/or different initial populations (as opposed to many other heuristics) further improves the chances of finding the optimal solution.
Learning Computer Programs with the Bayesian Optimization Algorithm
The hierarchical Bayesian Optimization Algorithm (hBOA) [24, 25] learns bit-strings by constructing explicit centralized models of a population and using them to generate new instances. This thesis is concerned with extending hBOA to learning open-ended program trees. The new system, BOA programming (BOAP), improves on previous probabilistic model building GP systems (PMBGPs) in terms of the expressiveness and open-ended ïŹexibility of the models learned, and hence control over the distribution of individuals generated. BOAP is studied empirically on a toy problem (learning linear functions) in various conïŹgurations, and further experimental results are presented for two real-world problems: prediction of sunspot time series, and human gene function inference
Recommended from our members
Combinatorial optimization and metaheuristics
Today, combinatorial optimization is one of the youngest and most active areas of discrete mathematics. It is a branch of optimization in applied mathematics and computer science, related to operational research, algorithm theory and computational complexity theory. It sits at the intersection of several fields, including artificial intelligence, mathematics and software engineering. Its increasing interest arises for the fact that a large number of scientific and industrial problems can be formulated as abstract combinatorial optimization problems, through graphs and/or (integer) linear programs. Some of these problems have polynomial-time (âefficientâ) algorithms, while most of them are NP-hard, i.e. it is not proved that they can be solved in polynomial-time. Mainly, it means that it is not possible to guarantee that an exact solution to the problem can be found and one has to settle for an approximate solution with known performance guarantees. Indeed, the goal of approximate methods is to find âquicklyâ (reasonable run-times), with âhighâ probability, provable âgoodâ solutions (low error from the real optimal solution). In the last 20 years, a new kind of algorithm commonly called metaheuristics have emerged in this class, which basically try to combine heuristics in high level frameworks aimed at efficiently and effectively exploring the search space. This report briefly outlines the components, concepts, advantages and disadvantages of different metaheuristic approaches from a conceptual point of view, in order to analyze their similarities and differences. The two very significant forces of intensification and diversification, that mainly determine the behavior of a metaheuristic, will be pointed out. The report concludes by exploring the importance of hybridization and integration methods
Recommended from our members
Automatic Generation of Cognitive Theories using Genetic Programming
Cognitive neuroscience is the branch of neuroscience that studies the neural mechanisms underpinning cognition and develops theories explaining them. Within cognitive neuroscience, computational neuroscience focuses on modeling behavior, using theories expressed as computer programs. Up to now, computational theories have been formulated by neuroscientists. In this paper, we present a new approach to theory development in neuroscience: the automatic generation and testing of cognitive theories using genetic programming. Our approach evolves from experimental data cognitive theories that explain âthe mental programâ that subjects use to solve a specific task. As an example, we have focused on a typical neuroscience experiment, the delayed-match-to-sample (DMTS) task. The main goal of our approach is to develop a tool that neuroscientists can use to develop better cognitive theories
- âŠ