39 research outputs found

    On the limitations of the univariate marginal distribution algorithm to deception and where bivariate EDAs might help

    We introduce a new benchmark problem called Deceptive Leading Blocks (DLB) to rigorously study the runtime of the Univariate Marginal Distribution Algorithm (UMDA) in the presence of epistasis and deception. We show that simple Evolutionary Algorithms (EAs) outperform the UMDA unless the selective pressure is extremely high (i.e., the ratio $\mu/\lambda$ is extremely small), where $\mu$ and $\lambda$ are the parent and offspring population sizes, respectively. More precisely, we show that the UMDA with a parent population size of $\mu = \Omega(\log n)$ has an expected runtime of $e^{\Omega(\mu)}$ on the DLB problem assuming any selective pressure $\mu/\lambda \geq 14/1000$, as opposed to the expected runtime of $\mathcal{O}(n\lambda\log\lambda + n^3)$ for the non-elitist $(\mu,\lambda)$ EA with $\mu/\lambda \leq 1/e$. These results illustrate inherent limitations of univariate EDAs against deception and epistasis, which are common characteristics of real-world problems. In contrast, empirical evidence reveals the efficiency of the bivariate MIMIC algorithm on the DLB problem. Our results suggest that one should consider EDAs with more complex probabilistic models when optimising problems with some degree of epistasis and deception.
    Comment: To appear in the 15th ACM/SIGEVO Workshop on Foundations of Genetic Algorithms (FOGA XV), Potsdam, Germany.
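
    To make the objects of study concrete, below is a minimal Python sketch of the DLB function and a textbook UMDA. The block scoring in `dlb` is a hedged reconstruction of the definition summarised above (each leading 11-block scores 2, and the first other block deceptively rewards 00 over 01/10); the margin convention and parameter defaults in `umda` are standard assumptions, not details quoted from the paper.

```python
import random

def dlb(x):
    """Hedged reconstruction of Deceptive Leading Blocks (DLB): bits are
    paired into blocks; each leading 11-block scores 2, and the first
    non-11 block is deceptive, scoring 1 for 00 but 0 for 01/10, which
    pulls frequency-based models away from the all-ones optimum."""
    n = len(x)
    m = 0
    while 2 * m < n and x[2 * m] == 1 and x[2 * m + 1] == 1:
        m += 1
    if 2 * m == n:                       # all blocks are 11: global optimum
        return n
    critical = (x[2 * m], x[2 * m + 1])  # first block that is not 11
    return 2 * m + (1 if critical == (0, 0) else 0)

def umda(fitness, n, lam, mu, max_gens=10_000):
    """Textbook UMDA: sample lam offspring from independent marginals,
    select the mu fittest, and set each marginal to the frequency of 1s
    among them, clamped to the usual margins [1/n, 1 - 1/n]."""
    p = [0.5] * n
    for _ in range(max_gens):
        pop = [[int(random.random() < p[i]) for i in range(n)]
               for _ in range(lam)]
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == n:         # DLB's optimum has fitness n
            return pop[0]
        for i in range(n):
            freq = sum(x[i] for x in pop[:mu]) / mu
            p[i] = min(max(freq, 1 / n), 1 - 1 / n)
    return None
```

    A call such as `umda(dlb, n=20, lam=200, mu=20)` runs the algorithm at selective pressure $\mu/\lambda = 0.1$, well inside the regime the lower bound above speaks about.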

    The Univariate Marginal Distribution Algorithm Copes Well With Deception and Epistasis

    In their recent work, Lehre and Nguyen (FOGA 2019) show that the univariate marginal distribution algorithm (UMDA) needs time exponential in the parent population size to optimize the DeceptiveLeadingBlocks (DLB) problem. They conclude from this result that univariate EDAs have difficulties with deception and epistasis. In this work, we show that this negative finding is caused by an unfortunate choice of the parameters of the UMDA. When the population sizes are chosen large enough to prevent genetic drift, then the UMDA optimizes the DLB problem with high probability with at most $\lambda(\frac{n}{2} + 2e\ln n)$ fitness evaluations. Since an offspring population size $\lambda$ of order $n \log n$ can prevent genetic drift, the UMDA can solve the DLB problem with $O(n^2 \log n)$ fitness evaluations. In contrast, for classic evolutionary algorithms no better runtime guarantee than $O(n^3)$ is known (which we prove to be tight for the $(1+1)$ EA), so our result rather suggests that the UMDA can cope well with deception and epistasis. From a broader perspective, our result shows that the UMDA can cope better with local optima than evolutionary algorithms; such a result was previously known only for the compact genetic algorithm. Together with the lower bound of Lehre and Nguyen, our result for the first time rigorously proves that running EDAs in the regime with genetic drift can lead to drastic performance losses.
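
    The quoted bounds compose in one line: an offspring population size of order $n \log n$ prevents genetic drift, and plugging it into the evaluation bound $\lambda(\frac{n}{2} + 2e\ln n)$ gives the stated $O(n^2 \log n)$. The sketch below merely instantiates these two formulas from the abstract; the constant `c` in the population size is an assumed illustration.

```python
import math

def drift_safe_lambda(n, c=4.0):
    """Offspring population size of order n*log(n); the abstract states
    this order prevents genetic drift (the constant c is assumed)."""
    return math.ceil(c * n * math.log(n))

def umda_dlb_budget(n, lam):
    """The abstract's bound lam * (n/2 + 2*e*ln(n)) on the number of
    fitness evaluations the UMDA needs on DLB without genetic drift."""
    return lam * (n / 2 + 2 * math.e * math.log(n))
```

    For example, `umda_dlb_budget(100, drift_safe_lambda(100))` evaluates the concrete budget that the $O(n^2 \log n)$ statement summarises for $n = 100$.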

    A review on probabilistic graphical models in evolutionary computation

    Thanks to their inherent properties, probabilistic graphical models are one of the prime candidates for machine-learning and decision-making tasks, especially in uncertain domains. Their capabilities, like representation, inference and learning, if used effectively, can greatly help to build intelligent systems that are able to act accordingly in different problem domains. Evolutionary computation is one such discipline that has employed probabilistic graphical models to improve the search for optimal solutions in complex problems. This paper shows how probabilistic graphical models have been used in evolutionary algorithms to improve their performance in solving complex problems. Specifically, we give a survey of probabilistic model-building evolutionary algorithms, called estimation of distribution algorithms, and compare different methods for probabilistic modeling in these algorithms.
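
    The recurring loop this survey describes (select, build a probabilistic model, sample) fits a small generic skeleton in which the model sits behind two callables, so a univariate product distribution, a Bayesian network, or a Markov network can be plugged in. The interface below is hypothetical, meant only to show where the graphical model enters the algorithm.

```python
def eda(fit_model, sample_model, fitness, lam, mu, gens):
    """Generic EDA skeleton. `model` is None on the first generation,
    in which case sample_model is expected to sample uniformly at
    random; thereafter it samples the learned graphical model."""
    model, best = None, None
    for _ in range(gens):
        pop = [sample_model(model) for _ in range(lam)]
        pop.sort(key=fitness, reverse=True)
        if best is None or fitness(pop[0]) > fitness(best):
            best = pop[0]                 # keep best-so-far for reporting
        model = fit_model(pop[:mu])       # learn a model from the mu fittest
    return best
```

    The design point the survey highlights is exactly this `fit_model` slot: richer graphical models capture more variable dependencies at a higher model-learning cost.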

    From Understanding Genetic Drift to a Smart-Restart Parameter-less Compact Genetic Algorithm

    One of the key difficulties in using estimation-of-distribution algorithms is choosing the population size(s) appropriately: too small values lead to genetic drift, which can cause enormous difficulties. In the regime with no genetic drift, however, the runtime is often roughly proportional to the population size, which renders large population sizes inefficient. Based on a recent quantitative analysis of which population sizes lead to genetic drift, we propose a parameter-less version of the compact genetic algorithm that automatically finds a suitable population size without spending too much time in situations unfavorable due to genetic drift. We prove a mathematical runtime guarantee for this algorithm and conduct an extensive experimental analysis on four classic benchmark problems, both without and with additive centered Gaussian posterior noise. The former shows that under a natural assumption, our algorithm has a performance very similar to the one obtainable from the best problem-specific population size. The latter confirms that missing the right population size in the original cGA can be detrimental and that previous theory-based suggestions for the population size can be far away from the right values; it also shows that our algorithm, as well as a previously proposed parameter-less variant of the cGA based on parallel runs, avoids such pitfalls. Comparing the two parameter-less approaches, ours profits from its ability to abort runs which are likely to be stuck in a genetic drift situation.
    Comment: 4 figures. Extended version of a paper appearing at GECCO 2020.
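
    A hedged sketch of the smart-restart idea described above: run the cGA with a hypothetical population size `K` only for a budget motivated by the genetic-drift analysis, and double `K` when that budget is exhausted, so little time is wasted in runs doomed by drift. The `K**2`-style budget and all constants are assumptions for illustration, not the paper's exact schedule; the inner cGA is a standard implementation, shown here on OneMax.

```python
import random

def cga_onemax(n, K, max_iters):
    """Standard compact GA on OneMax with (assumed) margins 1/n, 1 - 1/n."""
    p = [0.5] * n
    for _ in range(max_iters):
        x = [int(random.random() < q) for q in p]
        y = [int(random.random() < q) for q in p]
        if sum(y) > sum(x):
            x, y = y, x                   # x is the tournament winner
        for i in range(n):
            if x[i] != y[i]:              # shift frequency towards the winner
                step = 1 / K if x[i] == 1 else -1 / K
                p[i] = min(max(p[i] + step, 1 / n), 1 - 1 / n)
        if all(q >= 1 - 1 / n for q in p):
            return p                      # model has converged on all-ones
    return None

def smart_restart_cga(n, k0=8, k_max=1 << 16, budget_factor=4.0):
    """Abort and double K whenever a run exhausts its drift-motivated
    budget, instead of letting it linger in a genetic-drift regime."""
    K = k0
    while K <= k_max:
        result = cga_onemax(n, K, max_iters=int(budget_factor * K * K))
        if result is not None:
            return result, K
        K *= 2
    return None, K
```

    Doubling `K` means the total work is dominated by the last (successful) run, which is what lets the wrapper approach the performance of the best problem-specific population size.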

    Runtime analyses of univariate estimation of distribution algorithms under linearity, epistasis and deception

    Estimation of distribution algorithms (EDAs) have been successfully applied to solve many real-world optimisation problems. The algorithms work by building and maintaining probabilistic models over the search space and are widely considered a generalisation of the evolutionary algorithms (EAs). While the theory of EAs has been enriched significantly over the last decades, our understanding of EDAs is sparse and limited. The past few years have seen some progress on this topic, showing competitive performance compared to other EAs on some simple test functions. This thesis studies the so-called univariate EDAs by rigorously analysing their time complexities on different fitness landscapes. Firstly, I show that the algorithms optimise the ONEMAX function as efficiently as the (1+1) EA does. I then investigate the algorithms' ability to cope with dependencies among decision variables. Despite the independence assumption, the algorithms optimise LEADINGONES – a test function with an epistasis level of n − 1 – using at most O(n^2) function evaluations under appropriate parameter settings. I also show that if the selection rate μ/λ is above some constant threshold, an exponential runtime is inevitable to optimise the function. Finally, I confirm the common belief that univariate EDAs have difficulties optimising some objective functions when deception occurs. By introducing a new test function with a very mild degree of deception, I show that the UMDA takes an exponential runtime unless the selection rate is chosen extremely high, i.e., μ/λ = O(1/μ). This thesis demonstrates that while univariate EDAs may cope well with independence and epistasis in the environment, the algorithms suffer even at a mild level of deception, and that researchers might need to adopt multivariate EDAs when facing deceptive objective functions.
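
    For reference, the two non-deceptive benchmarks the thesis analyses are simple to state in code (the deceptive DLB function is sketched earlier in this list); the epistasis level of n − 1 in LEADINGONES is visible in the early `break`.

```python
def onemax(x):
    """ONEMAX: fitness is the number of 1-bits; fully separable, no epistasis."""
    return sum(x)

def leadingones(x):
    """LEADINGONES: length of the leading all-ones prefix. Bit i only
    contributes when all earlier bits are 1, so every bit depends on
    all of its predecessors (epistasis level n - 1)."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count
```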

    Substructural local search in discrete estimation of distribution algorithms

    Doctoral thesis, Electronic Engineering and Computing, Universidade do Algarve, 2009. Grant SFRH/BD/16980/2004.
    The last decade has seen the rise and consolidation of a new trend of stochastic optimizers known as estimation of distribution algorithms (EDAs). In essence, EDAs build probabilistic models of promising solutions and sample from the corresponding probability distributions to obtain new solutions. This approach has brought a new view to evolutionary computation because, while solving a given problem with an EDA, the user has access to a set of models that reveal probabilistic dependencies between variables, an important source of information about the problem. This dissertation proposes the integration of substructural local search (SLS) in EDAs to speed up convergence to optimal solutions. Substructural neighborhoods are defined by the structure of the probabilistic models used in EDAs, generating adaptive neighborhoods capable of automatic discovery and exploitation of problem regularities. Specifically, the thesis focuses on the extended compact genetic algorithm and the Bayesian optimization algorithm. The utility of SLS in EDAs is investigated for a number of boundedly difficult problems with modularity, overlapping, and hierarchy, while considering important aspects such as scaling and noise. The results show that SLS can substantially reduce the number of function evaluations required to solve some of these problems. More importantly, the speedups obtained can scale up to the square root of the problem size, O(√ℓ). Fundação para a Ciência e Tecnologia (FCT).
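
    A minimal sketch of the substructural idea, assuming the model structure is available as a list of variable groups (e.g., the linkage groups of an eCGA marginal product model): each group is jointly set to its best assignment while the rest of the solution stays fixed. This is an illustrative rendering of the neighborhood, not the thesis's exact procedure.

```python
from itertools import product

def substructural_local_search(x, fitness, groups):
    """Best-improvement local search in substructural neighborhoods:
    for each variable group taken from the probabilistic model's
    structure, enumerate all assignments to that group and commit the
    fittest one, leaving all other variables fixed."""
    x = list(x)
    for group in groups:
        best_val, best_assign = None, None
        for assign in product((0, 1), repeat=len(group)):
            for idx, bit in zip(group, assign):
                x[idx] = bit
            val = fitness(x)
            if best_val is None or val > best_val:
                best_val, best_assign = val, assign
        for idx, bit in zip(group, best_assign):  # commit the best assignment
            x[idx] = bit
    return x
```

    For a separable problem whose groups match the model structure (say, concatenated k-bit trap blocks), one pass costs 2^k evaluations per block and solves each block exactly; savings of this kind are what the reported speedups quantify.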

    The role of Walsh structure and ordinal linkage in the optimisation of pseudo-Boolean functions under monotonicity invariance.

    Optimisation heuristics rely on implicit or explicit assumptions about the structure of the black-box fitness function they optimise. A review of the literature shows that an understanding of structure and linkage is helpful to the design and analysis of heuristics. The aim of this thesis is to investigate the role that problem structure plays in heuristic optimisation. Many heuristics use ordinal operators, i.e., operators that are invariant under monotonic transformations of the fitness function. In this thesis we develop a classification of pseudo-Boolean functions based on rank-invariance. This approach classifies functions which are monotonic transformations of one another as equivalent, and so partitions an infinite set of functions into a finite set of classes. Reasoning about heuristics composed of ordinal operators is, by construction, invariant over these classes. We perform a complete analysis of 2-bit and 3-bit pseudo-Boolean functions. We use Walsh analysis to define concepts of necessary, unnecessary, and conditionally necessary interactions, and of Walsh families. This helps to make precise some existing ideas in the literature, such as benign interactions. Many algorithms are invariant under the classes we define, which allows us to examine the difficulty of pseudo-Boolean functions in terms of function classes. We analyse a range of ordinal selection operators for an EDA. Using a concept of directed ordinal linkage, we define precedence networks and precedence profiles to represent key algorithmic steps and their interdependency in terms of problem structure. The precedence profiles provide a measure of problem difficulty, relating problem structure to the algorithmic steps required for optimisation. This work develops insight into the relationship between function structure and problem difficulty for optimisation, which may be used to direct the development of novel algorithms. Concepts of structure are also used to construct easy and hard problems for a hill-climber.
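
    Walsh analysis, the tool the thesis uses to classify interactions, decomposes any pseudo-Boolean f as f(x) = Σ_S w_S · (−1)^(Σ_{i∈S} x_i). For the 2-bit and 3-bit functions analysed, a brute-force computation of the coefficients is perfectly feasible; the enumeration below is the standard construction, not code from the thesis.

```python
from itertools import product

def walsh_coefficients(f, n):
    """Brute-force Walsh decomposition for small n:
    w_S = 2**-n * sum_x f(x) * (-1)**(sum of x_i for i in S).
    A nonzero w_S with |S| >= 2 signals an interaction among the
    variables in S."""
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for mask in points:                   # mask is the indicator vector of S
        S = tuple(i for i in range(n) if mask[i])
        total = sum(f(x) * (-1) ** sum(x[i] for i in S) for x in points)
        coeffs[S] = total / len(points)
    return coeffs
```

    For example, `walsh_coefficients(lambda x: x[0] and x[1], 2)` returns a nonzero coefficient for S = (0, 1), the interaction that makes AND non-linear; a monotonic transformation of AND changes the coefficient values but not which interactions matter for rank-based reasoning.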

    Adaptive scaling of evolvable systems

    Neo-Darwinian evolution is an established natural inspiration for computational optimisation with a diverse range of forms. A particular feature of models such as Genetic Algorithms (GA) [18, 12] is the incremental combination of partial solutions distributed within a population of solutions. This mechanism in principle allows certain problems to be solved which would not be amenable to simple local search. Such problems require these partial solutions, generally known as building-blocks, to be handled without disruption. The traditional means for this is a combination of a suitable chromosome ordering with a sympathetic recombination operator. More advanced algorithms attempt to adapt to accommodate these dependencies during the search. The recent approach of Estimation of Distribution Algorithms (EDA) aims to directly infer a probabilistic model of a promising population distribution from a sample of fitter solutions [23]. This model is then sampled to generate a new solution set. A symbiotic view of evolution is behind the recent development of Compositional Search Evolutionary Algorithms (CSEA) [49, 19, 8], which build up an incremental model of variable dependencies conditional on a series of tests. Building-blocks are retained as explicit genetic structures and conditionally joined to form higher-order structures. These have been shown to be effective on special classes of hierarchical problems but are unproven on less tightly-structured problems. We propose that there exists a simple yet powerful combination of the above approaches: the persistent, adapting dependency model of a compositional pool with the expressive and compact variable weighting of probabilistic models. We review and deconstruct some of the key methods above to determine their individual drawbacks and their common principles. By this reasoned approach we aim to arrive at a unifying framework that can adaptively scale to span a range of problem structure classes. This is implemented in a novel algorithm called the Transitional Evolutionary Algorithm (TEA), which is empirically validated in an incremental manner, verifying its various facets and comparing it with related algorithms on an increasingly structured series of benchmark problems. This prompts some refinements, resulting in a simple and general algorithm that is nevertheless competitive with state-of-the-art methods.

    Multivariate Markov networks for fitness modelling in an estimation of distribution algorithm.

    A well-known paradigm for optimisation is the evolutionary algorithm (EA). An EA maintains a population of possible solutions to a problem which converges on a global optimum using biologically-inspired selection and reproduction operators. These algorithms have been shown to perform well on a variety of hard optimisation and search problems. A recent development in evolutionary computation is the Estimation of Distribution Algorithm (EDA), which replaces the traditional genetic reproduction operators (crossover and mutation) with the construction and sampling of a probabilistic model. While this can often represent a significant computational expense, the benefit is that the model contains explicit information about the fitness function. This thesis expands on recent work using a Markov network to model fitness in an EDA, resulting in what we call the Markov Fitness Model (MFM). The work has explored the theoretical foundations of the MFM approach, which are grounded in Walsh analysis of fitness functions. This has allowed us to demonstrate a clear relationship between the fitness model and the underlying dynamics of the problem. A key achievement is that we have been able to show how the model can be used to predict fitness, and we have devised a measure of fitness modelling capability called the fitness prediction correlation (FPC). We have performed a series of experiments which use the FPC to investigate the effect of population size and selection operator on the fitness modelling capability. The results and analysis of these experiments are an important addition to other work on diversity and fitness distribution within populations. With this improved understanding of fitness modelling we have been able to extend the Distribution Estimation Using Markov networks (DEUM) framework to use a multivariate probabilistic model. We have proposed and demonstrated the performance of a number of algorithms based on this framework which leverage the MFM for optimisation, and which can now be added to the EA toolbox. As part of this we have investigated existing techniques for learning the structure of the MFM; a further contribution resulting from this is the introduction of precision and recall as measures of structure quality. We have also proposed a number of possible directions that future work could take.
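
    Read as a plain Pearson correlation between model-predicted and true fitness over a population, the fitness prediction correlation is straightforward to compute; the exact formula used in the thesis is not quoted in the abstract, so treat the sketch below as an assumed rendering of the measure.

```python
import math

def fitness_prediction_correlation(predicted, actual):
    """Pearson correlation between model-predicted and true fitness
    values over a population; values near 1 indicate a model that
    ranks solutions as the true fitness function does. Assumes the
    inputs are non-constant sequences of equal length."""
    n = len(predicted)
    mp = sum(predicted) / n
    ma = sum(actual) / n
    cov = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    sd_p = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    sd_a = math.sqrt(sum((a - ma) ** 2 for a in actual))
    return cov / (sd_p * sd_a)
```

    Scoring a population this way after each model-building step is how the experiments described above track the effect of population size and selection operator on fitness-modelling capability.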