Search CORE

12 research outputs found

Significance-based Estimation-of-Distribution Algorithms

Author: Doerr Benjamin
Krejca Martin
Publication venue
Publication date: 11/10/2018
Field of study

Estimation-of-distribution algorithms (EDAs) are randomized search heuristics that maintain a probabilistic model of the solution space. This model is updated from iteration to iteration, based on the quality of the solutions sampled according to the model. As previous works show, this short-term perspective can lead to erratic updates of the model, in particular, to bit-frequencies approaching a random boundary value. Such frequencies take long to be moved back to the middle range, leading to significant performance losses. In order to overcome this problem, we propose a new EDA based on the classic compact genetic algorithm (cGA) that takes into account a longer history of samples and updates its model only with respect to information which it classifies as statistically significant. We prove that this significance-based compact genetic algorithm (sig-cGA) optimizes the commonly regarded benchmark functions OneMax, LeadingOnes, and BinVal all in

O(n\log n)

time, a result shown for no other EDA or evolutionary algorithm so far. For the recently proposed scGA -- an EDA that tries to prevent erratic model updates by imposing a bias to the uniformly distributed model -- we prove that it optimizes OneMax only in a time exponential in the hypothetical population size

1/\rho

. Similarly, we show that the convex search algorithm cannot optimize OneMax in polynomial time

arXiv.org e-Print Archive

HAL-Polytechnique

From Understanding Genetic Drift to a Smart-Restart Parameter-less Compact Genetic Algorithm

Author: Brian
Böttcher Süntje
Cláudio
Doerr Benjamin
Doerr Benjamin
Doerr Benjamin
Friedrich Tobias
Georges
Krejca Martin
Mühlenbein Heinz
Mühlenbein Heinz
Publication venue
Publication date: 08/07/2020
Field of study

One of the key difficulties in using estimation-of-distribution algorithms is choosing the population size(s) appropriately: Too small values lead to genetic drift, which can cause enormous difficulties. In the regime with no genetic drift, however, often the runtime is roughly proportional to the population size, which renders large population sizes inefficient. Based on a recent quantitative analysis which population sizes lead to genetic drift, we propose a parameter-less version of the compact genetic algorithm that automatically finds a suitable population size without spending too much time in situations unfavorable due to genetic drift. We prove a mathematical runtime guarantee for this algorithm and conduct an extensive experimental analysis on four classic benchmark problems both without and with additive centered Gaussian posterior noise. The former shows that under a natural assumption, our algorithm has a performance very similar to the one obtainable from the best problem-specific population size. The latter confirms that missing the right population size in the original cGA can be detrimental and that previous theory-based suggestions for the population size can be far away from the right values; it also shows that our algorithm as well as a previously proposed parameter-less variant of the cGA based on parallel runs avoid such pitfalls. Comparing the two parameter-less approaches, ours profits from its ability to abort runs which are likely to be stuck in a genetic drift situation.Comment: 4 figures. Extended version of a paper appearing at GECCO 202

arXiv.org e-Print Archive

Crossref

HAL-Polytechnique

A Tight Runtime Analysis for the cGA on Jump Functions---EDAs Can Cross Fitness Valleys at No Extra Cost

Author: Antipov Denis
Corus Dogan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

We prove that the compact genetic algorithm (cGA) with hypothetical population size

\mu = \Omega(\sqrt n \log n) \cap \text{poly}(n)

with high probability finds the optimum of any

n

-dimensional jump function with jump size

k < \frac 1 {20} \ln n

O(\mu \sqrt n)

iterations. Since it is known that the cGA with high probability needs at least

\Omega(\mu \sqrt n + n \log n)

iterations to optimize the unimodal OneMax function, our result shows that the cGA in contrast to most classic evolutionary algorithms here is able to cross moderate-sized valleys of low fitness at no extra cost. Our runtime guarantee improves over the recent upper bound

O(\mu n^{1.5} \log n)

valid for

\mu = \Omega(n^{3.5+\varepsilon})

of Hasen\"ohrl and Sutton (GECCO 2018). For the best choice of the hypothetical population size, this result gives a runtime guarantee of

O(n^{5+\varepsilon})

, whereas ours gives

O(n \log n)

. We also provide a simple general method based on parallel runs that, under mild conditions, (i)~overcomes the need to specify a suitable population size, but gives a performance close to the one stemming from the best-possible population size, and (ii)~transforms EDAs with high-probability performance guarantees into EDAs with similar bounds on the expected runtime.Comment: 25 pages, full version of a paper to appear at GECCO 201

arXiv.org e-Print Archive

Crossref

HAL-Polytechnique

Self-Adjusting Evolutionary Algorithms for Multimodal Optimization

Author: Jonathan
Lissovoi Andrei
Wegener Ingo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 02/06/2020
Field of study

Recent theoretical research has shown that self-adjusting and self-adaptive mechanisms can provably outperform static settings in evolutionary algorithms for binary search spaces. However, the vast majority of these studies focuses on unimodal functions which do not require the algorithm to flip several bits simultaneously to make progress. In fact, existing self-adjusting algorithms are not designed to detect local optima and do not have any obvious benefit to cross large Hamming gaps. We suggest a mechanism called stagnation detection that can be added as a module to existing evolutionary algorithms (both with and without prior self-adjusting algorithms). Added to a simple (1+1) EA, we prove an expected runtime on the well-known Jump benchmark that corresponds to an asymptotically optimal parameter setting and outperforms other mechanisms for multimodal optimization like heavy-tailed mutation. We also investigate the module in the context of a self-adjusting (1+

\lambda

) EA and show that it combines the previous benefits of this algorithm on unimodal problems with more efficient multimodal optimization. To explore the limitations of the approach, we additionally present an example where both self-adjusting mechanisms, including stagnation detection, do not help to find a beneficial setting of the mutation rate. Finally, we investigate our module for stagnation detection experimentally.Comment: 26 pages. Full version of a paper appearing at GECCO 202

arXiv.org e-Print Archive

Crossref

Online Research Database In Technology

On the limitations of the univariate marginal distribution algorithm to deception and where bivariate EDAs might help

Author: Lehre Per Kristian
Nguyen Phan Trung Hai
Publication venue: Association for Computing Machinery (ACM)
Publication date: 29/07/2019
Field of study

We introduce a new benchmark problem called Deceptive Leading Blocks (DLB) to rigorously study the runtime of the Univariate Marginal Distribution Algorithm (UMDA) in the presence of epistasis and deception. We show that simple Evolutionary Algorithms (EAs) outperform the UMDA unless the selective pressure

\mu/\lambda

is extremely high, where

\mu

and

\lambda

are the parent and offspring population sizes, respectively. More precisely, we show that the UMDA with a parent population size of

\mu=\Omega(\log n)

has an expected runtime of

e^{\Omega(\mu)}

on the DLB problem assuming any selective pressure

\frac{\mu}{\lambda} \geq \frac{14}{1000}

, as opposed to the expected runtime of

\mathcal{O}(n\lambda\log \lambda+n^3)

for the non-elitist

(\mu,\lambda)~\text{EA}

with

\mu/\lambda\leq 1/e

. These results illustrate inherent limitations of univariate EDAs against deception and epistasis, which are common characteristics of real-world problems. In contrast, empirical evidence reveals the efficiency of the bi-variate MIMIC algorithm on the DLB problem. Our results suggest that one should consider EDAs with more complex probabilistic models when optimising problems with some degree of epistasis and deception.Comment: To appear in the 15th ACM/SIGEVO Workshop on Foundations of Genetic Algorithms (FOGA XV), Potsdam, German

arXiv.org e-Print Archive

University of Birmingham Research Portal

Level-Based Analysis of the Univariate Marginal Distribution Algorithm

Author: Dang Duc-Cuong
Lehre Per Kristian
Nguyen Phan Trung Hai
Publication venue
Publication date: 26/07/2018
Field of study

Estimation of Distribution Algorithms (EDAs) are stochastic heuristics that search for optimal solutions by learning and sampling from probabilistic models. Despite their popularity in real-world applications, there is little rigorous understanding of their performance. Even for the Univariate Marginal Distribution Algorithm (UMDA) -- a simple population-based EDA assuming independence between decision variables -- the optimisation time on the linear problem OneMax was until recently undetermined. The incomplete theoretical understanding of EDAs is mainly due to lack of appropriate analytical tools. We show that the recently developed level-based theorem for non-elitist populations combined with anti-concentration results yield upper bounds on the expected optimisation time of the UMDA. This approach results in the bound

\mathcal{O}(n\lambda\log \lambda+n^2)

on two problems, LeadingOnes and BinVal, for population sizes

\lambda>\mu=\Omega(\log n)

, where

\mu

and

\lambda

are parameters of the algorithm. We also prove that the UMDA with population sizes

\mu\in \mathcal{O}(\sqrt{n}) \cap \Omega(\log n)

optimises OneMax in expected time

\mathcal{O}(\lambda n)

, and for larger population sizes

\mu=\Omega(\sqrt{n}\log n)

, in expected time

\mathcal{O}(\lambda\sqrt{n})

. The facility and generality of our arguments suggest that this is a promising approach to derive bounds on the expected optimisation time of EDAs.Comment: To appear in Algorithmica Journa

arXiv.org e-Print Archive

University of Birmingham Research Portal

Level-based analysis of the univariate marginal distribution algorithm

Author: Dang Duc-Cuong
Lehre Per Kristian
Nguyen Phan Trung Hai
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/02/2019
Field of study

University of Birmingham Research Portal

Simple hyper-heuristics control the neighbourhood size of randomised local search optimally for LeadingOnes

Author: Lissovoi A.
Oliveto P.
Warwicker J.A.
Publication venue: 'MIT Press - Journals'
Publication date: 01/09/2020
Field of study

Selection hyper-heuristics (HHs) are randomised search methodologies which choose and execute heuristics during the optimisation process from a set of low-level heuristics. A machine learning mechanism is generally used to decide which low-level heuristic should be applied in each decision step. In this paper we analyse whether sophisticated learning mechanisms are always necessary for HHs to perform well. To this end we consider the most simple HHs from the literature and rigorously analyse their performance for the LeadingOnes benchmark function. Our analysis shows that the standard Simple Random, Permutation, Greedy and Random Gradient HHs show no signs of learning. While the former HHs do not attempt to learn from the past performance of low-level heuristics, the idea behind the Random Gradient HH is to continue to exploit the currently selected heuristic as long as it is successful. Hence, it is embedded with a reinforcement learning mechanism with the shortest possible memory. However, the probability that a promising heuristic is successful in the next step is relatively low when perturbing a reasonable solution to a combinatorial optimisation problem. We generalise the `simple' Random Gradient HH so success can be measured over a fixed period of time τ, instead of a single iteration. For LeadingOnes we prove that the Generalised Random Gradient (GRG) HH can learn to adapt the neighbourhood size of Randomised Local Search to optimality during the run. As a result, we prove it has the best possible performance achievable with the low-level heuristics (Randomised Local Search with different neighbourhood sizes), up to lower order terms. We also prove that the performance of the HH improves as the number of low-level local search heuristics to choose from increases. In particular, with access to k low-level local search heuristics, it outperforms the best-possible algorithm using any subset of the k heuristics. Finally, we show that the advantages of GRG over Randomised Local Search and Evolutionary Algorithms using standard bit mutation increase if the anytime performance is considered (i.e., the performance gap is larger if approximate solutions are sought rather than exact ones). Experimental analyses confirm these results for different problem sizes (up to n = 108) and shed some light on the best choices for the parameter τ in various situations

White Rose Research Online