Significance-based Estimation-of-Distribution Algorithms
Estimation-of-distribution algorithms (EDAs) are randomized search heuristics
that maintain a probabilistic model of the solution space. This model is
updated from iteration to iteration, based on the quality of the solutions
sampled according to the model. As previous works show, this short-term
perspective can lead to erratic updates of the model, in particular, to
bit-frequencies approaching a random boundary value. Such frequencies take a
long time to be moved back to the middle range, leading to significant
performance losses.
In order to overcome this problem, we propose a new EDA based on the classic
compact genetic algorithm (cGA) that takes into account a longer history of
samples and updates its model only with respect to information which it
classifies as statistically significant. We prove that this significance-based
compact genetic algorithm (sig-cGA) optimizes the commonly regarded benchmark
functions OneMax, LeadingOnes, and BinVal all in O(n log n) time, a result
shown for no other EDA or evolutionary algorithm so far.
For the recently proposed scGA -- an EDA that tries to prevent erratic model
updates by imposing a bias to the uniformly distributed model -- we prove that
it optimizes OneMax only in a time exponential in the hypothetical population
size. Similarly, we show that the convex search algorithm cannot
optimize OneMax in polynomial time.
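The model-update scheme described above can be sketched for the classic cGA on OneMax (a minimal illustration with the usual 1/n borders; function name and parameters are ours):

```python
import random

def cga_onemax(n, mu, max_iters=100000):
    """Minimal cGA: keep one frequency per bit, sample two solutions,
    and shift each differing frequency by 1/mu toward the better sample."""
    p = [0.5] * n                      # marginal probability of a 1 per bit
    for _ in range(max_iters):
        x = [int(random.random() < pi) for pi in p]
        y = [int(random.random() < pi) for pi in p]
        if sum(y) > sum(x):            # OneMax fitness: number of 1-bits
            x, y = y, x                # now x is the better sample
        for i in range(n):
            if x[i] != y[i]:
                step = 1 / mu if x[i] else -1 / mu
                p[i] = min(1 - 1 / n, max(1 / n, p[i] + step))  # usual borders
        if all(pi >= 1 - 1 / n for pi in p):
            break                      # model has (nearly) converged to 1...1
    return p
```

The sig-cGA described in the abstract differs from this sketch in that it keeps a longer history of sampled bit values per position and moves a frequency only when that history deviates statistically significantly from what the current model would produce.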
From Understanding Genetic Drift to a Smart-Restart Parameter-less Compact Genetic Algorithm
One of the key difficulties in using estimation-of-distribution algorithms is
choosing the population size(s) appropriately: Too small values lead to genetic
drift, which can cause enormous difficulties. In the regime with no genetic
drift, however, often the runtime is roughly proportional to the population
size, which renders large population sizes inefficient.
Based on a recent quantitative analysis of which population sizes lead to
genetic drift, we propose a parameter-less version of the compact genetic
algorithm that automatically finds a suitable population size without spending
too much time in situations unfavorable due to genetic drift.
We prove a mathematical runtime guarantee for this algorithm and conduct an
extensive experimental analysis on four classic benchmark problems both without
and with additive centered Gaussian posterior noise. The former shows that
under a natural assumption, our algorithm has a performance very similar to the
one obtainable from the best problem-specific population size. The latter
confirms that missing the right population size in the original cGA can be
detrimental and that previous theory-based suggestions for the population size
can be far away from the right values; it also shows that our algorithm as well
as a previously proposed parameter-less variant of the cGA based on parallel
runs avoid such pitfalls. Comparing the two parameter-less approaches, ours
profits from its ability to abort runs which are likely to be stuck in a
genetic drift situation.
Comment: 4 figures. Extended version of a paper appearing at GECCO 2020.
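The restart idea can be sketched independently of the underlying cGA (the doubling schedule and the quadratic budget below are illustrative assumptions of ours, not the paper's exact parameters):

```python
def smart_restart(run_with_budget, mu0=4):
    """Hypothetical parameter-less wrapper: try population sizes
    mu0, 2*mu0, 4*mu0, ..., giving each run an iteration budget that
    grows with mu and aborting once it is exceeded, since a cGA run
    derailed by genetic drift rarely recovers cheaply."""
    mu = mu0
    while True:
        budget = 16 * mu * mu        # illustrative budget schedule
        if run_with_budget(mu, budget):
            return mu                # smallest population size that succeeded
        mu *= 2
```

For instance, `smart_restart(lambda mu, budget: mu >= 32)` returns 32: with a trivially simulated run, the wrapper keeps doubling until the first population size that succeeds within its budget.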
A Tight Runtime Analysis for the cGA on Jump Functions---EDAs Can Cross Fitness Valleys at No Extra Cost
We prove that the compact genetic algorithm (cGA) with hypothetical
population size μ = Ω(√n log n), polynomially bounded in n, with high
probability finds the optimum of any n-dimensional jump function with jump
size k < (1/20) ln n in O(μ√n) iterations. Since it is known that the cGA
with high probability needs at least Ω(μ√n + n log n) iterations to optimize
the unimodal OneMax function, our result shows that the cGA, in contrast to
most classic evolutionary algorithms, here is able to cross moderate-sized
valleys of low fitness at no extra cost.
Our runtime guarantee improves over the recent upper bound of Hasenöhrl and
Sutton (GECCO 2018), which is valid only for larger population sizes. For the
best choice of the hypothetical population size, their result gives a
significantly larger polynomial runtime guarantee, whereas ours gives
O(n log n).
We also provide a simple general method based on parallel runs that, under
mild conditions, (i)~overcomes the need to specify a suitable population size,
but gives a performance close to the one stemming from the best-possible
population size, and (ii)~transforms EDAs with high-probability performance
guarantees into EDAs with similar bounds on the expected runtime.
Comment: 25 pages, full version of a paper to appear at GECCO 2019.
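One way to read the parallel-run method is as a round-robin scheduler over runs with exponentially growing population sizes. The interface below (each run is an iterator yielding None until it finds a solution) is our assumption for illustration, not the paper's formulation:

```python
def parallel_run_scheme(make_run, steps_per_round=1, max_rounds=64):
    """Round-robin over runs with population sizes 2, 4, 8, ...:
    each round launches the next larger run, then advances every
    active run by steps_per_round steps; the first success wins."""
    runs = []
    for round_no in range(1, max_rounds + 1):
        runs.append(make_run(2 ** round_no))  # launch next population size
        for run in runs:
            for _ in range(steps_per_round):
                result = next(run)
                if result is not None:        # this run found the optimum
                    return result
    return None
```

Because smaller population sizes are launched earlier, they receive more total steps, which roughly balances the work across population sizes without the user ever fixing one.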
Self-Adjusting Evolutionary Algorithms for Multimodal Optimization
Recent theoretical research has shown that self-adjusting and self-adaptive
mechanisms can provably outperform static settings in evolutionary algorithms
for binary search spaces. However, the vast majority of these studies focuses
on unimodal functions which do not require the algorithm to flip several bits
simultaneously to make progress. In fact, existing self-adjusting algorithms
are not designed to detect local optima and do not have any obvious benefit to
cross large Hamming gaps.
We suggest a mechanism called stagnation detection that can be added as a
module to existing evolutionary algorithms (both with and without prior
self-adjusting algorithms). Added to a simple (1+1) EA, we prove an expected
runtime on the well-known Jump benchmark that corresponds to an asymptotically
optimal parameter setting and outperforms other mechanisms for multimodal
optimization like heavy-tailed mutation. We also investigate the module in the
context of a self-adjusting (1+λ) EA and show that it combines the
previous benefits of this algorithm on unimodal problems with more efficient
multimodal optimization.
To explore the limitations of the approach, we additionally present an
example where both self-adjusting mechanisms, including stagnation detection,
do not help to find a beneficial setting of the mutation rate. Finally, we
investigate our module for stagnation detection experimentally.
Comment: 26 pages. Full version of a paper appearing at GECCO 2020.
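Stagnation detection as described can be sketched on top of a (1+1) EA. The threshold schedule below is a simplification we chose for illustration, not the exact one analysed in the paper:

```python
import math
import random

def sd_one_plus_one_ea(fitness, n, max_evals=10000, target=None):
    """(1+1) EA with an illustrative stagnation-detection module: count
    unsuccessful mutations at strength r; once the count passes a
    threshold, an improvement at Hamming distance r is deemed unlikely
    to exist, so the strength is raised to r + 1."""
    x = [random.randint(0, 1) for _ in range(n)]
    fx = fitness(x)
    r, fails = 1, 0
    for _ in range(max_evals):
        if fx == target:
            break
        threshold = 2 * (n ** r) * math.log(n)   # simplified waiting-time bound
        y = list(x)
        for i in range(n):
            if random.random() < r / n:          # standard bit mutation, rate r/n
                y[i] = 1 - y[i]
        fy = fitness(y)
        if fy > fx:
            x, fx, r, fails = y, fy, 1, 0        # success: reset to strength 1
        else:
            fails += 1
            if fails > threshold:
                r, fails = min(r + 1, n), 0      # stagnation: escalate strength
    return x, fx
```

On a unimodal function the module stays at r = 1 and behaves like the plain (1+1) EA; only when no single-bit improvement exists for long enough does it escalate, which is what lets it cross Hamming gaps on Jump.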
On the limitations of the univariate marginal distribution algorithm to deception and where bivariate EDAs might help
We introduce a new benchmark problem called Deceptive Leading Blocks (DLB) to
rigorously study the runtime of the Univariate Marginal Distribution Algorithm
(UMDA) in the presence of epistasis and deception. We show that simple
Evolutionary Algorithms (EAs) outperform the UMDA unless the selective
pressure μ/λ is extremely high, where μ and λ are the parent and offspring
population sizes, respectively. More precisely, we show that the UMDA with a
parent population size of μ = Ω(log n) has an expected runtime of exp(Ω(μ))
on the DLB problem assuming any selective pressure μ/λ ≥ 14/1000, as opposed
to the expected runtime of O(nλ log λ + n²) for the non-elitist (μ,λ) EA
with μ/λ ≤ 1/e. These results illustrate
inherent limitations of univariate EDAs against deception and epistasis, which
are common characteristics of real-world problems. In contrast, empirical
evidence reveals the efficiency of the bi-variate MIMIC algorithm on the DLB
problem. Our results suggest that one should consider EDAs with more complex
probabilistic models when optimising problems with some degree of epistasis and
deception.
Comment: To appear in the 15th ACM/SIGEVO Workshop on Foundations of Genetic
Algorithms (FOGA XV), Potsdam, Germany.
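A deceptive-leading-blocks-style fitness function can be sketched as follows (our paraphrase of DLB; consult the paper for the exact scoring):

```python
def dlb(x):
    """Deceptive Leading Blocks, paraphrased: scan 2-bit blocks left to
    right; each leading 11 block adds 2; in the first non-11 block, 00
    scores 1 while 01/10 score 0, so local search is lured toward 00
    even though 11 is what extends the prefix."""
    assert len(x) % 2 == 0
    score = 0
    for i in range(0, len(x), 2):
        a, b = x[i], x[i + 1]
        if a == 1 and b == 1:
            score += 2          # solved block, keep scanning
        else:
            if a == 0 and b == 0:
                score += 1      # deceptive local reward
            return score
    return score                # all blocks solved: the optimum
```

The deception is pairwise: turning the critical block from 00 into 11 requires flipping two bits at once, which a univariate model that treats bits independently finds hard to learn.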
Level-Based Analysis of the Univariate Marginal Distribution Algorithm
Estimation of Distribution Algorithms (EDAs) are stochastic heuristics that
search for optimal solutions by learning and sampling from probabilistic
models. Despite their popularity in real-world applications, there is little
rigorous understanding of their performance. Even for the Univariate Marginal
Distribution Algorithm (UMDA) -- a simple population-based EDA assuming
independence between decision variables -- the optimisation time on the linear
problem OneMax was until recently undetermined. The incomplete theoretical
understanding of EDAs is mainly due to lack of appropriate analytical tools.
We show that the recently developed level-based theorem for non-elitist
populations combined with anti-concentration results yield upper bounds on the
expected optimisation time of the UMDA. This approach results in the bound
O(nλ log λ + n²) on two problems, LeadingOnes and
BinVal, for population sizes λ = Ω(log n), where μ and λ
are parameters of the algorithm. We also prove that the UMDA with
population sizes μ = Ω(log n) and μ = O(√n) optimises
OneMax in expected time O(nλ), and for larger population
sizes μ = Ω(√n log n), in expected time
O(λ√n). The facility and generality of our arguments
suggest that this is a promising approach to derive bounds on the expected
optimisation time of EDAs.
Comment: To appear in the Algorithmica journal.
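The UMDA loop itself is short: sample from a product distribution, select the fittest, and re-estimate the marginals. The border values 1/n and 1 - 1/n below are the usual choice; parameter names are ours:

```python
import random

def umda(fitness, n, lam=100, mu=20, generations=200):
    """Minimal UMDA: sample lam individuals from a product distribution,
    keep the mu fittest, and set each marginal to the frequency of 1s
    among them, clamped to [1/n, 1 - 1/n] to avoid premature fixation."""
    p = [0.5] * n
    best, best_f = None, float("-inf")
    for _ in range(generations):
        pop = [[int(random.random() < pi) for pi in p] for _ in range(lam)]
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) > best_f:
            best, best_f = pop[0], fitness(pop[0])
        for i in range(n):
            freq = sum(ind[i] for ind in pop[:mu]) / mu
            p[i] = min(1 - 1 / n, max(1 / n, freq))
    return best, best_f
```

The independence assumption is visible in the sampling line: each bit is drawn from its own marginal, which is exactly what the level-based analysis in the abstract exploits and what fails on problems with strong epistasis.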
Simple hyper-heuristics control the neighbourhood size of randomised local search optimally for LeadingOnes
Selection hyper-heuristics (HHs) are randomised search methodologies which choose and execute heuristics during the optimisation process from a set of low-level heuristics. A machine learning mechanism is generally used to decide which low-level heuristic should be applied in each decision step. In this paper we analyse whether sophisticated learning mechanisms are always necessary for HHs to perform well. To this end we consider the simplest HHs from the literature and rigorously analyse their performance for the LeadingOnes benchmark function. Our analysis shows that the standard Simple Random, Permutation, Greedy and Random Gradient HHs show no signs of learning. While the former HHs do not attempt to learn from the past performance of low-level heuristics, the idea behind the Random Gradient HH is to continue to exploit the currently selected heuristic as long as it is successful. Hence, it is embedded with a reinforcement learning mechanism with the shortest possible memory. However, the probability that a promising heuristic is successful in the next step is relatively low when perturbing a reasonable solution to a combinatorial optimisation problem.
We generalise the 'simple' Random Gradient HH so that success can be measured over a fixed period of time τ, instead of a single iteration. For LeadingOnes we prove that the Generalised Random Gradient (GRG) HH can learn to adapt the neighbourhood size of Randomised Local Search to optimality during the run. As a result, we prove it has the best possible performance achievable with the low-level heuristics (Randomised Local Search with different neighbourhood sizes), up to lower-order terms. We also prove that the performance of the HH improves as the number of low-level local search heuristics to choose from increases. In particular, with access to k low-level local search heuristics, it outperforms the best-possible algorithm using any subset of the k heuristics.
Finally, we show that the advantages of GRG over Randomised Local Search and Evolutionary Algorithms using standard bit mutation increase if the anytime performance is considered (i.e., the performance gap is larger if approximate solutions are sought rather than exact ones). Experimental analyses confirm these results for different problem sizes (up to n = 10^8) and shed some light on the best choices for the parameter τ in various situations.
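The period-based mechanism can be sketched as follows (a simplified reading of GRG over Randomised Local Search variants; names and the success rule within a period are our assumptions):

```python
import random

def rls(k):
    """Randomised Local Search operator flipping exactly k distinct bits."""
    def h(x):
        y = list(x)
        for i in random.sample(range(len(x)), k):
            y[i] = 1 - y[i]
        return y
    return h

def grg(heuristics, x, fitness, tau, max_steps=10000, target=None):
    """Generalised Random Gradient sketch: choose a low-level heuristic
    uniformly at random and keep it for another period of tau steps
    whenever the current period produced an improvement; otherwise
    choose anew at random."""
    fx = fitness(x)
    steps = 0
    while steps < max_steps and fx != target:
        h = random.choice(heuristics)
        improved = True
        while improved and steps < max_steps and fx != target:
            improved = False
            for _ in range(tau):                  # one success period
                y = h(x)
                steps += 1
                fy = fitness(y)
                if fy > fx:
                    x, fx, improved = y, fy, True
                    break                         # success: grant a new period
                if steps >= max_steps:
                    break
    return x, fx
```

Measuring success over τ steps rather than a single iteration is what gives the mechanism a usable signal: near a good solution, even the right neighbourhood size rarely succeeds in any one step, but it succeeds within a well-chosen period far more often than the wrong one does.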