An Exponential Lower Bound for the Runtime of the cGA on Jump Functions
In the first runtime analysis of an estimation-of-distribution algorithm
(EDA) on the multi-modal jump function class, Hasenöhrl and Sutton (GECCO
2018) proved that the runtime of the compact genetic algorithm with suitable
parameter choice on jump functions with high probability is at most polynomial
(in the dimension) if the jump size is at most logarithmic (in the dimension),
and is at most exponential in the jump size if the jump size is
super-logarithmic. The exponential runtime guarantee was achieved with a
hypothetical population size that is also exponential in the jump size.
Consequently, this setting cannot lead to a better runtime.
In this work, we show that any choice of the hypothetical population size
leads to a runtime that, with high probability, is at least exponential in the
jump size. This result might be the first non-trivial exponential lower bound
for EDAs that holds for arbitrary parameter settings.
Comment: To appear in the Proceedings of FOGA 2019. arXiv admin note: text
overlap with arXiv:1903.1098
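For readers unfamiliar with the objects discussed above, the following minimal Python sketch defines the jump function and a bare-bones compact genetic algorithm. The frequency borders, tie-breaking, and stopping rule are simplifying assumptions for illustration, not the exact algorithm variant analyzed in the paper.

```python
import random

def jump(x, k):
    """Jump_k: OneMax shifted up by k, with a fitness valley of width k
    just below the all-ones optimum."""
    n, m = len(x), sum(x)
    return k + m if (m <= n - k or m == n) else n - m

def cga(n, K, fitness, max_iters=200_000):
    """Bare-bones cGA sketch (assumed simplification): sample two offspring
    from the frequency vector and shift each differing frequency by 1/K
    toward the fitter one. Frequencies are clamped to [0, 1]."""
    p = [0.5] * n
    for t in range(1, max_iters + 1):
        x = [int(random.random() < pi) for pi in p]
        y = [int(random.random() < pi) for pi in p]
        if fitness(y) > fitness(x):
            x, y = y, x  # x now denotes the fitter sample
        if sum(x) == n:
            return t  # the all-ones optimum was sampled
        for i in range(n):
            if x[i] != y[i]:
                step = 1 / K if x[i] == 1 else -1 / K
                p[i] = min(1.0, max(0.0, p[i] + step))
    return None  # budget exhausted
```

A call such as `cga(20, 40, lambda x: jump(x, 2))` then runs this sketch on a jump function with jump size 2; because the update is only 1/K per iteration, the hypothetical population size K directly controls both the speed and the stability of the frequency vector.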
A Simplified Run Time Analysis of the Univariate Marginal Distribution Algorithm on LeadingOnes
With elementary means, we prove a stronger run time guarantee for the
univariate marginal distribution algorithm (UMDA) optimizing the LeadingOnes
benchmark function in the desirable regime with low genetic drift. If the
population size is at least quasilinear, then, with high probability, the UMDA
samples the optimum within a number of iterations that is linear in the problem
size divided by the logarithm of the UMDA's selection rate. This improves over
the previous guarantee, obtained by Dang and Lehre (2015) via the deep
level-based population method, both in terms of the run time and by
demonstrating further run time gains from small selection rates. With similar
arguments as in our upper-bound analysis, we also obtain the first lower bound
for this problem. Under similar assumptions, we prove that a bound that matches
our upper bound up to constant factors holds with high probability.
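As context for this and the following abstracts, here is a minimal UMDA sketch in Python optimizing LeadingOnes. The border values, sorting tie-breaks, and stopping rule are illustrative assumptions rather than the precise algorithm analyzed.

```python
import random

def leading_ones(x):
    """LeadingOnes: length of the prefix of 1-bits."""
    c = 0
    for b in x:
        if b != 1:
            break
        c += 1
    return c

def umda(n, mu, lam, fitness, opt, max_iters=50_000):
    """Minimal UMDA sketch (assumed simplification): sample lam offspring
    from the product distribution, select the mu best, and set each
    frequency to the marginal of the selected individuals, capped at the
    usual borders 1/n and 1 - 1/n."""
    p = [0.5] * n
    for t in range(1, max_iters + 1):
        pop = [[int(random.random() < pi) for pi in p] for _ in range(lam)]
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == opt:
            return t  # iteration in which the optimum was first sampled
        for i in range(n):
            freq = sum(x[i] for x in pop[:mu]) / mu
            p[i] = min(1 - 1 / n, max(1 / n, freq))
    return None
```

The selection rate mentioned in the abstract is the ratio mu/lam; e.g. `umda(20, 40, 80, leading_ones, opt=20)` runs with selection rate 1/2.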
Sharp Bounds for Genetic Drift in EDAs
Estimation of Distribution Algorithms (EDAs) are one branch of Evolutionary
Algorithms (EAs) in the broad sense that they evolve a probabilistic model
instead of a population. Many existing algorithms fall into this category.
Analogous to genetic drift in EAs, EDAs also encounter the phenomenon that
updates of the probabilistic model not justified by the fitness move the
sampling frequencies to the boundary values. This can result in a considerable
performance loss.
This paper proves the first sharp estimates of the boundary hitting time of
the sampling frequency of a neutral bit for several univariate EDAs. For the
UMDA that selects the μ best individuals from λ offspring each
generation, we prove that the expected first iteration when the frequency of
the neutral bit leaves the middle range [1/4, 3/4] and the
expected first time it is absorbed in 0 or 1 are both Θ(μ). The
corresponding hitting times are Θ(K²) for the cGA with hypothetical
population size K. This paper further proves that for PBIL with parameters
μ, λ, and ρ, in an expected number of O(μ/ρ²)
iterations the sampling frequency of a neutral bit leaves the interval
[Θ(ρ/μ), 1 - Θ(ρ/μ)] and then always the same value is
sampled for this bit, that is, the frequency approaches the corresponding
boundary value with maximum speed.
For the lower bounds implicit in these statements, we also show exponential
tail bounds. If a bit is not neutral but has a preference for ones, then the
lower bounds on the times to reach a low frequency value still hold. An
analogous statement holds for bits that prefer the value zero.
From Understanding Genetic Drift to a Smart-Restart Parameter-less Compact Genetic Algorithm
One of the key difficulties in using estimation-of-distribution algorithms is
choosing the population size(s) appropriately: Too small values lead to genetic
drift, which can cause enormous difficulties. In the regime with no genetic
drift, however, often the runtime is roughly proportional to the population
size, which renders large population sizes inefficient.
Based on a recent quantitative analysis of which population sizes lead to
genetic drift, we propose a parameter-less version of the compact genetic
algorithm that automatically finds a suitable population size without spending
too much time in situations unfavorable due to genetic drift.
We prove a mathematical runtime guarantee for this algorithm and conduct an
extensive experimental analysis on four classic benchmark problems both without
and with additive centered Gaussian posterior noise. The former shows that
under a natural assumption, our algorithm has a performance very similar to the
one obtainable from the best problem-specific population size. The latter
confirms that missing the right population size in the original cGA can be
detrimental and that previous theory-based suggestions for the population size
can be far away from the right values; it also shows that our algorithm as well
as a previously proposed parameter-less variant of the cGA based on parallel
runs avoid such pitfalls. Comparing the two parameter-less approaches, ours
profits from its ability to abort runs which are likely to be stuck in a
genetic drift situation.
Comment: 4 figures. Extended version of a paper appearing at GECCO 202
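A hypothetical sketch of such a restart scheme follows. The initial size, budget rule, and doubling schedule are illustrative assumptions motivated by the quadratic scale at which genetic drift sets in; they are not the paper's exact parameters.

```python
def smart_restart(run, K0=4, budget_factor=16, max_rounds=12):
    """Restart wrapper sketch: try the base algorithm with hypothetical
    population size K, abort the run once its budget of
    budget_factor * K**2 iterations is spent (by then genetic drift is
    likely), then double K and retry."""
    K, total = K0, 0
    for _ in range(max_rounds):
        budget = budget_factor * K * K
        iters = run(K, budget)  # iterations used, or None if budget ran out
        if iters is not None:
            return K, total + iters
        total += budget
        K *= 2
    return None

# Toy stand-in for a cGA run: fails until K is large enough, then succeeds fast.
def toy_run(K, budget):
    return 5 if K >= 32 else None

K_found, total_iters = smart_restart(toy_run)
```

The key design point is that aborted runs waste only a budget proportional to K², so the total work stays within a constant factor of what the first successful population size needs.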
The Univariate Marginal Distribution Algorithm Copes Well With Deception and Epistasis
In their recent work, Lehre and Nguyen (FOGA 2019) show that the univariate
marginal distribution algorithm (UMDA) needs time exponential in the parent
population size to optimize the DeceptiveLeadingBlocks (DLB) problem. They
conclude from this result that univariate EDAs have difficulties with deception
and epistasis.
In this work, we show that this negative finding is caused by an unfortunate
choice of the parameters of the UMDA. When the population sizes are chosen
large enough to prevent genetic drift, then the UMDA optimizes the DLB problem
with high probability with at most λ(n/2 + 2e ln n) fitness
evaluations. Since an offspring population size λ of order n log n
can prevent genetic drift, the UMDA can solve the DLB problem with
O(n² log n) fitness evaluations. In contrast, for classic evolutionary
algorithms no better run time guarantee than O(n³) is known (which we prove
to be tight for the (1+1) EA), so our result rather suggests that the UMDA
can cope well with deception and epistasis.
From a broader perspective, our result shows that the UMDA can cope better
with local optima than evolutionary algorithms; such a result was previously
known only for the compact genetic algorithm. Together with the lower bound of
Lehre and Nguyen, our result for the first time rigorously proves that running
EDAs in the regime with genetic drift can lead to drastic performance losses.
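For concreteness, here is a small Python sketch of the DLB fitness function, following the commonly stated definition; treat the exact scoring as an assumption.

```python
def dlb(x):
    """DeceptiveLeadingBlocks sketch: bits are grouped into blocks of two;
    each leading 11-block contributes 2; the first non-11 block contributes
    1 if it is the deceptive 00 and 0 otherwise; the all-ones string is the
    unique optimum with fitness len(x)."""
    f = 0
    for i in range(0, len(x), 2):
        a, b = x[i], x[i + 1]
        if (a, b) == (1, 1):
            f += 2
        else:
            return f + (1 if (a, b) == (0, 0) else 0)
    return f  # every block is 11: maximal fitness
```

Note how `dlb([1, 1, 0, 0, 1, 1])` scores higher than `dlb([1, 1, 0, 1, 1, 1])`: the 00 block looks better than 01 or 10, which is exactly the deceptive signal that misleads mutation-based search.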
A Tight Runtime Analysis for the cGA on Jump Functions---EDAs Can Cross Fitness Valleys at No Extra Cost
We prove that the compact genetic algorithm (cGA) with hypothetical
population size μ = Ω(√n log n) ∩ poly(n) with high
probability finds the optimum of any n-dimensional jump function with jump
size k < (1/20) ln n in O(μ√n) iterations. Since it is known
that the cGA with high probability needs at least Ω(μ√n + n log n)
iterations to optimize the unimodal OneMax function, our result shows that
the cGA in contrast to most classic evolutionary algorithms here is able to
cross moderate-sized valleys of low fitness at no extra cost.
Our runtime guarantee improves over the recent upper bound O(μ n^1.5 log n)
valid for μ = Ω(n^(3.5+ε)) of Hasenöhrl and
Sutton (GECCO 2018). For the best choice of the hypothetical population size,
this result gives a runtime guarantee of O(n^(5+ε)), whereas ours
gives O(n log n).
We also provide a simple general method based on parallel runs that, under
mild conditions, (i)~overcomes the need to specify a suitable population size,
but gives a performance close to the one stemming from the best-possible
population size, and (ii)~transforms EDAs with high-probability performance
guarantees into EDAs with similar bounds on the expected runtime.
Comment: 25 pages, full version of a paper to appear at GECCO 201
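The parallel-run idea can be sketched as follows; the scheduling rule and the generator interface are illustrative assumptions. Runs with exponentially growing population sizes are interleaved so that run i advances at half the rate of run i-1, which keeps the total work within a constant factor of what the best single run needs.

```python
def parallel_runs(make_run, max_levels=10, max_rounds=1_000_000):
    """Interleave runs with population sizes K = 2, 4, 8, ...: in round r,
    run i advances one step iff r is divisible by 2**i. make_run(K) must
    return an iterator yielding None until the run finishes, then yielding
    its result."""
    runs = [make_run(2 ** (i + 1)) for i in range(max_levels)]
    for r in range(1, max_rounds + 1):
        for i, run in enumerate(runs):
            if r % (2 ** i) == 0:
                out = next(run)
                if out is not None:
                    return 2 ** (i + 1), out
    return None

# Toy run: with population size K it needs K steps, then reports K.
def toy_run(K):
    for _ in range(K):
        yield None
    while True:
        yield K

K_used, result = parallel_runs(toy_run)
```

In a real application each generator would wrap one EDA run with its own hypothetical population size; here the toy generator only mimics the interface.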
Runtime analysis of the univariate marginal distribution algorithm under low selective pressure and prior noise
We perform a rigorous runtime analysis for the Univariate Marginal
Distribution Algorithm on the LeadingOnes function, a well-known benchmark
function in the theory community of evolutionary computation with a high
correlation between decision variables. For a problem instance of size n, the
currently best known upper bound on the expected runtime is
O(nλ log λ + n²) (Dang and Lehre, GECCO 2015), while a
lower bound necessary to understand how the algorithm copes with variable
dependencies is still missing. Motivated by this, we show that the algorithm
requires a runtime exponential in the parent population size μ with high
probability and in expectation if the selective pressure is low; otherwise, we
obtain a lower bound of Ω(nλ/log(λ−μ)) on the expected runtime.
Furthermore, we for the first time consider the algorithm on LeadingOnes under
a prior noise model and obtain an expected runtime of O(n²) for the
optimal parameter settings. In the end, our theoretical results are accompanied
by empirical findings, not only matching with rigorous analyses but also
providing new insights into the behaviour of the algorithm.
Comment: To appear at GECCO 2019, Prague, Czech Republic
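The prior noise model mentioned here can be sketched as a wrapper around the fitness function, assuming the common one-bit prior noise: with probability p, one uniformly chosen bit is flipped before evaluation, while the stored search point itself stays unchanged.

```python
import random

def with_prior_noise(fitness, p):
    """One-bit prior noise sketch (assumed model): with probability p,
    evaluate a copy of x with one uniformly random bit flipped; the search
    point x itself is not modified."""
    def noisy(x):
        if random.random() < p:
            y = list(x)
            i = random.randrange(len(y))
            y[i] = 1 - y[i]
            return fitness(y)
        return fitness(x)
    return noisy
```

For example, `with_prior_noise(leading_ones, 0.3)` yields a fitness oracle that reports a perturbed value 30% of the time, which is what makes sampled fitness comparisons unreliable for the algorithm.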
Level-Based Analysis of the Univariate Marginal Distribution Algorithm
Estimation of Distribution Algorithms (EDAs) are stochastic heuristics that
search for optimal solutions by learning and sampling from probabilistic
models. Despite their popularity in real-world applications, there is little
rigorous understanding of their performance. Even for the Univariate Marginal
Distribution Algorithm (UMDA) -- a simple population-based EDA assuming
independence between decision variables -- the optimisation time on the linear
problem OneMax was until recently undetermined. The incomplete theoretical
understanding of EDAs is mainly due to lack of appropriate analytical tools.
We show that the recently developed level-based theorem for non-elitist
populations combined with anti-concentration results yield upper bounds on the
expected optimisation time of the UMDA. This approach results in the bound
O(nλ log λ + n²) on two problems, LeadingOnes and
BinVal, for population sizes λ > μ = Ω(log n), where μ and λ
are parameters of the algorithm. We also prove that the UMDA with
population sizes μ ∈ Ω(log n) ∩ O(√n) optimises
OneMax in expected time O(λn), and for larger population
sizes μ = Ω(√n log n), in expected time
O(λ√n). The facility and generality of our arguments
suggest that this is a promising approach to derive bounds on the expected
optimisation time of EDAs.
Comment: To appear in Algorithmica Journal
Language Model Crossover: Variation through Few-Shot Prompting
This paper pursues the insight that language models naturally enable an
intelligent variation operator similar in spirit to evolutionary crossover. In
particular, language models of sufficient scale demonstrate in-context
learning, i.e. they can learn from associations between a small number of input
patterns to generate outputs incorporating such associations (also called
few-shot prompting). This ability can be leveraged to form a simple but
powerful variation operator, i.e. to prompt a language model with a few
text-based genotypes (such as code, plain-text sentences, or equations), and to
parse its corresponding output as those genotypes' offspring. The promise of
such language model crossover (which is simple to implement and can leverage
many different open-source language models) is that it enables a simple
mechanism to evolve semantically-rich text representations (with few
domain-specific tweaks), and naturally benefits from current progress in
language models. Experiments in this paper highlight the versatility of
language-model crossover, through evolving binary bit-strings, sentences,
equations, text-to-image prompts, and Python code. The conclusion is that
language model crossover is a promising method for evolving genomes
representable as text.
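The operator itself is simple to sketch. In the following illustrative Python, the prompt layout and parsing scheme are assumptions, and `model` stands in for any text-completion function; the parents become a plain-text few-shot prompt and the model's continuation is parsed back into offspring.

```python
def crossover_prompt(parents):
    """Few-shot prompt sketch: list the parent genotypes one per line; a
    language model's continuation of this pattern is treated as offspring."""
    return "\n".join(parents) + "\n"

def parse_offspring(completion, n_offspring=1):
    """Read the first non-empty lines of the completion as offspring."""
    lines = [ln.strip() for ln in completion.split("\n") if ln.strip()]
    return lines[:n_offspring]

def lm_crossover(parents, model, n_offspring=1):
    """Prompt the model with the parents and parse its continuation."""
    return parse_offspring(model(crossover_prompt(parents)), n_offspring)

# Stub model echoing a plausible continuation; a real setup would call an
# LLM completion API here instead.
stub = lambda prompt: "0111\n0110\n"
children = lm_crossover(["0101", "0011"], stub, n_offspring=2)
```

Because genotypes are just lines of text, the same operator applies unchanged to bit-strings, sentences, equations, or code, which is the versatility the paper's experiments highlight.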