Improved Runtime Bounds for the Univariate Marginal Distribution Algorithm via Anti-Concentration
Unlike traditional evolutionary algorithms which produce offspring via
genetic operators, Estimation of Distribution Algorithms (EDAs) sample
solutions from probabilistic models which are learned from selected
individuals. It is hoped that EDAs may improve optimisation performance on
epistatic fitness landscapes by learning variable interactions. However, hardly
any rigorous results are available to support claims about the performance of
EDAs, even for fitness functions without epistasis. The expected runtime of the
Univariate Marginal Distribution Algorithm (UMDA) on OneMax was recently shown
to be O(nλ log λ) by Dang and Lehre (GECCO 2015). Later, Krejca and Witt
(FOGA 2017) proved the lower bound Ω(λ√n + n log n) via an involved drift analysis.
We prove an O(nλ) bound, given some restrictions on the population size. This
implies the tight bound Θ(n log n) when λ = Θ(log n), matching the runtime
of classical EAs. Our analysis uses the level-based theorem and
anti-concentration properties of the Poisson-Binomial distribution. We expect
that these generic methods will facilitate further analysis of EDAs.
Comment: 19 pages, 1 figure
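As a concrete illustration of the algorithm analysed above, here is a minimal Python sketch of the UMDA on OneMax. The parameter values, the borders on the marginals, and the generation cap are illustrative assumptions, not choices prescribed by the paper.

```python
import random

def onemax(x):
    """Fitness: number of one-bits in the string."""
    return sum(x)

def umda(n=20, mu=10, lam=50, max_gens=500, seed=1):
    """Minimal UMDA sketch: sample lam offspring from the product
    distribution, keep the mu fittest, and refit the marginals."""
    rng = random.Random(seed)
    p = [0.5] * n                      # one marginal probability per bit
    lo, hi = 1.0 / n, 1.0 - 1.0 / n    # the usual borders on the marginals
    for gen in range(max_gens):
        pop = [[1 if rng.random() < p[i] else 0 for i in range(n)]
               for _ in range(lam)]
        pop.sort(key=onemax, reverse=True)
        if onemax(pop[0]) == n:
            return gen                 # generation in which the optimum appeared
        best = pop[:mu]                # truncation selection
        p = [min(hi, max(lo, sum(x[i] for x in best) / mu)) for i in range(n)]
    return None

print(umda())
```

On this easy instance the sketch reliably finds the optimum well before the generation cap; the borders keep every marginal away from 0 and 1 so no bit value is ever lost.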
Upper Bounds on the Runtime of the Univariate Marginal Distribution Algorithm on OneMax
A runtime analysis of the Univariate Marginal Distribution Algorithm (UMDA)
is presented on the OneMax function for wide ranges of its parameters μ and
λ. If μ ≥ c log n for some constant c > 0 and λ = (1 + Θ(1))μ, a general
bound O(μn) on the expected runtime is obtained. This bound crucially assumes
that all marginal probabilities of the algorithm are confined to the interval
[1/n, 1 − 1/n]. If μ ≥ c′√n log n for a constant c′ > 0 and λ = (1 + Θ(1))μ,
the behavior of the algorithm changes and the bound on the expected runtime
becomes O(μ√n), which typically even holds if the borders on the marginal
probabilities are omitted.
The results supplement the recently derived lower bound Ω(μ√n + n log n)
by Krejca and Witt (FOGA 2017) and turn out to be tight for the two very
different values μ = c log n and μ = c′√n log n. They also improve the
previously best known upper bound O(nλ log λ) by Dang and Lehre (GECCO 2015).
Comment: Version 4: added illustrations and experiments; improved presentation
in Section 2.2; to appear in Algorithmica; the final publication is available
at Springer via http://dx.doi.org/10.1007/s00453-018-0463-
Level-Based Analysis of the Population-Based Incremental Learning Algorithm
The Population-Based Incremental Learning (PBIL) algorithm uses a convex
combination of the current model and the empirical model to construct the next
model, which is then sampled to generate offspring. The Univariate Marginal
Distribution Algorithm (UMDA) is a special case of the PBIL, where the current
model is ignored. Dang and Lehre (GECCO 2015) showed that UMDA can optimise
LeadingOnes efficiently. It remained open whether the PBIL performs
equally well. Here, by applying the level-based theorem in addition to the
Dvoretzky--Kiefer--Wolfowitz inequality, we show that the PBIL optimises the
function LeadingOnes in expected time O(nλ log λ + n²) for a population size
λ = Ω(log n), which matches the bound of the UMDA. Finally, we show that the
result carries over to BinVal, giving the first runtime result for the PBIL
on the BinVal problem.
Comment: To appear
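The convex-combination model update described above fits in a few lines. The function below is an illustrative fragment of one PBIL iteration; the smoothing parameter rho and the pre-sorted population it receives are assumptions about an otherwise standard PBIL loop, not code from the paper.

```python
def pbil_step(p, pop_sorted, mu, rho):
    """One PBIL model update: a convex combination of the current model p
    and the empirical marginal frequencies of the mu best individuals
    (pop_sorted is assumed sorted by decreasing fitness)."""
    n = len(p)
    empirical = [sum(x[i] for x in pop_sorted[:mu]) / mu for i in range(n)]
    return [(1 - rho) * p[i] + rho * empirical[i] for i in range(n)]
```

Setting rho = 1 discards the current model entirely and recovers the UMDA update, matching the special-case relationship stated in the abstract.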
Level-Based Analysis of the Univariate Marginal Distribution Algorithm
Estimation of Distribution Algorithms (EDAs) are stochastic heuristics that
search for optimal solutions by learning and sampling from probabilistic
models. Despite their popularity in real-world applications, there is little
rigorous understanding of their performance. Even for the Univariate Marginal
Distribution Algorithm (UMDA) -- a simple population-based EDA assuming
independence between decision variables -- the optimisation time on the linear
problem OneMax was until recently undetermined. The incomplete theoretical
understanding of EDAs is mainly due to lack of appropriate analytical tools.
We show that the recently developed level-based theorem for non-elitist
populations combined with anti-concentration results yields upper bounds on the
expected optimisation time of the UMDA. This approach results in the bound
O(nλ log λ + n²) on two problems, LeadingOnes and BinVal, for population sizes
λ = Ω(log n), where μ and λ are parameters of the algorithm. We also prove
that the UMDA with population sizes λ = Ω(log n) and λ = O(√n) optimises
OneMax in expected time O(nλ), and for larger population sizes
λ = Ω(√n log n), in expected time O(λ√n). The ease and generality of our
arguments suggest that this is a promising approach to derive bounds on the
expected optimisation time of EDAs.
Comment: To appear in Algorithmica Journal
From Understanding Genetic Drift to a Smart-Restart Mechanism for Estimation-of-Distribution Algorithms
Estimation-of-distribution algorithms (EDAs) are optimization algorithms that
learn a distribution on the search space from which good solutions can be
sampled easily. A key parameter of most EDAs is the sample size (population
size). If the population size is too small, the update of the probabilistic
model builds on few samples, leading to the undesired effect of genetic drift.
Too large population sizes avoid genetic drift, but slow down the process.
Building on a recent quantitative analysis of how the population size leads
to genetic drift, we design a smart-restart mechanism for EDAs. By stopping
runs when the risk for genetic drift is high, it automatically runs the EDA in
good parameter regimes.
Via a mathematical runtime analysis, we prove a general performance guarantee
for this smart-restart scheme. This in particular shows that in many situations
where the optimal (problem-specific) parameter values are known, the restart
scheme automatically finds these, leading to the asymptotically optimal
performance.
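In the spirit of the mechanism described above, the following sketch wraps an EDA run in a restart loop: each run receives an iteration budget tied to its population size (a proxy for the point where genetic drift becomes likely), and on failure the run is stopped and restarted with a larger population. The budget function, the doubling factor, and the run_eda interface are illustrative assumptions, not the paper's exact scheme.

```python
def smart_restart(run_eda, lam0=8, factor=2, max_rounds=10):
    """run_eda(lam, budget) runs the EDA with population size lam for at
    most `budget` iterations and returns a solution, or None on failure."""
    lam = lam0
    for _ in range(max_rounds):
        budget = lam * lam     # stop before genetic drift dominates (illustrative)
        result = run_eda(lam, budget)
        if result is not None:
            return result, lam
        lam *= factor          # drift suspected: restart with a larger population
    return None, lam
```

Because the budget grows with the population size, runs with too-small populations are cut short cheaply, so the total cost is dominated by the first round whose population size is large enough.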
We also conduct an extensive experimental analysis. On four classic benchmark
problems, we clearly observe the critical influence of the population size on
the performance, and we find that the smart-restart scheme leads to a
performance close to the one obtainable with optimal parameter values. Our
results also show that previous theory-based suggestions for the optimal
population size can be far from the optimal ones, leading to a performance
clearly inferior to the one obtained via the smart-restart scheme. We also
conduct experiments with PBIL (cross-entropy algorithm) on two combinatorial
optimization problems from the literature, the max-cut problem and the
bipartition problem. Again, we observe that the smart-restart mechanism finds
much better values for the population size than those suggested in the
literature, leading to a much better performance.
Comment: Accepted for publication in "Journal of Machine Learning Research".
Extended version of our GECCO 2020 paper. This article supersedes
arXiv:2004.0714
Sharp Bounds for Genetic Drift in EDAs
Estimation of Distribution Algorithms (EDAs) are one branch of Evolutionary
Algorithms (EAs) in the broad sense, distinguished by evolving a probabilistic
model instead of a population. Many existing algorithms fall into this category.
Analogous to genetic drift in EAs, EDAs also encounter the phenomenon that
updates of the probabilistic model not justified by the fitness move the
sampling frequencies to the boundary values. This can result in a considerable
performance loss.
This paper proves the first sharp estimates of the boundary hitting time of
the sampling frequency of a neutral bit for several univariate EDAs. For the
UMDA that selects the μ best individuals from λ offspring each
generation, we prove that the expected first iteration in which the frequency
of the neutral bit leaves the middle range and the expected first time it is
absorbed in 0 or 1 are both Θ(μ). The corresponding hitting times are Θ(K²)
for the cGA with hypothetical population size K. This paper further proves
that for PBIL with parameters μ, λ, and ρ, in an expected number of Θ(μ/ρ²)
iterations the sampling frequency of a neutral bit leaves the middle range,
and from then on always the same value is sampled for this bit; that is, the
frequency approaches the corresponding boundary value with maximum speed.
For the lower bounds implicit in these statements, we also show exponential
tail bounds. If a bit is neutral or has a preference for ones, then the lower
bounds on the times to reach a low frequency value still hold. An analogous
statement holds for bits that are neutral or prefer the value zero.
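The boundary hitting behaviour of a neutral bit can be observed in a toy simulation. Under neutral selection, the μ selected individuals are effectively uniform samples, so a UMDA-style frequency without borders performs the unbiased resampling walk below; all parameter values are illustrative assumptions.

```python
import random

def neutral_absorption_time(mu, seed):
    """Iterations until the frequency of a neutral bit is absorbed in 0 or 1
    under a UMDA-style update without borders: with neutral selection, the
    new frequency is the empirical frequency among mu random individuals."""
    rng = random.Random(seed)
    p, t = 0.5, 0
    while 0.0 < p < 1.0:
        ones = sum(1 for _ in range(mu) if rng.random() < p)
        p = ones / mu
        t += 1
    return t

times = [neutral_absorption_time(mu=20, seed=s) for s in range(30)]
print(sum(times) / len(times))   # empirical mean absorption time
```

Rerunning with larger μ shows the mean absorption time growing with the population size, the effect that the sharp estimates in the abstract quantify.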
An Exponential Lower Bound for the Runtime of the cGA on Jump Functions
In the first runtime analysis of an estimation-of-distribution algorithm
(EDA) on the multi-modal jump function class, Hasenöhrl and Sutton (GECCO
2018) proved that the runtime of the compact genetic algorithm with suitable
parameter choice on jump functions with high probability is at most polynomial
(in the dimension) if the jump size is at most logarithmic (in the dimension),
and is at most exponential in the jump size if the jump size is
super-logarithmic. The exponential runtime guarantee was achieved with a
hypothetical population size that is also exponential in the jump size.
Consequently, this setting cannot lead to a better runtime.
In this work, we show that any choice of the hypothetical population size
leads to a runtime that, with high probability, is at least exponential in the
jump size. This result might be the first non-trivial exponential lower bound
for EDAs that holds for arbitrary parameter settings.
Comment: To appear in the Proceedings of FOGA 2019. arXiv admin note: text
overlap with arXiv:1903.1098
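For reference, the compact genetic algorithm discussed above admits a very short sketch: two offspring are sampled, and every frequency where they differ moves by 1/K towards the winner's bit value. The benchmark (OneMax rather than a jump function), the parameter values, and the borders are illustrative assumptions.

```python
import random

def onemax(x):
    """Fitness: number of one-bits in the string."""
    return sum(x)

def cga(n=20, K=40, max_iters=100000, seed=5):
    """Minimal cGA sketch with hypothetical population size K."""
    rng = random.Random(seed)
    p = [0.5] * n
    lo, hi = 1.0 / n, 1.0 - 1.0 / n    # borders on the frequencies
    for it in range(max_iters):
        x = [1 if rng.random() < p[i] else 0 for i in range(n)]
        y = [1 if rng.random() < p[i] else 0 for i in range(n)]
        if onemax(x) < onemax(y):
            x, y = y, x                # x is now the winner
        if onemax(x) == n:
            return it
        for i in range(n):
            if x[i] != y[i]:           # frequencies move only where the two differ
                step = 1.0 / K if x[i] == 1 else -1.0 / K
                p[i] = min(hi, max(lo, p[i] + step))
    return None

print(cga())
```

Each iteration shifts the model by at most 1/K per position, so K controls how strongly a single comparison can move the frequencies, the same role it plays in the genetic-drift and lower-bound results above.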