Improved Runtime Bounds for the Univariate Marginal Distribution Algorithm via Anti-Concentration
Unlike traditional evolutionary algorithms which produce offspring via
genetic operators, Estimation of Distribution Algorithms (EDAs) sample
solutions from probabilistic models which are learned from selected
individuals. It is hoped that EDAs may improve optimisation performance on
epistatic fitness landscapes by learning variable interactions. However, hardly
any rigorous results are available to support claims about the performance of
EDAs, even for fitness functions without epistasis. The expected runtime of the
Univariate Marginal Distribution Algorithm (UMDA) on OneMax was recently shown
to be in by Dang and Lehre
(GECCO 2015). Later, Krejca and Witt (FOGA 2017) proved the lower bound
via an involved drift analysis.
We prove a bound, given some restrictions
on the population size. This implies the tight bound when , matching the runtime
of classical EAs. Our analysis uses the level-based theorem and
anti-concentration properties of the Poisson-Binomial distribution. We expect
that these generic methods will facilitate further analysis of EDAs.
Comment: 19 pages, 1 figure
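The number of one-bits in a string sampled from a univariate model with independent marginals follows a Poisson-Binomial distribution, which is where anti-concentration enters the analysis. The sketch below is only a numerical illustration, not the paper's argument: it computes the distribution exactly by dynamic programming so that the largest single point mass can be inspected. The function names are hypothetical, and the abstract does not state the precise anti-concentration inequality used.

```python
def poisson_binomial_pmf(probs):
    """Return [P(Y = 0), ..., P(Y = n)] for Y = sum of independent Bernoulli(p_i)."""
    pmf = [1.0]  # distribution of the empty sum: Y = 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # the new variable is 0
            nxt[k + 1] += mass * p       # the new variable is 1
        pmf = nxt
    return pmf


def largest_point_mass(probs):
    """Largest probability that Y takes any single value."""
    return max(poisson_binomial_pmf(probs))
```

For example, `largest_point_mass([0.5] * n)` shrinks as `n` grows, whereas marginals close to 0 or 1 concentrate almost all mass on a single value.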
Level-Based Analysis of the Univariate Marginal Distribution Algorithm
Estimation of Distribution Algorithms (EDAs) are stochastic heuristics that
search for optimal solutions by learning and sampling from probabilistic
models. Despite their popularity in real-world applications, there is little
rigorous understanding of their performance. Even for the Univariate Marginal
Distribution Algorithm (UMDA) -- a simple population-based EDA assuming
independence between decision variables -- the optimisation time on the linear
problem OneMax was until recently undetermined. The incomplete theoretical
understanding of EDAs is mainly due to a lack of appropriate analytical tools.
We show that the recently developed level-based theorem for non-elitist
populations combined with anti-concentration results yields upper bounds on the
expected optimisation time of the UMDA. This approach results in the bound
on two problems, LeadingOnes and
BinVal, for population sizes , where and
are parameters of the algorithm. We also prove that the UMDA with
population sizes optimises
OneMax in expected time , and for larger population
sizes , in expected time
. The facility and generality of our arguments
suggest that this is a promising approach to derive bounds on the expected
optimisation time of EDAs.
Comment: To appear in Algorithmica Journal
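As an illustration of the algorithm discussed in the abstract above, here is a minimal sketch of the UMDA together with the three benchmark functions it names. Several points are assumptions rather than details taken from the listing: the names `lam` and `mu` (offspring and parent population sizes), the borders `1/n` and `1 - 1/n` on the marginals, the conventional definition of BinVal with the first bit most significant, and counting the runtime as the number of fitness evaluations in whole generations.

```python
import random


def one_max(x):
    """Number of one-bits."""
    return sum(x)


def leading_ones(x):
    """Length of the longest prefix consisting only of one-bits."""
    count = 0
    for bit in x:
        if bit == 0:
            break
        count += 1
    return count


def bin_val(x):
    """Value of x read as a binary number, first bit most significant."""
    n = len(x)
    return sum(bit << (n - 1 - i) for i, bit in enumerate(x))


def umda(fitness, n, lam, mu, optimum, max_generations=10_000, seed=None):
    """Run the UMDA; return the number of sampled individuals, or None on timeout.

    Assumes n >= 2 so that the borders 1/n and 1 - 1/n are well ordered.
    """
    rng = random.Random(seed)
    p = [0.5] * n                      # one marginal probability per bit
    lo, hi = 1.0 / n, 1.0 - 1.0 / n    # assumed borders on the marginals
    for generation in range(1, max_generations + 1):
        # Sample lam individuals from the product distribution.
        population = [[1 if rng.random() < p_i else 0 for p_i in p]
                      for _ in range(lam)]
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == optimum:
            return generation * lam
        # Truncation selection: keep the mu fittest, re-estimate frequencies.
        selected = population[:mu]
        p = [min(hi, max(lo, sum(x[i] for x in selected) / mu))
             for i in range(n)]
    return None
```

A call such as `umda(one_max, n=20, lam=100, mu=20, optimum=20, seed=1)` returns the number of individuals sampled up to and including the generation in which the all-ones string first appears.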
Semiparametric Multivariate Accelerated Failure Time Model with Generalized Estimating Equations
The semiparametric accelerated failure time model is not as widely used as
the Cox relative risk model mainly due to computational difficulties. Recent
developments in least squares estimation and induced smoothing estimating
equations provide promising tools to make the accelerated failure time models
more attractive in practice. For semiparametric multivariate accelerated
failure time models, we propose a generalized estimating equation approach to
account for the multivariate dependence through working correlation structures.
The marginal error distributions can be either identical as in sequential event
settings or different as in parallel event settings. Some regression
coefficients can be shared across margins as needed. The initial estimator is a
rank-based estimator with Gehan's weight, but obtained from an induced
smoothing approach with computational ease. The resulting estimator is consistent
and asymptotically normal, with a variance estimated through a multiplier
resampling method. In a simulation study, our estimator was up to three times
as efficient as the initial estimator, especially with stronger multivariate
dependence and heavier censoring percentage. Two real examples demonstrate the
utility of the proposed method.
From Understanding Genetic Drift to a Smart-Restart Mechanism for Estimation-of-Distribution Algorithms
Estimation-of-distribution algorithms (EDAs) are optimization algorithms that
learn a distribution on the search space from which good solutions can be
sampled easily. A key parameter of most EDAs is the sample size (population
size). If the population size is too small, the update of the probabilistic
model builds on few samples, leading to the undesired effect of genetic drift.
Too large population sizes avoid genetic drift, but slow down the process.
Building on a recent quantitative analysis of how the population size leads
to genetic drift, we design a smart-restart mechanism for EDAs. By stopping
runs when the risk for genetic drift is high, it automatically runs the EDA in
good parameter regimes.
Via a mathematical runtime analysis, we prove a general performance guarantee
for this smart-restart scheme. This in particular shows that in many situations
where the optimal (problem-specific) parameter values are known, the restart
scheme automatically finds these, leading to the asymptotically optimal
performance.
We also conduct an extensive experimental analysis. On four classic benchmark
problems, we clearly observe the critical influence of the population size on
the performance, and we find that the smart-restart scheme leads to a
performance close to the one obtainable with optimal parameter values. Our
results also show that previous theory-based suggestions for the optimal
population size can be far from the optimal ones, leading to a performance
clearly inferior to the one obtained via the smart-restart scheme. We also
conduct experiments with PBIL (cross-entropy algorithm) on two combinatorial
optimization problems from the literature, the max-cut problem and the
bipartition problem. Again, we observe that the smart-restart mechanism finds
much better values for the population size than those suggested in the
literature, leading to a much better performance.
Comment: Accepted for publication in "Journal of Machine Learning Research".
Extended version of our GECCO 2020 paper. This article supersedes
arXiv:2004.0714
On the limitations of the univariate marginal distribution algorithm to deception and where bivariate EDAs might help
We introduce a new benchmark problem called Deceptive Leading Blocks (DLB) to
rigorously study the runtime of the Univariate Marginal Distribution Algorithm
(UMDA) in the presence of epistasis and deception. We show that simple
Evolutionary Algorithms (EAs) outperform the UMDA unless the selective pressure
is extremely high, where and are the parent and
offspring population sizes, respectively. More precisely, we show that the UMDA
with a parent population size of has an expected runtime
of on the DLB problem assuming any selective pressure
, as opposed to the expected runtime
of for the non-elitist
with . These results illustrate
inherent limitations of univariate EDAs against deception and epistasis, which
are common characteristics of real-world problems. In contrast, empirical
evidence reveals the efficiency of the bi-variate MIMIC algorithm on the DLB
problem. Our results suggest that one should consider EDAs with more complex
probabilistic models when optimising problems with some degree of epistasis and
deception.
Comment: To appear in the 15th ACM/SIGEVO Workshop on Foundations of Genetic
Algorithms (FOGA XV), Potsdam, Germany
Upper Bounds on the Runtime of the Univariate Marginal Distribution Algorithm on OneMax
A runtime analysis of the Univariate Marginal Distribution Algorithm (UMDA)
is presented on the OneMax function for wide ranges of its parameters and
. If for some constant and
, a general bound on the expected runtime
is obtained. This bound crucially assumes that all marginal probabilities of
the algorithm are confined to the interval . If for a constant and , the
behavior of the algorithm changes and the bound on the expected runtime becomes
, which typically even holds if the borders on the marginal
probabilities are omitted.
The results supplement the recently derived lower bound
by Krejca and Witt (FOGA 2017) and turn out as
tight for the two very different values and . They also improve the previously best known upper bound by Dang and Lehre (GECCO 2015).
Comment: Version 4: added illustrations and experiments; improved presentation
in Section 2.2; to appear in Algorithmica; the final publication is available
at Springer via http://dx.doi.org/10.1007/s00453-018-0463-
Bounds on Integrals with Respect to Multivariate Copulas
Finding upper and lower bounds for integrals with respect to copulas is a
prominent problem in applied probability. In their 2014 paper, Hofer and
Iaco showed how particular two-dimensional copulas are related to optimal
solutions of the two-dimensional assignment problem. Using this, they managed
to approximate integrals with respect to two-dimensional copulas. In this
paper, we further illuminate this connection and extend it to d-dimensional
copulas, thereby generalizing the method of Hofer and Iaco to arbitrary
dimensions. We also provide convergence statements. As an example, we consider
three-dimensional dependence measures.