56 research outputs found

    Improved Runtime Bounds for the Univariate Marginal Distribution Algorithm via Anti-Concentration

    Unlike traditional evolutionary algorithms, which produce offspring via genetic operators, Estimation of Distribution Algorithms (EDAs) sample solutions from probabilistic models which are learned from selected individuals. It is hoped that EDAs may improve optimisation performance on epistatic fitness landscapes by learning variable interactions. However, hardly any rigorous results are available to support claims about the performance of EDAs, even for fitness functions without epistasis. The expected runtime of the Univariate Marginal Distribution Algorithm (UMDA) on OneMax was recently shown to be in $\mathcal{O}(n\lambda\log\lambda)$ by Dang and Lehre (GECCO 2015). Later, Krejca and Witt (FOGA 2017) proved the lower bound $\Omega(\lambda\sqrt{n}+n\log n)$ via an involved drift analysis. We prove an $\mathcal{O}(n\lambda)$ bound, given some restrictions on the population size. This implies the tight bound $\Theta(n\log n)$ when $\lambda=\mathcal{O}(\log n)$, matching the runtime of classical EAs. Our analysis uses the level-based theorem and anti-concentration properties of the Poisson-Binomial distribution. We expect that these generic methods will facilitate further analysis of EDAs. Comment: 19 pages, 1 figure

    Upper Bounds on the Runtime of the Univariate Marginal Distribution Algorithm on OneMax

    A runtime analysis of the Univariate Marginal Distribution Algorithm (UMDA) is presented on the OneMax function for wide ranges of its parameters $\mu$ and $\lambda$. If $\mu\ge c\log n$ for some constant $c>0$ and $\lambda=(1+\Theta(1))\mu$, a general bound $O(\mu n)$ on the expected runtime is obtained. This bound crucially assumes that all marginal probabilities of the algorithm are confined to the interval $[1/n,1-1/n]$. If $\mu\ge c'\sqrt{n}\log n$ for a constant $c'>0$ and $\lambda=(1+\Theta(1))\mu$, the behavior of the algorithm changes and the bound on the expected runtime becomes $O(\mu\sqrt{n})$, which typically even holds if the borders on the marginal probabilities are omitted. The results supplement the recently derived lower bound $\Omega(\mu\sqrt{n}+n\log n)$ by Krejca and Witt (FOGA 2017) and turn out to be tight for the two very different values $\mu=c\log n$ and $\mu=c'\sqrt{n}\log n$. They also improve the previously best known upper bound $O(n\log n\log\log n)$ by Dang and Lehre (GECCO 2015). Comment: Version 4: added illustrations and experiments; improved presentation in Section 2.2; to appear in Algorithmica; the final publication is available at Springer via http://dx.doi.org/10.1007/s00453-018-0463-
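The abstract above refers to the parameters $\mu$ and $\lambda$ and to marginal probabilities confined to $[1/n, 1-1/n]$. As a rough illustration only, the following is a minimal Python sketch of a UMDA with such borders on OneMax. The function names (`onemax`, `update_marginals`, `umda_onemax`), the uniform initial model, the generation cap and the seed handling are assumptions made for this sketch and are not taken from the paper.

```python
import random


def onemax(x):
    """OneMax fitness: the number of one-bits in x."""
    return sum(x)


def update_marginals(selected, n):
    """Set each marginal to the frequency of ones among the selected
    individuals, then clamp it to the borders [1/n, 1 - 1/n]."""
    mu = len(selected)
    lo, hi = 1.0 / n, 1.0 - 1.0 / n
    return [min(hi, max(lo, sum(x[i] for x in selected) / mu))
            for i in range(n)]


def umda_onemax(n, mu, lam, max_generations=10_000, seed=0):
    """Hypothetical UMDA sketch: sample lam individuals from the product
    distribution, keep the mu fittest, re-estimate the marginals.
    Returns the number of fitness evaluations used, or None if the
    optimum was not sampled within max_generations (an assumed cap)."""
    rng = random.Random(seed)
    p = [0.5] * n  # assumed uniform initial model
    for gen in range(1, max_generations + 1):
        pop = [[1 if rng.random() < p[i] else 0 for i in range(n)]
               for _ in range(lam)]
        pop.sort(key=onemax, reverse=True)
        if onemax(pop[0]) == n:
            return gen * lam
        p = update_marginals(pop[:mu], n)
    return None
```

Calling, for example, `umda_onemax(n=50, mu=30, lam=60)` returns an evaluation count that can be compared informally against the asymptotic bounds quoted in the abstracts; a single run of this sketch says nothing rigorous about expected runtime.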

    Level-Based Analysis of the Population-Based Incremental Learning Algorithm

    The Population-Based Incremental Learning (PBIL) algorithm uses a convex combination of the current model and the empirical model to construct the next model, which is then sampled to generate offspring. The Univariate Marginal Distribution Algorithm (UMDA) is a special case of the PBIL, where the current model is ignored. Dang and Lehre (GECCO 2015) showed that UMDA can optimise LeadingOnes efficiently. The question still remained open whether the PBIL performs equally well. Here, by applying the level-based theorem in addition to the Dvoretzky-Kiefer-Wolfowitz inequality, we show that the PBIL optimises the function LeadingOnes in expected time $\mathcal{O}(n\lambda\log\lambda + n^2)$ for a population size $\lambda = \Omega(\log n)$, which matches the bound of the UMDA. Finally, we show that the result carries over to BinVal, giving the first runtime result for the PBIL on the BinVal problem. Comment: To appear
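The convex-combination update described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's pseudocode: the function name `pbil_update` and the smoothing parameter name `eta` are assumptions, and border handling is left out.

```python
def pbil_update(p, selected, eta):
    """One PBIL model update (sketch).

    p        -- current marginal probabilities (the current model)
    selected -- list of selected bit strings (gives the empirical model)
    eta      -- assumed smoothing parameter in (0, 1]; eta = 1 ignores the
                current model, which corresponds to the UMDA special case
    """
    mu = len(selected)
    return [(1.0 - eta) * p_i + eta * sum(x[i] for x in selected) / mu
            for i, p_i in enumerate(p)]
```

With `eta = 1.0` the new model equals the empirical frequencies of the selected individuals, which is the sense in which the abstract calls the UMDA a special case of the PBIL.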

    Level-Based Analysis of the Univariate Marginal Distribution Algorithm

    Estimation of Distribution Algorithms (EDAs) are stochastic heuristics that search for optimal solutions by learning and sampling from probabilistic models. Despite their popularity in real-world applications, there is little rigorous understanding of their performance. Even for the Univariate Marginal Distribution Algorithm (UMDA) -- a simple population-based EDA assuming independence between decision variables -- the optimisation time on the linear problem OneMax was until recently undetermined. The incomplete theoretical understanding of EDAs is mainly due to a lack of appropriate analytical tools. We show that the recently developed level-based theorem for non-elitist populations, combined with anti-concentration results, yields upper bounds on the expected optimisation time of the UMDA. This approach results in the bound $\mathcal{O}(n\lambda\log\lambda+n^2)$ on two problems, LeadingOnes and BinVal, for population sizes $\lambda>\mu=\Omega(\log n)$, where $\mu$ and $\lambda$ are parameters of the algorithm. We also prove that the UMDA with population sizes $\mu\in\mathcal{O}(\sqrt{n})\cap\Omega(\log n)$ optimises OneMax in expected time $\mathcal{O}(\lambda n)$, and for larger population sizes $\mu=\Omega(\sqrt{n}\log n)$, in expected time $\mathcal{O}(\lambda\sqrt{n})$. The facility and generality of our arguments suggest that this is a promising approach to derive bounds on the expected optimisation time of EDAs. Comment: To appear in Algorithmica Journal

    From Understanding Genetic Drift to a Smart-Restart Mechanism for Estimation-of-Distribution Algorithms

    Estimation-of-distribution algorithms (EDAs) are optimization algorithms that learn a distribution on the search space from which good solutions can be sampled easily. A key parameter of most EDAs is the sample size (population size). If the population size is too small, the update of the probabilistic model builds on few samples, leading to the undesired effect of genetic drift. Too large population sizes avoid genetic drift, but slow down the process. Building on a recent quantitative analysis of how the population size leads to genetic drift, we design a smart-restart mechanism for EDAs. By stopping runs when the risk for genetic drift is high, it automatically runs the EDA in good parameter regimes. Via a mathematical runtime analysis, we prove a general performance guarantee for this smart-restart scheme. This in particular shows that in many situations where the optimal (problem-specific) parameter values are known, the restart scheme automatically finds these, leading to the asymptotically optimal performance. We also conduct an extensive experimental analysis. On four classic benchmark problems, we clearly observe the critical influence of the population size on the performance, and we find that the smart-restart scheme leads to a performance close to the one obtainable with optimal parameter values. Our results also show that previous theory-based suggestions for the optimal population size can be far from the optimal ones, leading to a performance clearly inferior to the one obtained via the smart-restart scheme. We also conduct experiments with PBIL (cross-entropy algorithm) on two combinatorial optimization problems from the literature, the max-cut problem and the bipartition problem. Again, we observe that the smart-restart mechanism finds much better values for the population size than those suggested in the literature, leading to a much better performance. Comment: Accepted for publication in "Journal of Machine Learning Research". Extended version of our GECCO 2020 paper. This article supersedes arXiv:2004.0714

    From Understanding Genetic Drift to a Smart-Restart Parameter-less Compact Genetic Algorithm

    One of the key difficulties in using estimation-of-distribution algorithms is choosing the population size(s) appropriately: Too small values lead to genetic drift, which can cause enormous difficulties. In the regime with no genetic drift, however, often the runtime is roughly proportional to the population size, which renders large population sizes inefficient. Based on a recent quantitative analysis of which population sizes lead to genetic drift, we propose a parameter-less version of the compact genetic algorithm that automatically finds a suitable population size without spending too much time in situations unfavorable due to genetic drift. We prove a mathematical runtime guarantee for this algorithm and conduct an extensive experimental analysis on four classic benchmark problems both without and with additive centered Gaussian posterior noise. The former shows that under a natural assumption, our algorithm has a performance very similar to the one obtainable from the best problem-specific population size. The latter confirms that missing the right population size in the original cGA can be detrimental and that previous theory-based suggestions for the population size can be far away from the right values; it also shows that our algorithm as well as a previously proposed parameter-less variant of the cGA based on parallel runs avoid such pitfalls. Comparing the two parameter-less approaches, ours profits from its ability to abort runs which are likely to be stuck in a genetic drift situation. Comment: 4 figures. Extended version of a paper appearing at GECCO 202

    An Exponential Lower Bound for the Runtime of the cGA on Jump Functions

    In the first runtime analysis of an estimation-of-distribution algorithm (EDA) on the multi-modal jump function class, Hasenöhrl and Sutton (GECCO 2018) proved that the runtime of the compact genetic algorithm with suitable parameter choice on jump functions with high probability is at most polynomial (in the dimension) if the jump size is at most logarithmic (in the dimension), and is at most exponential in the jump size if the jump size is super-logarithmic. The exponential runtime guarantee was achieved with a hypothetical population size that is also exponential in the jump size. Consequently, this setting cannot lead to a better runtime. In this work, we show that any choice of the hypothetical population size leads to a runtime that, with high probability, is at least exponential in the jump size. This result might be the first non-trivial exponential lower bound for EDAs that holds for arbitrary parameter settings. Comment: To appear in the Proceedings of FOGA 2019. arXiv admin note: text overlap with arXiv:1903.1098