Maximum Likelihood-based Online Adaptation of Hyper-parameters in CMA-ES
The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is widely
accepted as a robust derivative-free continuous optimization algorithm for
non-linear and non-convex optimization problems. CMA-ES is well known to be
almost parameterless, meaning that the population size is the only
hyper-parameter the user is expected to tune. In this paper, we propose a
principled approach called self-CMA-ES to achieve the online adaptation of
CMA-ES hyper-parameters in order to improve its overall performance.
Experimental results show that for larger-than-default population sizes, the
default settings of hyper-parameters of CMA-ES are far from being optimal, and
that self-CMA-ES allows for dynamically approaching optimal settings.
Comment: 13th International Conference on Parallel Problem Solving from Nature (PPSN 2014), 2014
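For reference, a minimal sketch of running plain CMA-ES with a larger-than-default population size via the pycma library; the objective, dimension, and option values are illustrative assumptions, and the maximum-likelihood hyper-parameter adaptation of self-CMA-ES itself is not shown.

import numpy as np
import cma  # pycma package

def rastrigin(x):
    x = np.asarray(x)
    return 10 * len(x) + float(np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x)))

# Larger-than-default population size requested via the 'popsize' option.
es = cma.CMAEvolutionStrategy(8 * [2.0], 0.5,
                              {'popsize': 64, 'seed': 1, 'maxiter': 300})
while not es.stop():
    solutions = es.ask()                                # sample a population
    es.tell(solutions, [rastrigin(s) for s in solutions])
print(es.result.fbest)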
Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics
The most data-efficient algorithms for reinforcement learning in robotics are
model-based policy search algorithms, which alternate between learning a
dynamical model of the robot and optimizing a policy to maximize the expected
return given the model and its uncertainties. Among the few proposed
approaches, the recently introduced Black-DROPS algorithm exploits a black-box
optimization algorithm to achieve both high data-efficiency and good
computation times when several cores are used; nevertheless, like all
model-based policy search approaches, Black-DROPS does not scale to high
dimensional state/action spaces. In this paper, we introduce a new model
learning procedure in Black-DROPS that leverages parameterized black-box priors
to (1) scale up to high-dimensional systems, and (2) be robust to large
inaccuracies of the prior information. We demonstrate the effectiveness of our
approach with the "pendubot" swing-up task in simulation and with a physical
hexapod robot (48D state space, 18D action space) that has to walk forward as
fast as possible. The results show that our new algorithm is more
data-efficient than previous model-based policy search algorithms (with and
without priors) and that it can allow a physical 6-legged robot to learn new
gaits in only 16 to 30 seconds of interaction time.
Comment: Accepted at ICRA 2018; 8 pages, 4 figures, 2 algorithms, 1 table; Video at https://youtu.be/HFkZkhGGzTo; Spotlight ICRA presentation at https://youtu.be/_MZYDhfWeL
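A minimal sketch of the alternation at the heart of model-based policy search: learn a dynamics model from collected transitions, then optimize policy parameters against rollouts of that model with a black-box search. The toy 1D system, linear model, and random-search optimizer below are illustrative stand-ins, not the Gaussian-process priors or optimizer used by Black-DROPS.

import numpy as np

rng = np.random.default_rng(0)

def true_step(x, u):                        # unknown real dynamics (toy)
    return 0.9 * x + 0.5 * u + 0.01 * rng.normal()

def policy(theta, x):                       # linear policy
    return theta[0] * x + theta[1]

def rollout_cost(theta, model, x0=1.0, horizon=20):
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = policy(theta, x)
        x = model(x, u)                     # simulate with the learned model
        cost += x ** 2 + 0.1 * u ** 2
    return cost

data = []                                   # (x, u, x_next) transitions
theta = np.zeros(2)
for episode in range(5):
    # (1) run the current policy on the real system and record transitions
    x = 1.0
    for _ in range(20):
        u = policy(theta, x)
        x_next = true_step(x, u)
        data.append((x, u, x_next))
        x = x_next
    # (2) fit a linear dynamics model x' ~ [x, u, 1] @ w by least squares
    X = np.array([[xi, ui, 1.0] for xi, ui, _ in data])
    y = np.array([xn for _, _, xn in data])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    model = lambda x, u, w=w: w @ np.array([x, u, 1.0])
    # (3) black-box policy search on the model (random search as a stand-in)
    candidates = theta + 0.3 * rng.normal(size=(200, 2))
    theta = min(candidates, key=lambda t: rollout_cost(t, model))
    print(episode, rollout_cost(theta, model))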
A view of Estimation of Distribution Algorithms through the lens of Expectation-Maximization
We show that a large class of Estimation of Distribution Algorithms,
including, but not limited to, Covariance Matrix Adaptation, can be written as a
Monte Carlo Expectation-Maximization algorithm, and as exact EM in the limit of
infinite samples. Because EM sits on a rigorous statistical foundation and has
been thoroughly analyzed, this connection provides a new coherent framework
with which to reason about EDAs.
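The EM view can be illustrated with a toy Gaussian EDA whose parameter update is a maximum-likelihood fit to the selected samples (an M-step-like computation); the sphere objective, truncation selection, and population sizes below are assumptions made for this sketch.

import numpy as np

def sphere(x):
    return np.sum(x ** 2, axis=-1)

rng = np.random.default_rng(0)
dim, pop, elite = 5, 50, 15
mean, cov = np.zeros(dim), np.eye(dim)
for gen in range(100):
    X = rng.multivariate_normal(mean, cov, size=pop)   # sample the search distribution
    E = X[np.argsort(sphere(X))[:elite]]                # keep the fittest samples
    mean = E.mean(axis=0)                               # M-step: ML fit of the Gaussian
    cov = np.cov(E.T) + 1e-6 * np.eye(dim)              # to the selected samples
print(sphere(mean))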
Path Signatures for Diversity in Probabilistic Trajectory Optimisation
Motion planning can be cast as a trajectory optimisation problem where a cost
is minimised as a function of the trajectory being generated. In complex
environments with several obstacles and complicated geometry, this optimisation
problem is usually difficult to solve and prone to local minima. However,
recent advancements in computing hardware allow for parallel trajectory
optimisation where multiple solutions are obtained simultaneously, each
initialised from a different starting point. Unfortunately, without a strategy
that prevents solutions from collapsing onto each other, naive parallel
optimisation can suffer from mode collapse, diminishing the efficiency of the
approach and the likelihood of finding a global solution. In this paper we leverage
recent advances in the theory of rough paths to devise an algorithm for
parallel trajectory optimisation that promotes diversity across the set of
solutions, thereby avoiding mode collapse and achieving better global
properties. Our approach builds on path signatures and Hilbert space
representations of trajectories, and connects parallel variational inference
for trajectory estimation with diversity promoting kernels. We empirically
demonstrate that this strategy achieves lower average costs than competing
alternatives on a range of problems, from 2D navigation to robotic manipulators
operating in cluttered environments.
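As an illustration of the ingredients named above, the following sketch computes a depth-2 path signature of piecewise-linear trajectories with plain numpy and uses an RBF kernel on the signature features as a pairwise diversity score; it is only a schematic of the building blocks, not the paper's parallel variational inference algorithm.

import numpy as np

def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path given as a (T, d) array."""
    inc = np.diff(path, axis=0)                      # increments dX_t
    level1 = inc.sum(axis=0)                         # first-level terms
    d = path.shape[1]
    level2 = np.zeros((d, d))
    running = np.zeros(d)
    for dx in inc:                                   # iterated-integral recursion
        level2 += np.outer(running, dx) + 0.5 * np.outer(dx, dx)
        running += dx
    return np.concatenate([level1, level2.ravel()])

def diversity_kernel(sig_a, sig_b, bandwidth=1.0):
    return np.exp(-np.sum((sig_a - sig_b) ** 2) / (2 * bandwidth ** 2))

t = np.linspace(0, 1, 50)[:, None]
traj_a = np.hstack([t, np.sin(2 * np.pi * t)])       # two 2D trajectories
traj_b = np.hstack([t, t ** 2])
k = diversity_kernel(signature_depth2(traj_a), signature_depth2(traj_b))
print(k)   # close to 1 => similar paths, close to 0 => diverse paths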
CMA-ES with Learning Rate Adaptation: Can CMA-ES with Default Population Size Solve Multimodal and Noisy Problems?
The covariance matrix adaptation evolution strategy (CMA-ES) is one of the
most successful methods for solving black-box continuous optimization problems.
One practically useful aspect of the CMA-ES is that it can be used without
hyperparameter tuning. However, the hyperparameter settings still have a
considerable impact, especially for difficult tasks such as solving multimodal
or noisy problems. In this study, we investigate whether the CMA-ES with
default population size can solve multimodal and noisy problems. To perform
this investigation, we develop a novel learning rate adaptation mechanism for
the CMA-ES, such that the learning rate is adapted so as to maintain a constant
signal-to-noise ratio. We investigate the behavior of the CMA-ES with the
proposed learning rate adaptation mechanism through numerical experiments, and
compare the results with those obtained for the CMA-ES with a fixed learning
rate. The results demonstrate that, when the proposed learning rate adaptation
is used, the CMA-ES with default population size works well on multimodal
and/or noisy problems, without the need for extremely expensive learning rate
tuning.
Comment: Nominated for the best paper of GECCO'23 ENUM Track. We have corrected the error of Eq. (7).
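The idea of keeping the signal-to-noise ratio of the update constant can be illustrated with a toy evolution strategy in which a scalar learning rate on the mean shift is scaled up or down according to an exponential-moving-average SNR estimate; the estimator, target value, and gain below are assumptions for this sketch, not the paper's exact update rules.

import numpy as np

rng = np.random.default_rng(0)
dim, pop, elite = 10, 10, 5
mean, sigma, eta = np.ones(dim) * 3.0, 1.0, 1.0
avg_dm = np.zeros(dim)             # EMA of the mean shift (the "signal")
avg_sq = 0.0                       # EMA of its squared norm (signal + noise)
beta, target_snr = 0.1, 1.0

def noisy_sphere(x):
    return np.sum(x ** 2) + rng.normal(scale=1.0)

for gen in range(300):
    X = mean + sigma * rng.normal(size=(pop, dim))
    best = X[np.argsort([noisy_sphere(x) for x in X])[:elite]]
    dm = best.mean(axis=0) - mean
    avg_dm = (1 - beta) * avg_dm + beta * dm
    avg_sq = (1 - beta) * avg_sq + beta * float(dm @ dm)
    noise = max(avg_sq - float(avg_dm @ avg_dm), 1e-12)
    snr = float(avg_dm @ avg_dm) / noise
    # Increase the learning rate when the SNR exceeds the target, decrease otherwise.
    eta *= np.exp(0.1 * (snr - target_snr) / (snr + target_snr))
    eta = float(np.clip(eta, 1e-3, 1.0))
    mean = mean + eta * dm
print(np.sum(mean ** 2), eta)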
Information-geometric optimization with natural selection
Evolutionary algorithms, inspired by natural evolution, aim to optimize
difficult objective functions without computing derivatives. Here we detail the
relationship between population genetics and evolutionary optimization and
formulate a new evolutionary algorithm. Optimization of a continuous objective
function is analogous to searching for high fitness phenotypes on a fitness
landscape. We summarize how natural selection moves a population along the
non-Euclidean gradient that is induced by the population on the fitness
landscape (the natural gradient). Under normal approximations common in
quantitative genetics, we show how selection is related to Newton's method in
optimization. We find that intermediate selection is most informative of the
fitness landscape. We describe the generation of new phenotypes and introduce
an operator that recombines the whole population to generate variants that
preserve normal statistics. Finally, we introduce a proof-of-principle
algorithm that combines natural selection, our recombination operator, and an
adaptive method to increase selection. Our algorithm is similar to covariance
matrix adaptation and natural evolutionary strategies in optimization, and has
similar performance. The algorithm is extremely simple in implementation with
no matrix inversion or factorization, does not require storing a covariance
matrix, and may form the basis of more general model-based optimization
algorithms with natural gradient updates.
Comment: changed title.
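For context, a minimal sketch of a separable natural-evolution-strategy update on a diagonal Gaussian, which, like the algorithm described above, needs no stored covariance matrix and no matrix inversion or factorization; this is a standard textbook-style update used for illustration, not the paper's proposed algorithm.

import numpy as np

rng = np.random.default_rng(0)
dim, pop = 10, 20
mean, log_sigma = np.full(dim, 3.0), np.zeros(dim)
eta_mean, eta_sigma = 1.0, 0.1

def sphere(x):
    return np.sum(x ** 2)

# Rank-based utilities (better samples get larger weight, summing to zero).
ranks = np.arange(1, pop + 1)
u = np.maximum(0.0, np.log(pop / 2 + 1) - np.log(ranks))
u = u / u.sum() - 1.0 / pop

for gen in range(400):
    z = rng.normal(size=(pop, dim))                  # standard normal samples
    x = mean + np.exp(log_sigma) * z                 # candidate solutions
    zs = z[np.argsort([sphere(xi) for xi in x])]     # reorder, best first
    # Natural-gradient updates for the mean and per-coordinate log step sizes.
    mean += eta_mean * np.exp(log_sigma) * (u @ zs)
    log_sigma += eta_sigma * 0.5 * (u @ (zs ** 2 - 1.0))
print(sphere(mean))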