Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles
We present a canonical way to turn any smooth parametric family of
probability distributions on an arbitrary search space $X$ into a
continuous-time black-box optimization method on $X$, the
\emph{information-geometric optimization} (IGO) method. Invariance as a design
principle minimizes the number of arbitrary choices. The resulting \emph{IGO
flow} conducts the natural gradient ascent of an adaptive, time-dependent,
quantile-based transformation of the objective function. It makes no
assumptions on the objective function to be optimized.
The IGO method produces explicit IGO algorithms through time discretization.
It naturally recovers versions of known algorithms and offers a systematic way
to derive new ones. The cross-entropy method is recovered in a particular case,
and can be extended into a smoothed, parametrization-independent maximum
likelihood update (IGO-ML). For Gaussian distributions on $\mathbb{R}^d$, IGO
is related to natural evolution strategies (NES) and recovers a version of the
CMA-ES algorithm. For Bernoulli distributions on $\{0,1\}^d$, we recover the
PBIL algorithm. From restricted Boltzmann machines, we obtain a novel algorithm
for optimization on $\{0,1\}^d$. All these algorithms are unified under a
single information-geometric optimization framework.
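The Bernoulli case mentioned above can be illustrated with a minimal sketch. It assumes the simplest possible weighting, in which only the best sample of each batch receives weight, so that the update reduces to the familiar PBIL rule p <- (1 - step) * p + step * x_best. The function name `igo_bernoulli_step`, the `onemax` objective, and all parameter values are hypothetical choices for illustration and are not taken from the paper.

```python
import random


def igo_bernoulli_step(p, f, n_samples, step, rng, n_selected=1):
    """One rank-weighted update of Bernoulli parameters p (maximizing f).

    Assumption: uniform weights on the n_selected best samples.
    With n_selected=1 this is the PBIL update.
    """
    # Draw bit strings, each bit i being 1 with probability p[i].
    samples = [[1 if rng.random() < pi else 0 for pi in p]
               for _ in range(n_samples)]
    # Only the ranking of the samples is used, never the raw values of f.
    samples.sort(key=f, reverse=True)
    selected = samples[:n_selected]
    w = 1.0 / n_selected
    new_p = []
    for i, pi in enumerate(p):
        # Weighted average of (x_i - p_i) over the selected samples.
        drift = sum(w * (x[i] - pi) for x in selected)
        new_p.append(pi + step * drift)
    return new_p


def onemax(x):
    """Toy objective: number of ones in the bit string."""
    return sum(x)


rng = random.Random(0)
p = [0.5] * 20
for _ in range(200):
    p = igo_bernoulli_step(p, onemax, n_samples=20, step=0.1, rng=rng)
```

Because the update depends on the objective only through the ranking of the samples, replacing `onemax` by any increasing transformation of it leaves the trajectory unchanged for a fixed random seed.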
Thanks to its intrinsic formulation, the IGO method achieves invariance under
reparametrization of the search space $X$, under a change of parameters of the
probability distributions, and under increasing transformations of the
objective function.
Theory strongly suggests that IGO algorithms have minimal loss in diversity
during optimization, provided the initial diversity is high. First experiments
using restricted Boltzmann machines confirm this insight. Thus IGO seems to
provide, from information theory, an elegant way to spontaneously explore
several valleys of a fitness landscape in a single run.
Comment: Final published versio
The CMA Evolution Strategy: A Tutorial
This tutorial introduces the CMA Evolution Strategy (ES), where CMA stands
for Covariance Matrix Adaptation. The CMA-ES is a stochastic, or randomized,
method for real-parameter (continuous domain) optimization of non-linear,
non-convex functions. We try to motivate and derive the algorithm from
intuitive concepts and from requirements of non-linear, non-convex search in
continuous domain.
Comment: ArXiv e-prints, arXiv:1604.xxxx
A Computationally Efficient Limited Memory CMA-ES for Large Scale Optimization
We propose a computationally efficient limited memory Covariance Matrix
Adaptation Evolution Strategy for large scale optimization, which we call the
LM-CMA-ES. The LM-CMA-ES is a stochastic, derivative-free algorithm for
numerical optimization of non-linear, non-convex optimization problems in
continuous domain. Inspired by the limited memory BFGS method of Liu and
Nocedal (1989), the LM-CMA-ES samples candidate solutions according to a
covariance matrix reproduced from direction vectors selected during the
optimization process. The decomposition of the covariance matrix into Cholesky
factors makes it possible to reduce the time and memory complexity of the sampling to
$O(mn)$, where $n$ is the number of decision variables. When $n$ is large
(e.g., $n > 1000$), even relatively small values of $m$ (e.g., $m = 20, 30$) are
sufficient to efficiently solve fully non-separable problems and to reduce the
overall run-time.
Comment: Genetic and Evolutionary Computation Conference (GECCO'2014) (2014
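The idea of sampling from a covariance matrix that is never stored explicitly can be sketched as follows. This is not the LM-CMA-ES update itself: it assumes, purely for illustration, a factor of the form A = F_m ... F_1 with F_j = a*I + b*v_j*v_j^T, built from m stored direction vectors v_j of length n. The names `apply_implicit_factor` and `sample_candidate` and the constants `a` and `b` are hypothetical. The point of the sketch is only that the product A z costs O(mn) time and memory instead of O(n^2).

```python
import random


def apply_implicit_factor(z, directions, a, b):
    """Return A z for A = F_m ... F_1, F_j = a*I + b * v_j v_j^T.

    No n x n matrix is built: each factor costs O(n), so the total is O(m n).
    """
    x = list(z)
    for v in directions:
        dot = sum(vi * xi for vi, xi in zip(v, x))        # v_j . x
        x = [a * xi + b * dot * vi for xi, vi in zip(x, v)]
    return x


def sample_candidate(mean, sigma, directions, a, b, rng):
    """Draw mean + sigma * A z with z ~ N(0, I); the covariance is sigma^2 A A^T."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    az = apply_implicit_factor(z, directions, a, b)
    return [mi + sigma * ai for mi, ai in zip(mean, az)]


rng = random.Random(0)
n, m = 1000, 20
# Placeholder direction vectors; a real method would select them during the run.
directions = [[rng.gauss(0.0, 1.0) / n ** 0.5 for _ in range(n)] for _ in range(m)]
candidate = sample_candidate([0.0] * n, 1.0, directions, a=0.9, b=0.1, rng=rng)
```

With n = 1000 and m = 20, the stored vectors take 20,000 floats, compared with 1,000,000 entries for a dense covariance matrix.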
Variable Metric Random Pursuit
We consider unconstrained randomized optimization of smooth convex objective
functions in the gradient-free setting. We analyze Random Pursuit (RP)
algorithms with fixed (F-RP) and variable metric (V-RP). The algorithms only
use zeroth-order information about the objective function and compute an
approximate solution by repeated optimization over randomly chosen
one-dimensional subspaces. The distribution of search directions is dictated by
the chosen metric.
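A minimal sketch of the fixed-metric variant described above, under stated assumptions: search directions are drawn uniformly from the unit sphere, and the one-dimensional subproblem is solved approximately by golden-section search on a fixed interval. The function names, the interval `radius`, and the test objective are hypothetical choices for illustration. The line-search oracle analyzed in the paper may differ.

```python
import math
import random


def golden_section(phi, lo, hi, tol=1e-8):
    """Approximate minimizer of a unimodal function phi on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = phi(c), phi(d)
    while hi - lo > tol:
        if fc < fd:                      # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = phi(c)
        else:                            # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = phi(d)
    return (lo + hi) / 2.0


def random_pursuit(f, x0, n_iters, rng, radius=10.0):
    """Zeroth-order minimization by line searches along random directions."""
    x, fx = list(x0), f(x0)
    for _ in range(n_iters):
        # Uniform direction on the unit sphere (normalized Gaussian vector).
        g = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(gi * gi for gi in g))
        u = [gi / norm for gi in g]
        t = golden_section(
            lambda s: f([xi + s * ui for xi, ui in zip(x, u)]), -radius, radius)
        cand = [xi + t * ui for xi, ui in zip(x, u)]
        f_cand = f(cand)
        if f_cand < fx:                  # accept only improving steps
            x, fx = cand, f_cand
    return x, fx


def quadratic(x):
    """Convex test function with condition number len(x)."""
    return sum((i + 1) * xi * xi for i, xi in enumerate(x))


x_best, f_best = random_pursuit(quadratic, [1.0] * 5, 300, random.Random(0))
```

A variable-metric version would draw the Gaussian vector from a distribution shaped by an estimated Hessian instead of the identity. That estimation step is not shown here.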
Variable Metric RP uses novel variants of a randomized zeroth-order Hessian
approximation scheme recently introduced by Leventhal and Lewis (D. Leventhal
and A. S. Lewis, Optimization 60(3), 329--345, 2011). We here present (i) a
refined analysis of the expected single step progress of RP algorithms and
their global convergence on (strictly) convex functions and (ii) novel
convergence bounds for V-RP on strongly convex functions. We also quantify how
well the employed metric needs to match the local geometry of the function in
order for the RP algorithms to converge with the best possible rate.
Our theoretical results are accompanied by numerical experiments, comparing
V-RP with the derivative-free schemes CMA-ES, Implicit Filtering, Nelder-Mead,
NEWUOA, Pattern-Search and Nesterov's gradient-free algorithms.
Comment: 42 pages, 6 figures, 15 tables, submitted to journal, Version 3:
majorly revised second part, i.e. Section 5 and Appendi