CMA-ES with Two-Point Step-Size Adaptation
We combine a refined version of two-point step-size adaptation with the
covariance matrix adaptation evolution strategy (CMA-ES). Additionally, we
suggest polished formulae for the learning rate of the covariance matrix and
the recombination weights. In contrast to cumulative step-size adaptation or to
the 1/5-th success rule, the refined two-point adaptation (TPA) does not rely
on any internal model of optimality. In contrast to conventional
self-adaptation, the TPA will achieve a better target step-size in particular
with large populations. The disadvantage of TPA is that it relies on two
additional objective functio
The Hessian Estimation Evolution Strategy
We present a novel black box optimization algorithm called Hessian Estimation
Evolution Strategy. The algorithm updates the covariance matrix of its sampling
distribution by directly estimating the curvature of the objective function.
This algorithm design is targeted at twice continuously differentiable
problems. For this, we extend the cumulative step-size adaptation algorithm of
the CMA-ES to mirrored sampling. We demonstrate that our approach to covariance
matrix adaptation is efficient by evaluating it on the BBOB/COCO testbed. We
also show that the algorithm is surprisingly robust when its core assumption of
a twice continuously differentiable objective function is violated. The
approach yields a new evolution strategy with competitive performance, and at
the same time it also offers an interesting alternative to the usual covariance
matrix update mechanism
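As a rough illustration of how a pair of mirrored samples exposes curvature, the sketch below estimates the second directional derivative of an objective from the points m + d and m - d with a central finite difference. It is not the update rule of the paper; the helper `directional_curvature` and the quadratic test function are assumptions made for this example.

```python
def directional_curvature(f, m, d):
    """Central finite-difference estimate of the curvature of f at m along d.

    Uses the mirrored pair m + d and m - d; for a quadratic f the result
    equals d^T H d / ||d||^2 up to rounding, where H is the Hessian.
    """
    plus = [mi + di for mi, di in zip(m, d)]
    minus = [mi - di for mi, di in zip(m, d)]
    norm_sq = sum(di * di for di in d)
    return (f(plus) + f(minus) - 2.0 * f(m)) / norm_sq


def quadratic(x):
    # Ill-conditioned quadratic with Hessian diag(2, 200) (illustrative only).
    return 1.0 * x[0] ** 2 + 100.0 * x[1] ** 2


m = [0.3, -0.7]
curv_x = directional_curvature(quadratic, m, [0.1, 0.0])
curv_y = directional_curvature(quadratic, m, [0.0, 0.1])
print(curv_x, curv_y)
```

On a quadratic the estimate recovers the Hessian entries along the sampled directions, which is the kind of information a curvature-based covariance update could use.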
Adaptive Ranking Based Constraint Handling for Explicitly Constrained Black-Box Optimization
A novel explicit constraint handling technique for the covariance matrix
adaptation evolution strategy (CMA-ES) is proposed. The proposed constraint
handling exhibits two invariance properties. One is the invariance to arbitrary
element-wise increasing transformation of the objective and constraint
functions. The other is the invariance to arbitrary affine transformation of
the search space. The proposed technique virtually transforms a constrained
optimization problem into an unconstrained optimization problem by considering
an adaptive weighted sum of the ranking of the objective function values and
the ranking of the constraint violations that are measured by the Mahalanobis
distance between each candidate solution to its projection onto the boundary of
the constraints. Simulation results are presented and show that the CMA-ES with
the proposed constraint handling exhibits the affine invariance and performs
similarly to the CMA-ES on unconstrained counterparts. Comment: 9 page
Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles
We present a canonical way to turn any smooth parametric family of
probability distributions on an arbitrary search space $X$ into a
continuous-time black-box optimization method on $X$, the
\emph{information-geometric optimization} (IGO) method. Invariance as a design
principle minimizes the number of arbitrary choices. The resulting \emph{IGO
flow} conducts the natural gradient ascent of an adaptive, time-dependent,
quantile-based transformation of the objective function. It makes no
assumptions on the objective function to be optimized.
The IGO method produces explicit IGO algorithms through time discretization.
It naturally recovers versions of known algorithms and offers a systematic way
to derive new ones. The cross-entropy method is recovered in a particular case,
and can be extended into a smoothed, parametrization-independent maximum
likelihood update (IGO-ML). For Gaussian distributions on $\mathbb{R}^d$, IGO
is related to natural evolution strategies (NES) and recovers a version of the
CMA-ES algorithm. For Bernoulli distributions on $\{0,1\}^d$, we recover the
PBIL algorithm. From restricted Boltzmann machines, we obtain a novel algorithm
for optimization on $\{0,1\}^d$. All these algorithms are unified under a
single information-geometric optimization framework.
Thanks to its intrinsic formulation, the IGO method achieves invariance under
reparametrization of the search space $X$, under a change of parameters of the
probability distributions, and under increasing transformations of the
objective function.
Theory strongly suggests that IGO algorithms have minimal loss in diversity
during optimization, provided the initial diversity is high. First experiments
using restricted Boltzmann machines confirm this insight. Thus IGO seems to
provide, from information theory, an elegant way to spontaneously explore
several valleys of a fitness landscape in a single run. Comment: Final published versio
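To make the Bernoulli case concrete, here is a minimal sketch of a PBIL-style update on bit strings: sample from independent Bernoulli distributions, rank the samples by objective value, and move the probability vector toward the best ones. The truncation weights, the step size `dt`, the OneMax objective and all function names are assumptions chosen for illustration, not settings taken from the paper.

```python
import random


def igo_bernoulli_step(p, f, n_samples, n_selected, dt, rng):
    """One PBIL-style update of the Bernoulli parameters p (maximization).

    Only the ranking of the samples is used, so the step is unchanged under
    strictly increasing transformations of f.
    """
    samples = [[1 if rng.random() < pi else 0 for pi in p]
               for _ in range(n_samples)]
    samples.sort(key=f, reverse=True)   # best samples first
    best = samples[:n_selected]
    w = 1.0 / n_selected                # equal weights on the selected samples
    return [pi + dt * sum(w * (x[i] - pi) for x in best)
            for i, pi in enumerate(p)]


def onemax(x):
    # Illustrative objective: number of ones in the bit string.
    return sum(x)


rng = random.Random(0)
p = [0.5] * 10
for _ in range(200):
    p = igo_bernoulli_step(p, onemax, n_samples=20, n_selected=5, dt=0.1, rng=rng)
print([round(pi, 2) for pi in p])
```

Because each new probability is a convex combination of the old value and the mean of the selected bits, the parameters stay in [0, 1] as long as `dt` is at most 1.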
Model-based relative entropy stochastic search
Stochastic search algorithms are general black-box optimizers. Due to their ease
of use and their generality, they have recently also gained a lot of attention in operations
research, machine learning and policy search. Yet, these algorithms require
a lot of evaluations of the objective, scale poorly with the problem dimension, are
affected by highly noisy objective functions and may converge prematurely. To
alleviate these problems, we introduce a new surrogate-based stochastic search
approach. We learn simple, quadratic surrogate models of the objective function.
As the quality of such a quadratic approximation is limited, we do not greedily exploit
the learned models. The algorithm can be misled by an inaccurate optimum
introduced by the surrogate. Instead, we use information theoretic constraints to
bound the ‘distance’ between the new and old data distribution while maximizing
the objective function. Additionally, the new method is able to sustain the exploration
of the search distribution to avoid premature convergence. We compare our
method with state-of-the-art black-box optimization methods on standard uni-modal
and multi-modal optimization functions, on simulated planar robot tasks and a
complex robot ball throwing task. The proposed method considerably outperforms
the existing approaches
Linear Convergence of Comparison-based Step-size Adaptive Randomized Search via Stability of Markov Chains
In this paper, we consider comparison-based adaptive stochastic algorithms
for solving numerical optimisation problems. We consider a specific subclass of
algorithms that we call comparison-based step-size adaptive randomized search
(CB-SARS), where the state variables at a given iteration are a vector of the
search space and a positive parameter, the step-size, typically controlling the
overall standard deviation of the underlying search distribution. We investigate
the linear convergence of CB-SARS on \emph{scaling-invariant} objective
functions. Scaling-invariant functions preserve the ordering of points with
respect to their function value when the points are scaled with the same
positive parameter (the scaling is done w.r.t. a fixed reference point). This
class of functions includes norms composed with strictly increasing functions
as well as many non-quasi-convex and non-continuous functions. On
scaling-invariant functions, we show the existence of a homogeneous Markov
chain, as a consequence of natural invariance properties of CB-SARS (essentially
scale-invariance and invariance to strictly increasing transformation of the
objective function). We then derive sufficient conditions for \emph{global
linear convergence} of CB-SARS, expressed in terms of different stability
conditions of the normalised homogeneous Markov chain (irreducibility,
positivity, Harris recurrence, geometric ergodicity) and thus define a general
methodology for proving global linear convergence of CB-SARS algorithms
on scaling-invariant functions. As a by-product we provide a connexion between
comparison-based adaptive stochastic algorithms and Markov chain Monte Carlo
algorithms. Comment: SIAM Journal on Optimization, Society for Industrial and Applied
Mathematics, 201
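A simple member of this class of algorithms is a (1+1) evolution strategy with a one-fifth success rule, sketched below on the sphere function. The state is the pair (x, sigma), and the objective is used only through the comparison `fy <= fx`. The step-size factors, the starting point and the function names are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def one_plus_one_es(f, x, sigma, iterations, rng):
    """(1+1)-ES with a one-fifth success rule; returns the final state and
    the history of distances to the origin."""
    fx = f(x)
    history = []
    for _ in range(iterations):
        y = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        fy = f(y)
        if fy <= fx:                        # comparison is the only use of f
            x, fx = y, fy
            sigma *= math.exp(1.0 / 3.0)    # success: enlarge the step-size
        else:
            sigma *= math.exp(-1.0 / 12.0)  # failure: shrink the step-size
        history.append(math.sqrt(sum(xi * xi for xi in x)))
    return x, sigma, history


def sphere(x):
    # Scaling-invariant test function (squared Euclidean norm).
    return sum(xi * xi for xi in x)


rng = random.Random(1)
x_final, sigma_final, history = one_plus_one_es(
    sphere, [1.0] * 5, sigma=1.0, iterations=1000, rng=rng)
print(history[0], history[-1], sigma_final)
```

The two factors balance at a success rate of one fifth. Plotting the logarithm of `history` against the iteration count should show a roughly straight descending line, which is what linear convergence looks like in practice.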
Efficient Covariance Matrix Update for Variable Metric Evolution Strategies
Randomized direct search algorithms for continuous domains, such as Evolution Strategies, are basic tools in machine learning. They are especially needed when the gradient of an objective function (e.g., loss, energy, or reward function) cannot be computed or estimated efficiently. Application areas include supervised and reinforcement learning as well as model selection. These randomized search strategies often rely on normally distributed additive variations of candidate solutions. In order to efficiently search in non-separable and ill-conditioned landscapes the covariance matrix of the normal distribution must be adapted, amounting to a variable metric method. Consequently, Covariance Matrix Adaptation (CMA) is considered state-of-the-art in Evolution Strategies. In order to sample the normal distribution, the adapted covariance matrix needs to be decomposed, requiring in general $\Theta(n^3)$ operations, where $n$ is the search space dimension. We propose a new update mechanism which can replace a rank-one covariance matrix update and the computationally expensive decomposition of the covariance matrix. The newly developed update rule reduces the computational complexity of the rank-one covariance matrix adaptation to $\Theta(n^2)$ without resorting to outdated distributions. We derive new versions of the elitist Covariance Matrix Adaptation Evolution Strategy (CMA-ES) and the multi-objective CMA-ES. These algorithms are equivalent to the original procedures except that the update step for the variable metric distribution scales better in the problem dimension. We also introduce a simplified variant of the non-elitist CMA-ES with the incremental covariance matrix update and investigate its performance. Apart from the reduced time-complexity of the distribution update, the algebraic computations involved in all new algorithms are simpler compared to the original versions.
The new update rule improves the performance of the CMA-ES for large scale machine learning problems in which the objective function can be evaluated fast
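The idea of updating a factor of the covariance matrix directly, instead of decomposing the matrix after each update, can be sketched as follows. Assuming a rank-one update of the form C' = alpha * C + beta * v v^T with C = A A^T and v = A z, the factor A' computed below satisfies A' A'^T = C' using only quadratic-cost operations. The function names and test values are assumptions, and the sketch is not necessarily the exact formulation used in the paper.

```python
import math


def matvec(A, z):
    """Matrix-vector product A z for a list-of-rows matrix."""
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]


def gram(A):
    """Return A A^T."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in A] for ri in A]


def rank_one_factor_update(A, z, alpha, beta):
    """Return A' with A' A'^T = alpha * A A^T + beta * (A z)(A z)^T.

    Requires alpha > 0 and z != 0; costs O(n^2) for an n x n factor.
    """
    n = len(A)
    v = matvec(A, z)
    z_norm_sq = sum(zj * zj for zj in z)
    c = (math.sqrt(1.0 + (beta / alpha) * z_norm_sq) - 1.0) / z_norm_sq
    root_alpha = math.sqrt(alpha)
    return [[root_alpha * (A[i][j] + c * v[i] * z[j]) for j in range(n)]
            for i in range(n)]


# Check the identity numerically on a small example.
A = [[2.0, 0.0], [1.0, 3.0]]
z = [0.5, -1.5]
alpha, beta = 0.9, 0.1
A_new = rank_one_factor_update(A, z, alpha, beta)
v = matvec(A, z)
C = gram(A)
expected = [[alpha * C[i][j] + beta * v[i] * v[j] for j in range(2)]
            for i in range(2)]
actual = gram(A_new)
max_error = max(abs(actual[i][j] - expected[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```

New candidate solutions can then be sampled as m + sigma * A' z' with standard normal z', so no decomposition of C' is needed.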
Escaping local minima with derivative-free methods: a numerical investigation
We apply a state-of-the-art, local derivative-free solver, Py-BOBYQA, to
global optimization problems, and propose an algorithmic improvement that is
beneficial in this context. Our numerical findings are illustrated on a
commonly-used but small-scale test set of global optimization problems and
associated noisy variants, and on hyperparameter tuning for the machine
learning test set MNIST. As Py-BOBYQA is a model-based trust-region method, we
compare mostly (but not exclusively) with other global optimization methods for
which (global) models are important, such as Bayesian optimization and response
surface methods; we also consider state-of-the-art representative deterministic
and stochastic codes, such as DIRECT and CMA-ES. As a heuristic for escaping
local minima, we find numerically that Py-BOBYQA is competitive with global
optimization solvers for all accuracy/budget regimes, in both smooth and noisy
settings. In particular, Py-BOBYQA variants are best performing for smooth and
multiplicative noise problems in high-accuracy regimes. As a by-product, some
preliminary conclusions can be drawn on the relative performance of the global
solvers we have tested with default settings