Natural evolution strategies and variational Monte Carlo
A notion of quantum natural evolution strategies is introduced, which
provides a geometric synthesis of a number of known quantum/classical
algorithms for performing classical black-box optimization. Recent work of
Gomes et al. [2019] on heuristic combinatorial optimization using neural
quantum states is pedagogically reviewed in this context, emphasizing the
connection with natural evolution strategies. The algorithmic framework is
illustrated for approximate combinatorial optimization problems, and a
systematic strategy is found for improving the approximation ratios. In
particular it is found that natural evolution strategies can achieve
approximation ratios competitive with widely used heuristic algorithms for
Max-Cut, at the expense of increased computation time.
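To make the black-box setting concrete, here is a minimal NES sketch for Max-Cut, assuming a product-of-Bernoulli search distribution and simple fitness shaping. All function names, hyperparameters, and the update form are illustrative choices, not the paper's implementation:

```python
import numpy as np

def max_cut_value(x, edges):
    """Cut size of the partition encoded by the 0/1 vector x."""
    return sum(1 for u, v in edges if x[u] != x[v])

def nes_max_cut(edges, n, iters=300, pop=50, lr=0.1, seed=0):
    """NES over a product-of-Bernoulli distribution with parameters p.
    For Bernoulli(p) the Fisher information 1/(p(1-p)) cancels against
    the score (x - p)/(p(1-p)), so the per-sample natural gradient of
    the expected fitness is simply f_centered * (x - p)."""
    rng = np.random.default_rng(seed)
    p = np.full(n, 0.5)
    best, best_val = None, -1.0
    for _ in range(iters):
        xs = (rng.random((pop, n)) < p).astype(int)
        fs = np.array([max_cut_value(x, edges) for x in xs], dtype=float)
        i = int(fs.argmax())
        if fs[i] > best_val:
            best_val, best = float(fs[i]), xs[i].copy()
        fs = (fs - fs.mean()) / (fs.std() + 1e-8)        # fitness shaping
        p += lr * (fs[:, None] * (xs - p)).mean(axis=0)  # natural-gradient step
        p = np.clip(p, 0.05, 0.95)                       # keep some exploration
    return best, best_val
```

On a 4-cycle, the optimal cut alternates the two sides and cuts all four edges; the sketch recovers it quickly because the best-so-far sample is tracked across iterations.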
Exponential Natural Evolution Strategies
The family of natural evolution strategies (NES) offers a principled approach to real-valued evolutionary optimization by following the natural gradient of the expected fitness. Like the well-known CMA-ES, the most competitive algorithm in the field, NES comes with important invariance properties. In this paper, we introduce a number of elegant and efficient improvements of the basic NES algorithm. First, we propose to parameterize the positive definite covariance matrix using the exponential map, which allows the covariance matrix to be updated in a vector space. This new technique makes the algorithm completely invariant under linear transformations of the underlying search space, which was previously achieved only in the limit of small step sizes. Second, we compute all updates in the natural coordinate system, such that the natural gradient coincides with the vanilla gradient. This way we avoid the computation of the inverse Fisher information matrix, which is the main computational bottleneck of the original NES algorithm. Our new algorithm, exponential NES (xNES), is significantly simpler than its predecessors. We show that the various update rules in CMA-ES are closely related to the natural gradient updates of xNES. However, xNES is more principled than CMA-ES, as all the update rules needed for covariance matrix adaptation are derived from a single principle. We empirically assess the performance of the new algorithm on standard benchmark functions.
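The exponential-map idea can be sketched as follows: the covariance factor B is updated multiplicatively through a matrix exponential, so the update lives in a vector space and positive definiteness is automatic. Learning rates and utility weights follow commonly cited xNES defaults, but this is an illustrative reconstruction, not the authors' code:

```python
import numpy as np

def expm_sym(M):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.exp(w)) @ V.T

def xnes(f, mu, sigma=1.0, iters=200, pop=20, seed=0):
    """Minimal xNES sketch for minimization of f."""
    rng = np.random.default_rng(seed)
    d = len(mu)
    B = np.eye(d)
    eta_mu = 1.0
    eta_s = (9.0 + 3.0 * np.log(d)) / (5.0 * d * np.sqrt(d))
    # rank-based utility weights: better samples weigh more, weights sum to 0
    u = np.maximum(0.0, np.log(pop / 2 + 1) - np.log(np.arange(1, pop + 1)))
    u = u / u.sum() - 1.0 / pop
    for _ in range(iters):
        Z = rng.standard_normal((pop, d))
        X = mu + sigma * Z @ B.T
        Z = Z[np.argsort([f(x) for x in X])]          # best first
        G_delta = u @ Z
        G_M = sum(ui * (np.outer(z, z) - np.eye(d)) for ui, z in zip(u, Z))
        mu = mu + eta_mu * sigma * B @ G_delta
        sigma *= np.exp(0.5 * eta_s * np.trace(G_M) / d)
        # trace-free part updates the shape; the exponential map keeps B valid
        B = B @ expm_sym(0.5 * eta_s * (G_M - np.trace(G_M) / d * np.eye(d)))
    return mu
```

Because the update is applied through `expm_sym`, the covariance `sigma**2 * B @ B.T` stays positive definite for any step size, which is the point of the exponential parameterization.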
Natural Evolution Strategies as a Black Box Estimator for Stochastic Variational Inference
Stochastic variational inference and its derivatives in the form of
variational autoencoders enjoy the ability to perform Bayesian inference on
large datasets in an efficient manner. However, performing inference with a VAE
requires a certain design choice (i.e., the reparameterization trick) to allow
unbiased and low variance gradient estimation, restricting the types of models
that can be created. To overcome this challenge, an alternative estimator based
on natural evolution strategies is proposed. This estimator does not make
assumptions about the kind of distributions used, allowing for the creation of
models that would otherwise not have been possible under the VAE framework.
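A minimal sketch of such a black-box gradient estimator for a Gaussian variational parameter: only evaluations of the objective are needed, so it may be non-differentiable and no reparameterization is required. The objective below is an illustrative stand-in, not an ELBO from the paper:

```python
import numpy as np

def nes_gradient(f, mu, sigma=1.0, samples=500, rng=None):
    """Score-function (NES-style) estimate of d/dmu E_{z~N(mu, sigma^2)}[f(z)].
    Uses E[f(z) * (z - mu) / sigma^2]; a mean baseline reduces variance."""
    rng = rng or np.random.default_rng(0)
    z = mu + sigma * rng.standard_normal(samples)
    fz = np.array([f(zi) for zi in z])
    fz = fz - fz.mean()                        # control variate / baseline
    return float((fz * (z - mu) / sigma**2).mean())

# stochastic ascent on E[f] with an integrand that has no usable gradient
f = lambda z: -abs(z - 3.0)                    # not differentiable at z = 3
mu, rng = 0.0, np.random.default_rng(1)
for _ in range(400):
    mu += 0.05 * nes_gradient(f, mu, rng=rng)
```

The reparameterization trick would require differentiating through `f`; the estimator above sidesteps that, at the cost of higher variance, which is the trade-off the abstract alludes to.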
Meta-Learning by the Baldwin Effect
The scope of the Baldwin effect was recently called into question by two
papers that closely examined the seminal work of Hinton and Nowlan. To this
date there has been no demonstration of its necessity in empirically
challenging tasks. Here we show that the Baldwin effect is capable of evolving
few-shot supervised and reinforcement learning mechanisms, by shaping the
hyperparameters and the initial parameters of deep learning algorithms.
Furthermore it can genetically accommodate strong learning biases on the same
set of problems as a recent machine learning algorithm called MAML
("Model-Agnostic Meta-Learning"), which uses second-order gradients instead of evolution
to learn a set of reference parameters (initial weights) that can allow rapid
adaptation to tasks sampled from a distribution. Whilst in simple cases MAML is
more data efficient than the Baldwin effect, the Baldwin effect is more general
in that it does not require gradients to be backpropagated to the reference
parameters or hyperparameters, and permits effectively any number of gradient
updates in the inner loop. The Baldwin effect learns strong learning-dependent
biases, rather than purely genetically accommodating fixed behaviours in a
learning-independent manner.
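The Baldwinian setup described above, evolving initial parameters whose fitness is measured only after an inner loop of gradient learning, can be sketched in a toy one-dimensional form. The task family, hyperparameters, and (1+lambda) selection scheme are illustrative assumptions, not the paper's experiments:

```python
import numpy as np

rng = np.random.default_rng(0)

def inner_loop(w, t, steps=1, lr=0.3):
    """Lifetime learning: a few gradient steps on the task loss (w - t)^2."""
    for _ in range(steps):
        w = w - lr * 2.0 * (w - t)
    return w

def lifetime_fitness(w0, tasks):
    """Baldwinian fitness: loss AFTER learning. The learned weights are
    discarded; only the initial weight w0 is heritable."""
    losses = [(inner_loop(w0, t) - t) ** 2 for t in tasks]
    return -float(np.mean(losses))

# (1+8) evolution of the initial weight only
tasks = rng.uniform(2.0, 4.0, size=20)     # task distribution centred near 3
w0 = -5.0
for _ in range(200):
    children = w0 + 0.3 * rng.standard_normal(8)
    cand = np.append(children, w0)
    fits = [lifetime_fitness(c, tasks) for c in cand]
    w0 = float(cand[int(np.argmax(fits))])
```

With a limited inner loop, post-learning loss is minimized when the initial weight sits near the task mean, so evolution genetically accommodates a good starting point without ever backpropagating into it, which mirrors the contrast with MAML drawn above.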
Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles
We present a canonical way to turn any smooth parametric family of
probability distributions on an arbitrary search space $X$ into a
continuous-time black-box optimization method on $X$, the
\emph{information-geometric optimization} (IGO) method. Invariance as a design
principle minimizes the number of arbitrary choices. The resulting \emph{IGO
flow} conducts the natural gradient ascent of an adaptive, time-dependent,
quantile-based transformation of the objective function. It makes no
assumptions on the objective function to be optimized.
The IGO method produces explicit IGO algorithms through time discretization.
It naturally recovers versions of known algorithms and offers a systematic way
to derive new ones. The cross-entropy method is recovered in a particular case,
and can be extended into a smoothed, parametrization-independent maximum
likelihood update (IGO-ML). For Gaussian distributions on $\mathbb{R}^d$, IGO
is related to natural evolution strategies (NES) and recovers a version of the
CMA-ES algorithm. For Bernoulli distributions on $\{0,1\}^d$, we recover the
PBIL algorithm. From restricted Boltzmann machines, we obtain a novel algorithm
for optimization on $\{0,1\}^d$. All these algorithms are unified under a
single information-geometric optimization framework.
Thanks to its intrinsic formulation, the IGO method achieves invariance under
reparametrization of the search space $X$, under a change of parameters of the
probability distributions, and under increasing transformations of the
objective function.
Theory strongly suggests that IGO algorithms have minimal loss in diversity
during optimization, provided the initial diversity is high. First experiments
using restricted Boltzmann machines confirm this insight. Thus IGO seems to
provide, from information theory, an elegant way to spontaneously explore
several valleys of a fitness landscape in a single run.
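For the Bernoulli case mentioned above, time-discretizing the IGO flow with a quantile-based weighting of samples yields a PBIL-style update. The following sketch is illustrative; the selection quantile and learning rate are arbitrary choices, not values from the paper:

```python
import numpy as np

def igo_bernoulli(f, d, iters=150, pop=40, lr=0.1, seed=0):
    """IGO time discretization for product Bernoulli distributions on {0,1}^d.
    Samples are weighted by a quantile (rank) transform of f, which makes the
    update invariant under increasing transformations of the objective; the
    resulting rule  p <- p + lr * sum_i w_i (x_i - p)  is PBIL-like."""
    rng = np.random.default_rng(seed)
    p = np.full(d, 0.5)
    # quantile weights: the top quarter of the population shares all the weight
    w = np.zeros(pop)
    w[: pop // 4] = 1.0 / (pop // 4)
    for _ in range(iters):
        X = (rng.random((pop, d)) < p).astype(float)
        order = np.argsort([-f(x) for x in X])   # maximization, best first
        p += lr * (w @ (X[order] - p))
        p = np.clip(p, 0.02, 0.98)               # preserve residual diversity
    return p
```

Because only the ranks of the sampled objective values enter the update, replacing `f` with any increasing transformation of `f` leaves the trajectory unchanged, which is the invariance property the abstract emphasizes.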