Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles

Author: Arnold Ludovic
Auger Anne
Hansen Nikolaus
Ollivier Yann
Publication date: 16/08/2016

We present a canonical way to turn any smooth parametric family of probability distributions on an arbitrary search space $X$ into a continuous-time black-box optimization method on $X$, the information-geometric optimization (IGO) method. Invariance as a design principle minimizes the number of arbitrary choices. The resulting IGO flow conducts the natural gradient ascent of an adaptive, time-dependent, quantile-based transformation of the objective function. It makes no assumptions on the objective function to be optimized. The IGO method produces explicit IGO algorithms through time discretization. It naturally recovers versions of known algorithms and offers a systematic way to derive new ones. The cross-entropy method is recovered in a particular case, and can be extended into a smoothed, parametrization-independent maximum likelihood update (IGO-ML). For Gaussian distributions on $\mathbb{R}^d$, IGO is related to natural evolution strategies (NES) and recovers a version of the CMA-ES algorithm. For Bernoulli distributions on $\{0,1\}^d$, we recover the PBIL algorithm. From restricted Boltzmann machines, we obtain a novel algorithm for optimization on $\{0,1\}^d$. All these algorithms are unified under a single information-geometric optimization framework. Thanks to its intrinsic formulation, the IGO method achieves invariance under reparametrization of the search space $X$, under a change of parameters of the probability distributions, and under increasing transformations of the objective function. Theory strongly suggests that IGO algorithms have minimal loss in diversity during optimization, provided the initial diversity is high. First experiments using restricted Boltzmann machines confirm this insight. Thus IGO seems to provide, from information theory, an elegant way to spontaneously explore several valleys of a fitness landscape in a single run.

Comment: Final published version
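
The Bernoulli case mentioned in the abstract can be illustrated with a minimal sketch of a PBIL-style update on $\{0,1\}^d$. It is not the authors' reference implementation. The toy objective `onemax`, the function names, the uniform weighting of the best quarter of the samples, the step size, and the clamping constant `eps` are all assumptions made for illustration. The update moves each parameter along a weighted average of `x_i - theta_i`, which is the natural-gradient direction of the log-likelihood for independent Bernoulli variables, and it uses only the ranks of the objective values.

```python
import random


def onemax(x):
    """Toy objective (assumed for illustration): number of ones, to be maximized."""
    return sum(x)


def igo_bernoulli_step(theta, f, n_samples, step, rng, top_frac=0.25, eps=0.05):
    """One PBIL-style update of the Bernoulli parameters theta."""
    # Draw candidates from the product of Bernoulli(theta_i) distributions.
    samples = [[1 if rng.random() < p else 0 for p in theta]
               for _ in range(n_samples)]
    # Rank candidates by objective value, best first. Only ranks are used,
    # so an increasing transformation of f gives the same update.
    order = sorted(range(n_samples), key=lambda k: f(samples[k]), reverse=True)
    n_top = max(1, int(top_frac * n_samples))
    # Assumed weighting scheme: uniform weight on the best top_frac of samples.
    weights = [0.0] * n_samples
    for k in order[:n_top]:
        weights[k] = 1.0 / n_top
    new_theta = []
    for i, p in enumerate(theta):
        # Weighted average of (x_i - theta_i) over the sampled candidates.
        direction = sum(w * (x[i] - p) for w, x in zip(weights, samples))
        q = p + step * direction
        # Clamping away from 0 and 1 is an assumption to keep sampling diverse.
        new_theta.append(min(1.0 - eps, max(eps, q)))
    return new_theta


def optimize(dim=10, n_samples=20, step=0.1, n_steps=300, seed=0):
    """Run the update repeatedly from the uniform distribution on {0,1}^dim."""
    rng = random.Random(seed)
    theta = [0.5] * dim
    for _ in range(n_steps):
        theta = igo_bernoulli_step(theta, onemax, n_samples, step, rng)
    return theta
```

Calling `optimize()` returns the final parameter vector. On this toy objective the entries are expected to drift toward the upper clamp, since every bit contributes positively to `onemax`.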

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

HAL-Polytechnique

HAL-Rennes 1

Denoising Autoencoders for fast Combinatorial Black Box Optimization

Author: Bengio Y.
Deb K.
Krizhevsky A.
Larrañaga P.
Miller B. L.
Pelikan M.
Watson R. A.
Publication date: 21/09/2015

Estimation of Distribution Algorithms (EDAs) require flexible probability models that can be efficiently learned and sampled. Autoencoders (AE) are generative stochastic networks with these desired properties. We integrate a special type of AE, the Denoising Autoencoder (DAE), into an EDA and evaluate the performance of DAE-EDA on several combinatorial optimization problems with a single objective. We assess the number of fitness evaluations as well as the required CPU times. We compare the results to the performance of the Bayesian Optimization Algorithm (BOA) and RBM-EDA, another EDA that is based on a generative neural network and has proven competitive with BOA. For the considered problem instances, DAE-EDA is considerably faster than BOA and RBM-EDA, sometimes by orders of magnitude. The number of fitness evaluations is higher than for BOA, but competitive with RBM-EDA. These results show that DAEs can be useful tools for problems with low but non-negligible fitness evaluation costs.

Comment: corrected typos and small inconsistencies

arXiv.org e-Print Archive

Crossref

Learning Dynamic Boltzmann Distributions as Reduced Models of Spatial Chemical Kinetics

Author: Bartol Thomas
Ernst Oliver K.
Mjolsness Eric
Sejnowski Terrence
Publication venue: 'AIP Publishing'
Publication date: 02/03/2018

Finding reduced models of spatially-distributed chemical reaction networks requires an estimation of which effective dynamics are relevant. We propose a machine learning approach to this coarse graining problem, where a maximum entropy approximation is constructed that evolves slowly in time. The dynamical model governing the approximation is expressed as a functional, allowing a general treatment of spatial interactions. In contrast to typical machine learning approaches which estimate the interaction parameters of a graphical model, we derive Boltzmann-machine like learning algorithms to estimate directly the functionals dictating the time evolution of these parameters. By incorporating analytic solutions from simple reaction motifs, an efficient simulation method is demonstrated for systems ranging from toy problems to basic biologically relevant networks. The broadly applicable nature of our approach to learning spatial dynamics suggests promising applications to multiscale methods for spatial networks, as well as to further problems in machine learning.

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Riemann-Theta Boltzmann Machine

Author: Carrazza Stefano
Haghighat Babak
Kahlen Jens
Krefl Daniel
Publication venue: 'Elsevier BV'
Publication date: 13/01/2020

A general Boltzmann machine with continuous visible and discrete integer valued hidden states is introduced. Under mild assumptions about the connection matrices, the probability density function of the visible units can be solved for analytically, yielding a novel parametric density function involving a ratio of Riemann-Theta functions. The conditional expectation of a hidden state for given visible states can also be calculated analytically, yielding a derivative of the logarithmic Riemann-Theta function. The conditional expectation can be used as activation function in a feedforward neural network, thereby increasing the modelling capacity of the network. Both the Boltzmann machine and the derived feedforward neural network can be successfully trained via standard gradient- and non-gradient-based optimization techniques.

Comment: 29 pages, 11 figures, final version published in Neurocomputing

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

q-Gaussian based Smoothed Functional Algorithm for Stochastic Optimization

Author: Bhatnagar Shalabh
Dukkipati Ambedkar
Ghoshdastidar Debarghya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

The q-Gaussian distribution results from maximizing certain generalizations of Shannon entropy under some constraints. The importance of q-Gaussian distributions stems from the fact that they exhibit power-law behavior, and also generalize Gaussian distributions. In this paper, we propose a Smoothed Functional (SF) scheme for gradient estimation using q-Gaussian distribution, and also propose an algorithm for optimization based on the above scheme. Convergence results of the algorithm are presented. Performance of the proposed algorithm is shown by simulation results on a queuing model.

Comment: 5 pages, 1 figure
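
To give a feel for smoothed-functional gradient estimation, here is a minimal sketch that uses ordinary Gaussian perturbations, the special case that q-Gaussians generalize according to the abstract. It is not the q-Gaussian scheme proposed in the paper. The two-sided difference form, the function names `sf_gradient` and `sphere`, the noise-free test function, and the parameter values are assumptions chosen for illustration.

```python
import random


def sf_gradient(f, theta, beta, n_samples, rng):
    """Two-sided Gaussian smoothed-functional estimate of the gradient of f at theta."""
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(n_samples):
        # Standard Gaussian perturbation direction (the paper uses q-Gaussians).
        eta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = f([t + beta * e for t, e in zip(theta, eta)])
        minus = f([t - beta * e for t, e in zip(theta, eta)])
        # Finite difference along eta, scaled by the smoothing parameter beta.
        scale = (plus - minus) / (2.0 * beta)
        for i in range(dim):
            grad[i] += eta[i] * scale
    # Average over perturbations to approximate the smoothed gradient.
    return [g / n_samples for g in grad]


def sphere(x):
    """Toy objective (assumed for illustration): sum of squares, gradient 2*x."""
    return sum(v * v for v in x)
```

For example, `sf_gradient(sphere, [1.0, -2.0], 0.1, 20000, random.Random(0))` should return values close to the exact gradient `[2.0, -4.0]`, with the error shrinking as `n_samples` grows.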

arXiv.org e-Print Archive

Open Access Repository of IISc Research Publications