Natural evolution strategies and variational Monte Carlo
A notion of quantum natural evolution strategies is introduced, which
provides a geometric synthesis of a number of known quantum/classical
algorithms for performing classical black-box optimization. Recent work of
Gomes et al. [2019] on heuristic combinatorial optimization using neural
quantum states is pedagogically reviewed in this context, emphasizing the
connection with natural evolution strategies. The algorithmic framework is
illustrated for approximate combinatorial optimization problems, and a
systematic strategy is found for improving the approximation ratios. In
particular, it is found that natural evolution strategies can achieve approximation ratios competitive with widely used heuristic algorithms for Max-Cut, at the expense of increased computation time.
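To make the black-box optimization loop concrete, here is a minimal sketch of natural evolution strategies for Max-Cut. It assumes an independent-Bernoulli (mean-field) search distribution over cut assignments rather than the neural-quantum-state parametrization reviewed in the paper, and the graph, population size, and learning rate are illustrative choices.

# Minimal NES sketch for Max-Cut with a mean-field search distribution.
import numpy as np

def cut_value(adj, x):
    # Number of edges of the (symmetric) adjacency matrix crossing the cut x.
    return 0.5 * np.sum(adj * (x[:, None] != x[None, :]))

def nes_maxcut(adj, iters=300, pop=64, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    theta = np.zeros(n)                          # logits of P(x_i = 1)
    best_x, best_val = None, -np.inf
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-theta))         # Bernoulli means
        xs = (rng.random((pop, n)) < p).astype(float)
        vals = np.array([cut_value(adj, x) for x in xs])
        if vals.max() > best_val:
            best_val, best_x = vals.max(), xs[vals.argmax()]
        adv = (vals - vals.mean()) / (vals.std() + 1e-8)   # variance reduction
        # Score-function estimate of the gradient of E[f(x)] w.r.t. the logits;
        # grad log pi_theta(x) = x - p for independent Bernoullis.
        grad = (adv[:, None] * (xs - p)).mean(axis=0)
        # Natural-gradient step: precondition by the inverse of the Fisher
        # matrix, which is diag(p(1-p)) in this parametrization.
        theta += lr * grad / (p * (1 - p) + 1e-8)
    return best_x, best_val

# Example: Max-Cut on a 6-node ring (the optimal cut value is 6).
A = np.zeros((6, 6))
for i in range(6):
    A[i, (i + 1) % 6] = A[(i + 1) % 6, i] = 1.0
print(nes_maxcut(A))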
Mixtures and products in two graphical models
We compare two statistical models of three binary random variables. One is a
mixture model and the other is a product of mixtures model called a restricted
Boltzmann machine. Although the parametrizations of the two models look different, we show that they represent the same set of distributions on the interior of the probability simplex, and are equal up to closure. We give a
semi-algebraic description of the model in terms of six binomial inequalities
and obtain closed form expressions for the maximum likelihood estimates. We
briefly discuss extensions to larger models.
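For orientation, these are the general forms of the two kinds of parametrization being compared, written here for $n$ binary visible variables in our notation (the specific component and unit counts studied in the paper are not restated). A mixture of $m$ product distributions is
$p_{\mathrm{mix}}(v) = \sum_{k=1}^{m} \lambda_k \prod_{i=1}^{n} p_{k,i}^{v_i} (1-p_{k,i})^{1-v_i}$ with $\lambda_k \ge 0$ and $\sum_k \lambda_k = 1$,
while a restricted Boltzmann machine with $m$ hidden units has visible marginal
$p_{\mathrm{RBM}}(v) \propto \sum_{h \in \{0,1\}^m} \exp(v^\top W h + b^\top v + c^\top h) = e^{b^\top v} \prod_{j=1}^{m} \bigl(1 + e^{c_j + (W^\top v)_j}\bigr)$,
i.e., a renormalized product of $m$ mixtures, one factor per hidden unit.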
Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units
We generalize recent theoretical work on the minimal number of layers of
narrow deep belief networks that can approximate any probability distribution
on the states of their visible units arbitrarily well. We relax the setting of
binary units (Sutskever and Hinton, 2008; Le Roux and Bengio, 2008, 2010; Montúfar and Ay, 2011) to units with arbitrary finite state spaces, and the vanishing approximation error to an arbitrary approximation error tolerance. For example, we show that a $q$-ary deep belief network with a sufficient number of layers of small width can approximate any probability distribution on $\{0,\dots,q-1\}^n$ without exceeding a Kullback-Leibler divergence of $\delta$, with explicit bounds on the required depth and width in terms of $q$, $n$, and $\delta$. Our analysis covers discrete restricted Boltzmann machines and naïve Bayes models as special cases.
Scaling of Model Approximation Errors and Expected Entropy Distances
We compute the expected value of the Kullback-Leibler divergence to various
fundamental statistical models with respect to canonical priors on the
probability simplex. We obtain closed formulas for the expected model
approximation errors, depending on the dimension of the models and the
cardinalities of their sample spaces. For the uniform prior, the expected
divergence from any model containing the uniform distribution is bounded by a
constant, and for the models that we consider, this bound is
approached if the state space is very large and the models' dimension does not
grow too fast. For Dirichlet priors the expected divergence is bounded in a
similar way, if the concentration parameters take reasonable values. These
results serve as reference values for more complicated statistical models.
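As a rough illustration of the quantity involved, here is a minimal Monte Carlo sketch (not from the paper) that estimates the expected Kullback-Leibler divergence from a Dirichlet-distributed random distribution to the uniform distribution; since the uniform distribution lies in all of the models discussed above, this upper-bounds the expected divergence to any such model. The state-space size, concentration parameter, and sample size are illustrative choices.

# Monte Carlo estimate of E[D(p || uniform)] under a Dirichlet prior.
import numpy as np

def expected_kl_to_uniform(n_states=100, alpha=1.0, n_samples=20000, seed=0):
    rng = np.random.default_rng(seed)
    u = np.full(n_states, 1.0 / n_states)
    ps = rng.dirichlet(np.full(n_states, alpha), size=n_samples)
    kl = np.sum(ps * np.log(ps / u), axis=1)   # Dirichlet samples are a.s. > 0
    return kl.mean()

# alpha = 1 is the uniform prior on the simplex; the estimate stays bounded
# by a constant as n_states grows.
print(expected_kl_to_uniform())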
Mixture decompositions of exponential families using a decomposition of their sample spaces
We study the problem of finding the smallest number $m$ such that every element of an exponential family can be written as a mixture of $m$ elements of another
exponential family. We propose an approach based on coverings and packings of
the face lattice of the corresponding convex support polytopes and results from
coding theory. We show that $m = q^{N-1}$ is the smallest number for which any distribution of $N$ $q$-ary variables can be written as a mixture of $m$ independent distributions of $q$-ary variables. Furthermore, we give a corresponding bound on the number of elements of the $k$-interaction exponential family needed to represent any distribution of $N$ binary variables.
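To illustrate the upper-bound direction of such statements, here is a standard construction in our notation (not necessarily the paper's argument): any joint distribution of $N$ $q$-ary variables can be sliced along its first $N-1$ coordinates,
$p(x_1,\dots,x_N) = \sum_{y \in \{0,\dots,q-1\}^{N-1}} \Pr(x_{1:N-1} = y)\, \bigl(\prod_{i=1}^{N-1} \delta_{y_i}(x_i)\bigr)\, p(x_N \mid y)$,
and each of the $q^{N-1}$ summands is a product distribution (point masses on the first $N-1$ variables times an arbitrary factor on the last), so $q^{N-1}$ mixture components always suffice. The matching lower bound and the refinements for general exponential families rely on the coverings and packings mentioned above.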
When Does a Mixture of Products Contain a Product of Mixtures?
We derive relations between theoretical properties of restricted Boltzmann
machines (RBMs), popular machine learning models which form the building blocks
of deep learning models, and several natural notions from discrete mathematics
and convex geometry. We give implications and equivalences relating
RBM-representable probability distributions, perfectly reconstructible inputs,
Hamming modes, zonotopes and zonosets, point configurations in hyperplane
arrangements, linear threshold codes, and multi-covering numbers of hypercubes.
As a motivating application, we prove results on the relative representational
power of mixtures of product distributions and products of mixtures of pairs of
product distributions (RBMs) that formally justify widely held intuitions about
distributed representations. In particular, we show that a mixture of products needs an exponentially larger number of parameters in order to represent the probability distributions that can be obtained as products of mixtures.
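For context (our notation; the precise separation proved in the paper is not restated here): a mixture of $m$ product distributions of $n$ binary variables has $m(n+1) - 1$ free parameters, while an RBM with $n$ visible and $m$ hidden units, i.e., a product of $m$ mixtures of pairs of products, has $nm + n + m$ parameters. Both counts grow only linearly in $m$, so an exponential gap in the number of mixture components required translates into an exponential gap in the number of parameters.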