7,097 research outputs found
When Does a Mixture of Products Contain a Product of Mixtures?
We derive relations between theoretical properties of restricted Boltzmann
machines (RBMs), popular machine learning models which form the building blocks
of deep learning models, and several natural notions from discrete mathematics
and convex geometry. We give implications and equivalences relating
RBM-representable probability distributions, perfectly reconstructible inputs,
Hamming modes, zonotopes and zonosets, point configurations in hyperplane
arrangements, linear threshold codes, and multi-covering numbers of hypercubes.
As a motivating application, we prove results on the relative representational
power of mixtures of product distributions and products of mixtures of pairs of
product distributions (RBMs) that formally justify widely held intuitions about
distributed representations. In particular, we show that representing the
probability distributions obtainable as products of mixtures requires a mixture
of products with an exponentially larger number of parameters.
Comment: 32 pages, 6 figures, 2 tables
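The "product of mixtures" structure described above can be checked numerically. The following is a minimal sketch (not from the paper; the sizes and variable names `W`, `b`, `c` are illustrative) showing that marginalizing out a binary RBM's hidden units turns its distribution over visible states into a product with one two-component mixture factor per hidden unit.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3                      # visible and hidden units
W = rng.normal(size=(n, m))      # visible-hidden interaction weights
b = rng.normal(size=n)           # visible biases
c = rng.normal(size=m)           # hidden biases

def states(k):
    """All binary vectors of length k."""
    return np.array(np.meshgrid(*[[0, 1]] * k)).T.reshape(-1, k)

V, H = states(n), states(m)

# Unnormalized marginal p(v): brute-force sum over all hidden states.
def p_brute(v):
    return sum(np.exp(v @ b + h @ c + v @ W @ h) for h in H)

# Same marginal via the product factorization
#   p(v) ∝ exp(b·v) * prod_j (1 + exp(c_j + v·W[:, j])),
# i.e. each hidden unit j contributes one mixture-of-two factor.
def p_factored(v):
    return np.exp(v @ b) * np.prod(1.0 + np.exp(c + v @ W))

assert all(np.isclose(p_brute(v), p_factored(v)) for v in V)
```

The assertion passes because summing `exp(h_j * (c_j + v·W[:, j]))` over `h_j ∈ {0, 1}` independently for each hidden unit yields exactly the `1 + exp(...)` factors.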
Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units
We generalize recent theoretical work on the minimal number of layers of
narrow deep belief networks that can approximate any probability distribution
on the states of their visible units arbitrarily well. We relax the setting of
binary units (Sutskever and Hinton, 2008; Le Roux and Bengio, 2008, 2010;
Montúfar and Ay, 2011) to units with arbitrary finite state spaces, and the
vanishing approximation error to an arbitrary approximation error tolerance.
For example, we give explicit bounds on the number of layers and the layer
width for which a q-ary deep belief network can approximate any probability
distribution on the states of its visible units without exceeding a prescribed
Kullback-Leibler divergence tolerance. Our analysis covers discrete restricted
Boltzmann machines and naïve Bayes models as special cases.
Comment: 19 pages, 5 figures, 1 table
Geometry and Expressive Power of Conditional Restricted Boltzmann Machines
Conditional restricted Boltzmann machines are undirected stochastic neural
networks with a layer of input and output units connected bipartitely to a
layer of hidden units. These networks define models of conditional probability
distributions on the states of the output units given the states of the input
units, parametrized by interaction weights and biases. We address the
representational power of these models, proving results on their ability to
represent conditional Markov random fields and conditional distributions with
restricted supports, on the minimal size of universal approximators, on the
maximal model approximation errors, and on the dimension of the set of representable
conditional distributions. We contribute new tools for investigating
conditional probability models, which allow us to improve the results that can
be derived from existing work on restricted Boltzmann machine probability
models.
Comment: 30 pages, 5 figures, 1 algorithm
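A conditional RBM as described above can be sketched in a few lines. This is a minimal illustration (not the paper's construction; the names `Wx`, `Wy`, the unit counts, and binary units are assumptions): the hidden units are marginalized out analytically, and normalizing over output states for a fixed input yields the model's conditional distribution.

```python
import numpy as np

rng = np.random.default_rng(1)
nx, ny, nh = 3, 2, 3                 # input, output, hidden units
Wx = rng.normal(size=(nx, nh))       # input-hidden weights
Wy = rng.normal(size=(ny, nh))       # output-hidden weights
b = rng.normal(size=ny)              # output biases
c = rng.normal(size=nh)              # hidden biases

def states(k):
    """All binary vectors of length k."""
    return np.array(np.meshgrid(*[[0, 1]] * k)).T.reshape(-1, k)

# Unnormalized score of output y given input x, with hidden units
# summed out analytically (one 1 + exp(...) factor per hidden unit).
def score(x, y):
    return np.exp(y @ b) * np.prod(1.0 + np.exp(c + x @ Wx + y @ Wy))

# Conditional distribution on outputs: normalize over all y for fixed x.
def p_cond(x):
    s = np.array([score(x, y) for y in states(ny)])
    return s / s.sum()

p = p_cond(np.array([1, 0, 1]))
assert np.isclose(p.sum(), 1.0)      # a valid conditional distribution
```

Note that the input `x` only shifts the effective hidden biases, so the normalization is over output states alone; this is what makes the model a family of conditional distributions parametrized by the input.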