Cardinality Restricted Boltzmann Machines
The Restricted Boltzmann Machine (RBM) is a popular density model that is also good for extracting features. A main source of tractability in RBM models is that, given an input, the posterior distribution over hidden variables is factorizable and can be easily computed and sampled from. Sparsity and competition in the hidden representation are beneficial, and while an RBM with competition among its hidden units would acquire some of the attractive properties of sparse coding, such constraints are typically not added, as the resulting posterior over the hidden units seemingly becomes intractable. In this paper we show that a dynamic programming algorithm can be used to implement exact sparsity in the RBM’s hidden units. We also show how to pass derivatives through the resulting posterior marginals, which makes it possible to fine-tune a pre-trained neural network with sparse hidden layers. Engineering and Applied Science
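To make the contrast in this abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the sparsity constraint is "at most k hidden units active" and uses a generic count-based dynamic program to get exact posterior marginals under that constraint. The function names (`hidden_inputs`, `count_tables`, `cardinality_marginals`) are hypothetical, and no numerical stabilisation (for example, working in log space) is attempted.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_inputs(v, W, c):
    """Total input a_j = c_j + sum_i W[j][i] * v[i] to each hidden unit.
    In a plain RBM the posterior factorizes: p(h_j = 1 | v) = sigmoid(a_j)."""
    return [c_j + sum(w * x for w, x in zip(row, v)) for row, c_j in zip(W, c)]

def count_tables(weights, k):
    """tables[m][s] = total weight of settings of the first m units
    with exactly s units on (0 <= s <= k). An 'on' unit contributes its
    weight, an 'off' unit contributes 1."""
    tables = [[1.0] + [0.0] * k]
    for w in weights:
        prev = tables[-1]
        cur = [prev[s] + (w * prev[s - 1] if s > 0 else 0.0)
               for s in range(k + 1)]
        tables.append(cur)
    return tables

def cardinality_marginals(a, k):
    """Exact p(h_j = 1 | v, at most k units on), given hidden inputs a.
    Assumed constraint for illustration: sum_j h_j <= k."""
    n = len(a)
    w = [math.exp(x) for x in a]
    fwd = count_tables(w, k)        # units before position j
    bwd = count_tables(w[::-1], k)  # bwd[m]: the last m units
    z = sum(fwd[n])                 # normaliser over all allowed counts
    marginals = []
    for j in range(n):
        left, right = fwd[j], bwd[n - 1 - j]
        total = 0.0
        for s in range(k):          # s units on before j
            for t in range(k - s):  # t units on after j, s + t + 1 <= k
                total += left[s] * right[t]
        marginals.append(w[j] * total / z)
    return marginals
```

When k equals the number of hidden units the constraint is inactive and the marginals reduce to the usual factorized sigmoids; for smaller k the marginals sum to at most k, which is the competition effect the abstract describes.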
Refinements of Universal Approximation Results for Deep Belief Networks and Restricted Boltzmann Machines
We improve recently published results about resources of Restricted Boltzmann
Machines (RBM) and Deep Belief Networks (DBN) required to make them Universal
Approximators. We show that any distribution p on the set of binary vectors of
length n can be arbitrarily well approximated by an RBM with k-1 hidden units,
where k is the minimal number of pairs of binary vectors differing in only one
entry such that their union contains the support set of p. In important cases
this number is half of the cardinality of the support set of p. We construct a
DBN with 2^n/(2(n-b)), b ~ log(n), hidden layers of width n that is capable of
approximating any distribution on {0,1}^n arbitrarily well. This confirms a
conjecture presented by Le Roux and Bengio (2010).
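As a worked instance of the "half of the cardinality" case, assume for illustration that p has full support on {0,1}^n. The 2^n binary vectors can be partitioned into pairs that differ only in the first entry, and no covering can use fewer pairs since each pair contains two vectors. Under that assumption the bound stated in the abstract reads:

```latex
\[
|\operatorname{supp}(p)| = 2^{n}, \qquad
k = \frac{2^{n}}{2} = 2^{n-1}, \qquad
\text{hidden units} = k - 1 = 2^{n-1} - 1
\quad (\text{e.g. } n = 4:\ 7).
\]
```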
Hierarchical Models as Marginals of Hierarchical Models
We investigate the representation of hierarchical models in terms of
marginals of other hierarchical models with smaller interactions. We focus on
binary variables and marginals of pairwise interaction models whose hidden
variables are conditionally independent given the visible variables. In this
case the problem is equivalent to the representation of linear subspaces of
polynomials by feedforward neural networks with soft-plus computational units.
We show that every hidden variable can freely model multiple interactions among
the visible variables, which allows us to generalize and improve previous
results. In particular, we show that a restricted Boltzmann machine with less
than hidden binary variables can approximate
every distribution of visible binary variables arbitrarily well, compared
to from the best previously known result. Comment: 18 pages, 4 figures, 2 tables, WUPES'1
Geometry and Expressive Power of Conditional Restricted Boltzmann Machines
Conditional restricted Boltzmann machines are undirected stochastic neural
networks with a layer of input and output units connected bipartitely to a
layer of hidden units. These networks define models of conditional probability
distributions on the states of the output units given the states of the input
units, parametrized by interaction weights and biases. We address the
representational power of these models, proving results on their ability to
represent conditional Markov random fields and conditional distributions with
restricted supports, the minimal size of universal approximators, the maximal
model approximation errors, and on the dimension of the set of representable
conditional distributions. We contribute new tools for investigating
conditional probability models, which allow us to improve the results that can
be derived from existing work on restricted Boltzmann machine probability
models. Comment: 30 pages, 5 figures, 1 algorithm
A Theory of Cheap Control in Embodied Systems
We present a framework for designing cheap control architectures for embodied
agents. Our derivation is guided by the classical problem of universal
approximation, whereby we explore the possibility of exploiting the agent's
embodiment for a new and more efficient universal approximation of behaviors
generated by sensorimotor control. This embodied universal approximation is
compared with the classical non-embodied universal approximation. To exemplify
our approach, we present a detailed quantitative case study for policy models
defined in terms of conditional restricted Boltzmann machines. In contrast to
non-embodied universal approximation, which requires an exponential number of
parameters, in the embodied setting we are able to generate all possible
behaviors with a drastically smaller model, thus obtaining cheap universal
approximation. We test and corroborate the theory experimentally with a
six-legged walking machine. The experiments show that the sufficient controller
complexity predicted by our theory is tight, which means that the theory has
direct practical implications. Keywords: cheap design, embodiment, sensorimotor
loop, universal approximation, conditional restricted Boltzmann machine. Comment: 27 pages, 10 figures
When Does a Mixture of Products Contain a Product of Mixtures?
We derive relations between theoretical properties of restricted Boltzmann
machines (RBMs), popular machine learning models which form the building blocks
of deep learning models, and several natural notions from discrete mathematics
and convex geometry. We give implications and equivalences relating
RBM-representable probability distributions, perfectly reconstructible inputs,
Hamming modes, zonotopes and zonosets, point configurations in hyperplane
arrangements, linear threshold codes, and multi-covering numbers of hypercubes.
As a motivating application, we prove results on the relative representational
power of mixtures of product distributions and products of mixtures of pairs of
product distributions (RBMs) that formally justify widely held intuitions about
distributed representations. In particular, we show that a mixture of products
requiring an exponentially larger number of parameters is needed to represent
the probability distributions which can be obtained as products of mixtures. Comment: 32 pages, 6 figures, 2 tables