1,407 research outputs found

    Cardinality Restricted Boltzmann Machines

    The Restricted Boltzmann Machine (RBM) is a popular density model that is also good for extracting features. A main source of tractability in RBM models is that, given an input, the posterior distribution over hidden variables is factorizable and can be easily computed and sampled from. Sparsity and competition in the hidden representation are beneficial, and while an RBM with competition among its hidden units would acquire some of the attractive properties of sparse coding, such constraints are typically not added, as the resulting posterior over the hidden units seemingly becomes intractable. In this paper we show that a dynamic programming algorithm can be used to implement exact sparsity in the RBM's hidden units. We also show how to pass derivatives through the resulting posterior marginals, which makes it possible to fine-tune a pre-trained neural network with sparse hidden layers.
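    To make the factorizability claim concrete, here is a minimal NumPy sketch (with illustrative parameter names) of the unconstrained posterior p(h_j = 1 | v) = sigmoid(c_j + W[:, j]·v), which factorizes over hidden units and can be computed and sampled exactly. A cardinality (sparsity) constraint couples the hidden units and breaks this factorization, which is what the paper's dynamic programming algorithm addresses; that algorithm is not shown here.

    ```python
    import numpy as np

    def rbm_hidden_posterior(v, W, c):
        """Factorized posterior p(h_j = 1 | v) = sigmoid(c_j + sum_i W[i, j] * v[i])."""
        return 1.0 / (1.0 + np.exp(-(c + v @ W)))

    # Toy example: 6 visible units, 4 hidden units, random parameters.
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(6, 4))        # visible-to-hidden weights
    c = np.zeros(4)                               # hidden biases
    v = rng.integers(0, 2, size=6).astype(float)  # an observed visible vector

    probs = rbm_hidden_posterior(v, W, c)         # independent Bernoulli marginals
    h = (rng.random(4) < probs).astype(float)     # an exact sample, unit by unit
    ```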

    Refinements of Universal Approximation Results for Deep Belief Networks and Restricted Boltzmann Machines

    We improve recently published results about the resources required by Restricted Boltzmann Machines (RBM) and Deep Belief Networks (DBN) to make them universal approximators. We show that any distribution p on the set of binary vectors of length n can be arbitrarily well approximated by an RBM with k-1 hidden units, where k is the minimal number of pairs of binary vectors differing in only one entry such that their union contains the support set of p. In important cases this number is half of the cardinality of the support set of p. We construct a DBN with 2^n / (2(n-b)), b ≈ log(n), hidden layers of width n that is capable of approximating any distribution on {0,1}^n arbitrarily well. This confirms a conjecture presented by Le Roux and Bengio (2010).
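    As an illustration of the quantity k, the sketch below greedily pairs support vectors at Hamming distance 1 and counts the remaining singletons; this yields an upper bound on k (not necessarily the minimum), and hence on the k-1 hidden units in the RBM bound. The greedy strategy and names are illustrative, not the paper's construction.

    ```python
    from itertools import combinations

    def greedy_pair_cover(support):
        """Greedy upper bound on k: number of pairs of binary vectors differing in
        one entry whose union covers the support (not necessarily the minimum)."""
        unmatched = set(map(tuple, support))
        k = 0
        for a, b in combinations(sorted(unmatched), 2):
            if a in unmatched and b in unmatched and \
                    sum(x != y for x, y in zip(a, b)) == 1:
                unmatched -= {a, b}
                k += 1
        return k + len(unmatched)  # each leftover vector gets a pair of its own

    # Support of a distribution on {0,1}^3; the RBM bound is k - 1 hidden units.
    support = [(0, 0, 0), (0, 0, 1), (1, 1, 1)]
    print(greedy_pair_cover(support) - 1)  # 1 hidden unit suffices here
    ```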

    Hierarchical Models as Marginals of Hierarchical Models

    We investigate the representation of hierarchical models in terms of marginals of other hierarchical models with smaller interactions. We focus on binary variables and marginals of pairwise interaction models whose hidden variables are conditionally independent given the visible variables. In this case the problem is equivalent to the representation of linear subspaces of polynomials by feedforward neural networks with soft-plus computational units. We show that every hidden variable can freely model multiple interactions among the visible variables, which allows us to generalize and improve previous results. In particular, we show that a restricted Boltzmann machine with fewer than [2(log(v)+1)/(v+1)] 2^v - 1 hidden binary variables can approximate every distribution of v visible binary variables arbitrarily well, compared to 2^(v-1) - 1 from the best previously known result. Comment: 18 pages, 4 figures, 2 tables, WUPES'1
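    A quick numeric comparison of the two bounds, assuming the logarithm in the abstract is base 2 (the base is not stated here); the numbers show the new bound overtaking the previous one as v grows.

    ```python
    import math

    def new_bound(v, log=math.log2):
        """Hidden units in the new bound: [2(log(v) + 1) / (v + 1)] * 2**v - 1."""
        return 2 * (log(v) + 1) / (v + 1) * 2 ** v - 1

    def previous_bound(v):
        """Best previously known bound: 2**(v - 1) - 1 hidden units."""
        return 2 ** (v - 1) - 1

    for v in (8, 16, 32, 64):
        print(v, round(new_bound(v)), previous_bound(v))
    ```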

    Geometry and Expressive Power of Conditional Restricted Boltzmann Machines

    Conditional restricted Boltzmann machines are undirected stochastic neural networks with a layer of input and output units connected bipartitely to a layer of hidden units. These networks define models of conditional probability distributions on the states of the output units given the states of the input units, parametrized by interaction weights and biases. We address the representational power of these models, proving results on their ability to represent conditional Markov random fields and conditional distributions with restricted supports, on the minimal size of universal approximators, on the maximal model approximation errors, and on the dimension of the set of representable conditional distributions. We contribute new tools for investigating conditional probability models, which allow us to improve the results that can be derived from existing work on restricted Boltzmann machine probability models. Comment: 30 pages, 5 figures, 1 algorithm
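    For a CRBM of the kind described (input and output units connected bipartitely to hidden units, no direct input-output connections), the hidden layer can be summed out analytically, giving, up to normalization, log p(y|x) = b·y + Σ_j softplus(c_j + W[j]·y + U[j]·x). The sketch below computes the exact conditional distribution for a small output layer by enumeration; all parameter names are illustrative assumptions, not the paper's notation.

    ```python
    import numpy as np
    from itertools import product

    def crbm_conditional(x, W, U, b, c):
        """Exact p(y | x) for a small CRBM, enumerating all output states y.

        Summing the hidden layer out analytically gives, up to normalization,
        log p(y | x) = b.y + sum_j softplus(c_j + W[j].y + U[j].x)."""
        ys = np.array(list(product([0, 1], repeat=len(b))), dtype=float)
        pre = c + ys @ W.T + x @ U.T                   # hidden pre-activations
        log_unnorm = ys @ b + np.logaddexp(0.0, pre).sum(axis=1)
        p = np.exp(log_unnorm - log_unnorm.max())
        return ys, p / p.sum()

    # Toy model: 3 input units, 2 output units, 4 hidden units.
    rng = np.random.default_rng(1)
    W = rng.normal(size=(4, 2))   # hidden-to-output weights
    U = rng.normal(size=(4, 3))   # hidden-to-input weights
    b, c = np.zeros(2), np.zeros(4)
    ys, probs = crbm_conditional(np.array([1.0, 0.0, 1.0]), W, U, b, c)
    ```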

    A Theory of Cheap Control in Embodied Systems

    We present a framework for designing cheap control architectures for embodied agents. Our derivation is guided by the classical problem of universal approximation, whereby we explore the possibility of exploiting the agent's embodiment for a new and more efficient universal approximation of behaviors generated by sensorimotor control. This embodied universal approximation is compared with the classical non-embodied universal approximation. To exemplify our approach, we present a detailed quantitative case study for policy models defined in terms of conditional restricted Boltzmann machines. In contrast to non-embodied universal approximation, which requires an exponential number of parameters, in the embodied setting we are able to generate all possible behaviors with a drastically smaller model, thus obtaining cheap universal approximation. We test and corroborate the theory experimentally with a six-legged walking machine. The experiments show that the sufficient controller complexity predicted by our theory is tight, which means that the theory has direct practical implications. Keywords: cheap design, embodiment, sensorimotor loop, universal approximation, conditional restricted Boltzmann machine. Comment: 27 pages, 10 figures
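    As a hedged sketch of what a CRBM policy looks like operationally (not the paper's controller): given a sensor reading x clamped on the input units, an action y can be drawn approximately from p(y|x) by alternating Gibbs updates of the hidden and output layers. All names and the number of Gibbs steps are illustrative assumptions.

    ```python
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def crbm_policy_sample(x, W, U, b, c, steps=50, rng=None):
        """Draw an action y approximately from p(y | x) of a CRBM policy by Gibbs
        sampling, keeping the sensor units x clamped throughout (illustrative)."""
        rng = rng or np.random.default_rng()
        y = rng.integers(0, 2, size=len(b)).astype(float)   # random initial action
        for _ in range(steps):
            h = (rng.random(len(c)) < sigmoid(c + W @ y + U @ x)).astype(float)
            y = (rng.random(len(b)) < sigmoid(b + W.T @ h)).astype(float)
        return y
    ```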

    When Does a Mixture of Products Contain a Product of Mixtures?

    We derive relations between theoretical properties of restricted Boltzmann machines (RBMs), popular machine learning models which form the building blocks of deep learning models, and several natural notions from discrete mathematics and convex geometry. We give implications and equivalences relating RBM-representable probability distributions, perfectly reconstructible inputs, Hamming modes, zonotopes and zonosets, point configurations in hyperplane arrangements, linear threshold codes, and multi-covering numbers of hypercubes. As a motivating application, we prove results on the relative representational power of mixtures of product distributions and products of mixtures of pairs of product distributions (RBMs) that formally justify widely held intuitions about distributed representations. In particular, we show that a mixture of products needs an exponentially larger number of parameters to represent the probability distributions that can be obtained as products of mixtures. Comment: 32 pages, 6 figures, 2 tables
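    A back-of-the-envelope illustration of the parameter gap referred to here: after marginalizing its hidden units, an RBM with m hidden units is a mixture of at most 2^m product distributions, so matching it component for component with an explicit mixture of products costs exponentially more parameters. The counts below are standard parameter counts under that naive matching, not the paper's proof.

    ```python
    def rbm_params(n_visible, n_hidden):
        """Weights plus visible and hidden biases of an RBM."""
        return n_visible * n_hidden + n_visible + n_hidden

    def mixture_of_products_params(n_visible, n_components):
        """Bernoulli means for each component plus the mixture weights."""
        return n_components * n_visible + (n_components - 1)

    n, m = 20, 10
    print(rbm_params(n, m))                       # 230 parameters
    print(mixture_of_products_params(n, 2 ** m))  # 21503 parameters for 2**m components
    ```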