Refinements of Universal Approximation Results for Deep Belief Networks and Restricted Boltzmann Machines
We improve recently published results about resources of Restricted Boltzmann
Machines (RBM) and Deep Belief Networks (DBN) required to make them Universal
Approximators. We show that any distribution p on the set of binary vectors of
length n can be arbitrarily well approximated by an RBM with k-1 hidden units,
where k is the minimal number of pairs of binary vectors differing in only one
entry such that their union contains the support set of p. In important cases
this number is half of the cardinality of the support set of p. We construct a
DBN with 2^n/(2(n-b)), b ~ log(n), hidden layers of width n that is capable of
approximating any distribution on {0,1}^n arbitrarily well. This confirms a
conjecture presented by Le Roux and Bengio (2010).
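As a rough illustration of the resource counts quoted in this abstract, the following Python sketch evaluates them for small n. It assumes the full-support case, in which {0,1}^n splits into 2^(n-1) pairs of vectors differing in one entry, so that k is half the support size. It also assumes b = log2(n); the abstract only states b ~ log(n), so the base and exact value of b, like the function names, are illustrative assumptions.

```python
import math
from typing import Optional


def rbm_hidden_units_full_support(n: int) -> int:
    """Hidden units k - 1 when the support of p is all of {0,1}^n.

    In that case the 2^n vectors can be covered by k = 2^(n-1) pairs
    differing in a single entry (pair each x with x after flipping its first bit).
    """
    k = 2 ** n // 2
    return k - 1


def dbn_hidden_layers(n: int, b: Optional[float] = None) -> float:
    """Evaluate 2^n / (2 (n - b)) for layers of width n.

    Assumption: b defaults to log2(n); the abstract only gives b ~ log(n).
    """
    if b is None:
        b = math.log2(n)
    return 2 ** n / (2 * (n - b))


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, rbm_hidden_units_full_support(n), dbn_hidden_layers(n))
```

Both counts grow exponentially in n, which is why the refinement concerns constants and lower-order factors rather than the overall growth rate.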
Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units
We generalize recent theoretical work on the minimal number of layers of
narrow deep belief networks that can approximate any probability distribution
on the states of their visible units arbitrarily well. We relax the setting of
binary units (Sutskever and Hinton, 2008; Le Roux and Bengio, 2008, 2010;
Montúfar and Ay, 2011) to units with arbitrary finite state spaces, and the
vanishing approximation error to an arbitrary approximation error tolerance.
For example, we give explicit bounds on the number of layers and the layer
width for which a deep belief network with finite-state units can approximate
any probability distribution on the states of its visible units without
exceeding a prescribed Kullback-Leibler divergence. Our analysis covers discrete restricted Boltzmann
machines and naïve Bayes models as special cases.
Comment: 19 pages, 5 figures, 1 table
Universal Approximation of Markov Kernels by Shallow Stochastic Feedforward Networks
We establish upper bounds for the minimal number of hidden units for which a
binary stochastic feedforward network with sigmoid activation probabilities and
a single hidden layer is a universal approximator of Markov kernels. We show
that each possible probabilistic assignment of the states of the output units,
given the states of the input units, can be approximated arbitrarily well
by a network with a number of hidden units given by these bounds.
Comment: 13 pages, 3 figures
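To make the object being approximated concrete, here is a minimal Python sketch of the Markov kernel P(y | x) realized by a binary stochastic feedforward network with sigmoid activation probabilities and one hidden layer. The hidden layer is marginalized by brute-force enumeration, which is only feasible for tiny networks. The function names and the weight layout (`W1`, `b1`, `W2`, `b2`) are assumptions made for illustration, not notation from the paper.

```python
import itertools

import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bernoulli_prob(states, activation_probs):
    """Probability of a binary vector under independent Bernoulli units."""
    per_unit = np.where(states == 1, activation_probs, 1.0 - activation_probs)
    return float(np.prod(per_unit))


def binary_states(size):
    """All binary vectors of the given length, in lexicographic order."""
    return [np.array(s) for s in itertools.product([0, 1], repeat=size)]


def markov_kernel(W1, b1, W2, b2):
    """Return K with K[i, j] = P(y_j | x_i), summing over all hidden states.

    W1 has shape (inputs, hidden) and W2 has shape (hidden, outputs).
    """
    num_in, num_hidden = W1.shape
    num_out = W2.shape[1]
    inputs, hiddens, outputs = (
        binary_states(num_in), binary_states(num_hidden), binary_states(num_out)
    )
    K = np.zeros((len(inputs), len(outputs)))
    for i, x in enumerate(inputs):
        p_hidden = sigmoid(x @ W1 + b1)          # P(h_l = 1 | x)
        for h in hiddens:
            weight = bernoulli_prob(h, p_hidden)  # P(h | x)
            p_out = sigmoid(h @ W2 + b2)          # P(y_l = 1 | h)
            for j, y in enumerate(outputs):
                K[i, j] += weight * bernoulli_prob(y, p_out)
    return K
```

Each row of the returned matrix is a probability distribution over output states, one row per input state; universal approximation asks how many hidden units are needed so that any such row-stochastic matrix can be matched arbitrarily closely.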
In All Likelihood, Deep Belief Is Not Enough
Statistical models of natural stimuli provide an important tool for
researchers in the fields of machine learning and computational neuroscience. A
canonical way to quantitatively assess and compare the performance of
statistical models is given by the likelihood. One class of statistical models
which has recently gained increasing popularity and has been applied to a
variety of complex data are deep belief networks. Analyses of these models,
however, have been typically limited to qualitative analyses based on samples
due to the computationally intractable nature of the model likelihood.
Motivated by these circumstances, the present article provides a consistent
estimator for the likelihood that is both computationally tractable and simple
to apply in practice. Using this estimator, a deep belief network which has
been suggested for the modeling of natural image patches is quantitatively
investigated and compared to other models of natural image patches. Contrary to
earlier claims based on qualitative results, the results presented in this
article provide evidence that the model under investigation is not a
particularly good model for natural images.
Hierarchical Models as Marginals of Hierarchical Models
We investigate the representation of hierarchical models in terms of
marginals of other hierarchical models with smaller interactions. We focus on
binary variables and marginals of pairwise interaction models whose hidden
variables are conditionally independent given the visible variables. In this
case the problem is equivalent to the representation of linear subspaces of
polynomials by feedforward neural networks with soft-plus computational units.
We show that every hidden variable can freely model multiple interactions among
the visible variables, which allows us to generalize and improve previous
results. In particular, we show that a restricted Boltzmann machine can
approximate every distribution of visible binary variables arbitrarily well
with fewer hidden binary variables than the best previously known result requires.
Comment: 18 pages, 4 figures, 2 tables, WUPES'1
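The link to soft-plus units mentioned in this abstract can be checked numerically for a standard binary restricted Boltzmann machine: summing out the hidden units of exp(v·b + h·c + v·W·h) leaves a log-marginal equal to a linear term plus a sum of soft-plus units applied to v. The Python sketch below compares a brute-force sum over hidden states with that closed form. The parameter names (`W`, `b`, `c`) and function names are illustrative assumptions, and the check covers only this standard identity, not the paper's results.

```python
import itertools

import numpy as np


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return np.logaddexp(0.0, z)


def log_marginal_bruteforce(v, W, b, c):
    """log of sum over binary h of exp(v.b + h.c + v.W.h) (unnormalized)."""
    num_hidden = W.shape[1]
    terms = np.array([
        v @ b + np.array(h) @ c + v @ W @ np.array(h)
        for h in itertools.product([0, 1], repeat=num_hidden)
    ])
    return float(np.logaddexp.reduce(terms))


def log_marginal_softplus(v, W, b, c):
    """Same quantity as a linear term plus one soft-plus unit per hidden variable."""
    return float(v @ b + softplus(c + v @ W).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 3))
    b = rng.normal(size=4)
    c = rng.normal(size=3)
    v = np.array([1, 0, 1, 1])
    print(log_marginal_bruteforce(v, W, b, c), log_marginal_softplus(v, W, b, c))
```

The brute-force version costs 2^m terms for m hidden units, while the soft-plus form costs one term per hidden unit, which is what makes the feedforward-network view convenient for analysis.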
Classification of Occluded Objects using Fast Recurrent Processing
Recurrent neural networks are powerful tools for handling incomplete data
problems in computer vision, thanks to their significant generative
capabilities. However, the computational demand for these algorithms is too
high to work in real time, without specialized hardware or software solutions.
In this paper, we propose a framework for adding recurrent processing
capabilities to a feedforward network without sacrificing much
computational efficiency. We assume a mixture model and generate samples of the
last hidden layer according to the class decisions of the output layer, modify
the hidden layer activity using the samples, and propagate to lower layers. For
the visual occlusion problem, the iterative procedure emulates a feedforward-feedback
loop, filling in the missing hidden layer activity with meaningful
representations. The proposed algorithm is tested on a widely used dataset, and
shown to achieve 2 improvement in classification accuracy for occluded
objects. When compared to Restricted Boltzmann Machines, our algorithm shows
superior performance for occluded object classification.
Comment: arXiv admin note: text overlap with arXiv:1409.8576 by other authors