Refinements of Universal Approximation Results for Deep Belief Networks and Restricted Boltzmann Machines
We improve recently published results about the resources of Restricted Boltzmann Machines (RBMs) and Deep Belief Networks (DBNs) required to make them universal approximators. We show that any distribution $p$ on the set of binary vectors of length $n$ can be arbitrarily well approximated by an RBM with $k-1$ hidden units, where $k$ is the minimal number of pairs of binary vectors differing in only one entry such that their union contains the support set of $p$. In important cases this number is half of the cardinality of the support set of $p$. We construct a DBN with $\frac{2^n}{2(n-b)}$, $b \sim \log(n)$, hidden layers of width $n$ that is capable of approximating any distribution on $\{0,1\}^n$ arbitrarily well. This confirms a conjecture presented by Le Roux and Bengio (2010).
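The quantity $k$ above can be computed directly for small examples: finding the fewest Hamming-distance-1 pairs whose union covers the support reduces to a maximum-matching problem, since each pair covers at most two support points and any leftover point can be paired with an arbitrary hypercube neighbor. A minimal Python sketch (the reduction and the function name are our own illustration, not from the paper):

from itertools import combinations
import networkx as nx

def minimal_pair_cover(support):
    """k = fewest Hamming-distance-1 pairs of binary vectors whose
    union contains `support` (a list of 0/1 tuples). Equals
    |support| minus a maximum matching among support points at
    Hamming distance 1."""
    G = nx.Graph()
    G.add_nodes_from(support)
    for u, v in combinations(support, 2):
        if sum(a != b for a, b in zip(u, v)) == 1:  # adjacent in the hypercube
            G.add_edge(u, v)
    matching = nx.max_weight_matching(G, maxcardinality=True)
    return len(support) - len(matching)

# A support of size 4 in {0,1}^3 that splits into two adjacent pairs:
# k = 2, so an RBM with k - 1 = 1 hidden unit suffices here.
support = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
print(minimal_pair_cover(support))  # -> 2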
Universal Approximation of Markov Kernels by Shallow Stochastic Feedforward Networks
We establish upper bounds for the minimal number of hidden units for which a
binary stochastic feedforward network with sigmoid activation probabilities and
a single hidden layer is a universal approximator of Markov kernels. We show
that each possible probabilistic assignment of the states of $n$ output units, given the states of $k$ input units, can be approximated arbitrarily well by a network with $2^{k-1}(2^{n-1}-1)$ hidden units. Comment: 13 pages, 3 figures
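Taking the bound exactly as stated above, it is easy to tabulate how the required hidden layer grows (a small illustrative snippet; the function name is ours):

def sffn_hidden_units(k, n):
    """Upper bound on hidden units for a single-hidden-layer binary
    stochastic feedforward net to approximate any Markov kernel
    from k binary inputs to n binary outputs."""
    return 2 ** (k - 1) * (2 ** (n - 1) - 1)

print(sffn_hidden_units(2, 2))  # -> 2
print(sffn_hidden_units(4, 4))  # -> 56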
Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units
We generalize recent theoretical work on the minimal number of layers of
narrow deep belief networks that can approximate any probability distribution
on the states of their visible units arbitrarily well. We relax the setting of
binary units (Sutskever and Hinton, 2008; Le Roux and Bengio, 2008, 2010;
Montúfar and Ay, 2011) to units with arbitrary finite state spaces, and the
vanishing approximation error to an arbitrary approximation error tolerance.
For example, we show that a $q$-ary deep belief network with $L \geq 2 + \frac{q^{\lceil m/2 \rceil} - 1}{q - 1}$ layers of width $n \leq m + \lceil \log_q(m) \rceil + 1$, for some $m \in \mathbb{N}$, can approximate any probability distribution on $\{0, 1, \ldots, q-1\}^n$ without exceeding a Kullback-Leibler divergence of $\delta$. Our analysis covers discrete restricted Boltzmann machines and naïve Bayes models as special cases. Comment: 19 pages, 5 figures, 1 table
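Plugging the binary case $q = 2$ into the bound quoted above gives concrete depth/width pairs (a quick illustrative check; it assumes the bound as reconstructed here):

import math

def dbn_depth_width(q, m):
    """(L, n): layers and width sufficient for a q-ary deep belief
    network to be a universal approximator, per the bound above."""
    L = 2 + (q ** math.ceil(m / 2) - 1) // (q - 1)
    n = m + math.ceil(math.log(m, q)) + 1
    return L, n

print(dbn_depth_width(2, 4))  # binary units, m = 4 -> (5, 7)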
Universal Approximation with Deep Narrow Networks
The classical Universal Approximation Theorem holds for neural networks of
arbitrary width and bounded depth. Here we consider the natural `dual' scenario
for networks of bounded width and arbitrary depth. Precisely, let $n$ be the number of input neurons, $m$ be the number of output neurons, and let $\rho$ be any nonaffine continuous function with a continuous nonzero derivative at some point. Then we show that the class of neural networks of arbitrary depth, width $n + m + 2$, and activation function $\rho$, is dense in $C(K; \mathbb{R}^m)$ for compact $K \subseteq \mathbb{R}^n$. This covers
every activation function possible to use in practice, and also includes
polynomial activation functions, which is unlike the classical version of the
theorem, and provides a qualitative difference between deep narrow networks and
shallow wide networks. We then consider several extensions of this result. In
particular we consider nowhere differentiable activation functions, density in
noncompact domains with respect to the $L^p$-norm, and how the width may be reduced to just $n + m + 1$ for `most' activation functions. Comment: Accepted at COLT 2020
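For concreteness, a deep narrow network of exactly the width used in the theorem is easy to write down; a minimal PyTorch sketch (the depth, activation, and input/output sizes are arbitrary choices of ours):

import torch.nn as nn

n_in, m_out, depth = 3, 2, 16
width = n_in + m_out + 2            # the width from the theorem: n + m + 2

layers = [nn.Linear(n_in, width), nn.Tanh()]   # tanh: nonaffine, with nonzero
for _ in range(depth - 2):                     # derivative at some point
    layers += [nn.Linear(width, width), nn.Tanh()]
layers.append(nn.Linear(width, m_out))
net = nn.Sequential(*layers)
print(net)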
Minimum Width of Leaky-ReLU Neural Networks for Uniform Universal Approximation
The study of universal approximation properties (UAP) for neural networks
(NN) has a long history. When the network width is unlimited, only a single
hidden layer is sufficient for UAP. In contrast, when the depth is unlimited,
the width for UAP needs to be at least the critical width $w^*_{\min} = \max(d_x, d_y)$, where $d_x$ and $d_y$ are the dimensions of the input and output, respectively. Recently, \cite{cai2022achieve} shows that a leaky-ReLU NN with this critical width can achieve UAP for $L^p$ functions on a compact domain $K$, \emph{i.e.,} the UAP for $L^p(K, \mathbb{R}^{d_y})$. This paper examines the uniform UAP for the function class $C(K, \mathbb{R}^{d_y})$ and gives the exact minimum width of the leaky-ReLU NN as $w_{\min} = \max(d_x + 1, d_y) + \mathbf{1}_{\{d_y = d_x + 1\}}$, which involves the effects of the output dimension. To obtain this result, we propose a novel lift-flow-discretization approach, which shows that the uniform UAP has a deep connection with topological theory. Comment: ICML 2023 camera ready
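The minimum-width formula is simple to evaluate; a small sketch (the function name is ours, the formula is as quoted above):

def w_min(d_x, d_y):
    """Exact minimum width for uniform UAP of C(K, R^{d_y})
    with leaky-ReLU networks, per the formula above."""
    return max(d_x + 1, d_y) + (1 if d_y == d_x + 1 else 0)

print(w_min(1, 1))  # -> 2
print(w_min(1, 2))  # d_y = d_x + 1 adds one extra unit -> 3
print(w_min(3, 1))  # -> 4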
Geometry and Expressive Power of Conditional Restricted Boltzmann Machines
Conditional restricted Boltzmann machines are undirected stochastic neural
networks with a layer of input and output units connected bipartitely to a
layer of hidden units. These networks define models of conditional probability
distributions on the states of the output units given the states of the input
units, parametrized by interaction weights and biases. We address the
representational power of these models, proving results on their ability to
represent conditional Markov random fields and conditional distributions with
restricted supports, the minimal size of universal approximators, the maximal
model approximation errors, and on the dimension of the set of representable
conditional distributions. We contribute new tools for investigating
conditional probability models, which allow us to improve the results that can
be derived from existing work on restricted Boltzmann machine probability
models. Comment: 30 pages, 5 figures, 1 algorithm
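To unpack the model definition, the conditional distribution a tiny CRBM defines can be computed by brute-force enumeration over hidden and output states; a minimal NumPy sketch (all parameter names and sizes are our own illustration):

import itertools
import numpy as np

rng = np.random.default_rng(0)
n_x, n_y, n_h = 2, 2, 3              # input, output, hidden units
W = rng.normal(size=(n_x, n_h))      # input-hidden interaction weights
V = rng.normal(size=(n_y, n_h))      # output-hidden interaction weights
b = rng.normal(size=n_y)             # output biases
c = rng.normal(size=n_h)             # hidden biases

def p_y_given_x(x):
    """p(y | x) over all 2^n_y output states, summing out the hidden
    layer: p(y | x) proportional to sum_h exp(x'Wh + y'Vh + b'y + c'h)."""
    ys = [np.array(t) for t in itertools.product([0, 1], repeat=n_y)]
    hs = [np.array(t) for t in itertools.product([0, 1], repeat=n_h)]
    scores = np.array([
        np.logaddexp.reduce([x @ W @ h + y @ V @ h + b @ y + c @ h
                             for h in hs])
        for y in ys])
    p = np.exp(scores - scores.max())  # stable softmax over output states
    return ys, p / p.sum()

ys, p = p_y_given_x(np.array([1, 0]))
for y, py in zip(ys, p):
    print(y, round(float(py), 3))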
- …