22 research outputs found

    Refinements of Universal Approximation Results for Deep Belief Networks and Restricted Boltzmann Machines

    We improve recently published results about the resources that Restricted Boltzmann Machines (RBMs) and Deep Belief Networks (DBNs) require to be universal approximators. We show that any distribution p on the set of binary vectors of length n can be approximated arbitrarily well by an RBM with k-1 hidden units, where k is the minimal number of pairs of binary vectors differing in only one entry whose union contains the support set of p. In important cases this number is half the cardinality of the support set of p. We construct a DBN with 2^n / (2(n-b)), b ~ log(n), hidden layers of width n that is capable of approximating any distribution on {0,1}^n arbitrarily well. This confirms a conjecture presented by Le Roux and Bengio (2010).
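
    A minimal sketch (not from the paper) of the quantity k appearing above: it greedily pairs support vectors that differ in exactly one entry, which only upper-bounds the minimal k, and reports k-1 hidden units. The example support set and the greedy strategy are illustrative assumptions.

        # Illustrative only: upper-bound k by greedily pairing support vectors
        # that differ in exactly one entry; an unpaired vector still counts as
        # one pair (it can be matched with any neighbour outside the support).
        def hamming(u, v):
            return sum(a != b for a, b in zip(u, v))

        def greedy_pair_cover(support):
            remaining = set(support)
            k = 0
            while remaining:
                x = remaining.pop()
                partner = next((y for y in remaining if hamming(x, y) == 1), None)
                if partner is not None:
                    remaining.remove(partner)
                k += 1
            return k

        support = [(0, 0, 0), (0, 0, 1), (1, 1, 1)]   # assumed example, n = 3
        k = greedy_pair_cover(support)
        print("RBM hidden units in the construction:", k - 1)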

    Universal Approximation of Markov Kernels by Shallow Stochastic Feedforward Networks

    We establish upper bounds for the minimal number of hidden units for which a binary stochastic feedforward network with sigmoid activation probabilities and a single hidden layer is a universal approximator of Markov kernels. We show that each possible probabilistic assignment of the states of n output units, given the states of k ≥ 1 input units, can be approximated arbitrarily well by a network with 2^{k-1}(2^{n-1}-1) hidden units. Comment: 13 pages, 3 figures
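
    As a quick illustration (not from the paper), the quoted bound can be evaluated for a few input/output sizes; the sizes below are arbitrary examples.

        # Illustrative only: evaluate the hidden-unit bound 2^{k-1} * (2^{n-1} - 1).
        def hidden_units_bound(k, n):
            assert k >= 1 and n >= 1
            return 2 ** (k - 1) * (2 ** (n - 1) - 1)

        for k, n in [(1, 2), (2, 3), (4, 4)]:
            print(f"k={k} inputs, n={n} outputs: at most {hidden_units_bound(k, n)} hidden units")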

    Reconstruction Low-Resolution Image Face Using Restricted Boltzmann Machine

    Low-resolution (LR) face images are one of the most challenging problems for face recognition (FR) systems: because specific facial features are hard to extract from such images, recognition accuracy is low. To address this, some researchers use an image reconstruction approach to improve image resolution. In this work, we use a restricted Boltzmann machine (RBM) for this reconstruction task, and we validate the proposed method on the Labeled Faces in the Wild (LFW) database. The experiments show that the reconstructed images achieve a PSNR of 34.05 dB and an SSIM of 96.8%.
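
    For reference, a small sketch of the PSNR metric reported above (the standard definition, not code from the paper); the 8-bit pixel range is an assumption.

        # Hypothetical helper: peak signal-to-noise ratio between an original
        # and a reconstructed image, assuming 8-bit pixel values (max 255).
        import numpy as np

        def psnr(original, reconstructed, max_val=255.0):
            mse = np.mean((np.asarray(original, dtype=np.float64)
                           - np.asarray(reconstructed, dtype=np.float64)) ** 2)
            if mse == 0:
                return float("inf")   # identical images
            return 10.0 * np.log10(max_val ** 2 / mse)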

    Hierarchical Models as Marginals of Hierarchical Models

    We investigate the representation of hierarchical models in terms of marginals of other hierarchical models with smaller interactions. We focus on binary variables and marginals of pairwise interaction models whose hidden variables are conditionally independent given the visible variables. In this case the problem is equivalent to the representation of linear subspaces of polynomials by feedforward neural networks with soft-plus computational units. We show that every hidden variable can freely model multiple interactions among the visible variables, which allows us to generalize and improve previous results. In particular, we show that a restricted Boltzmann machine with less than [2(log(v)+1)/(v+1)] 2^v - 1 hidden binary variables can approximate every distribution of v visible binary variables arbitrarily well, compared to 2^{v-1}-1 from the best previously known result. Comment: 18 pages, 4 figures, 2 tables, WUPES'1
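
    As an illustration (not from the paper), the two bounds quoted above can be compared numerically; taking the logarithm base 2 is an assumption, since the abstract does not state the base, and the improvement shows up only for larger v.

        # Illustrative only: compare the improved bound [2(log(v)+1)/(v+1)] 2^v - 1
        # with the previously known bound 2^{v-1} - 1, assuming log base 2.
        import math

        def improved_bound(v):
            return (2 * (math.log2(v) + 1) / (v + 1)) * 2 ** v - 1

        def previous_bound(v):
            return 2 ** (v - 1) - 1

        for v in (16, 32, 64):
            print(f"v={v}: improved ~ {improved_bound(v):,.0f}, previous {previous_bound(v):,}")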

    Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units

    We generalize recent theoretical work on the minimal number of layers of narrow deep belief networks that can approximate any probability distribution on the states of their visible units arbitrarily well. We relax the setting of binary units (Sutskever and Hinton, 2008; Le Roux and Bengio, 2008, 2010; Montúfar and Ay, 2011) to units with arbitrary finite state spaces, and the vanishing approximation error to an arbitrary approximation error tolerance. For example, we show that a q-ary deep belief network with L ≥ 2 + (q^{⌈m-δ⌉} - 1)/(q - 1) layers of width n ≤ m + log_q(m) + 1, for some m ∈ ℕ, can approximate any probability distribution on {0,1,...,q-1}^n without exceeding a Kullback-Leibler divergence of δ. Our analysis covers discrete restricted Boltzmann machines and naïve Bayes models as special cases. Comment: 19 pages, 5 figures, 1 table
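
    As a quick check (not from the paper), the depth and width bounds just quoted can be evaluated for concrete q, m, and δ; the parameter values below are arbitrary examples.

        # Illustrative only: evaluate L >= 2 + (q^ceil(m - delta) - 1)/(q - 1)
        # and n <= m + log_q(m) + 1 for example parameter values.
        import math

        def depth_bound(q, m, delta):
            return 2 + (q ** math.ceil(m - delta) - 1) // (q - 1)

        def width_bound(q, m):
            return m + math.log(m, q) + 1

        q, m, delta = 2, 4, 0.5                    # assumed example values
        print("layers L >=", depth_bound(q, m, delta))
        print("width  n <=", round(width_bound(q, m), 2))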

    Neural network representation of tensor network and chiral states

    We study the representational power of a Boltzmann machine (a type of neural network) in quantum many-body systems. We prove that any (local) tensor network state has a (local) neural network representation. The construction is almost optimal in the sense that the number of parameters in the neural network representation is almost linear in the number of nonzero parameters in the tensor network representation. Despite the difficulty of representing (gapped) chiral topological states with local tensor networks, we construct a quasi-local neural network representation for a chiral p-wave superconductor. This demonstrates the power of Boltzmann machines

    Mixture decompositions of exponential families using a decomposition of their sample spaces

    We study the problem of finding the smallest m such that every element of an exponential family can be written as a mixture of m elements of another exponential family. We propose an approach based on coverings and packings of the face lattice of the corresponding convex support polytopes and results from coding theory. We show that m = q^{N-1} is the smallest number for which any distribution of N q-ary variables can be written as a mixture of m independent q-ary variables. Furthermore, we show that any distribution of N binary variables is a mixture of m = 2^{N-(k+1)}(1 + 1/(2^k - 1)) elements of the k-interaction exponential family. Comment: 17 pages, 2 figures
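
    As an illustration (not from the paper), the two mixture sizes quoted above can be evaluated directly; the parameter values are arbitrary examples, and the paper may restrict which (N, k) pairs the second expression applies to.

        # Illustrative only: evaluate the quoted mixture sizes for example parameters.
        def m_independent(q, N):
            # smallest m such that any distribution of N q-ary variables is a
            # mixture of m independent q-ary distributions, as stated above
            return q ** (N - 1)

        def m_k_interaction(N, k):
            # evaluates the quoted expression 2^{N-(k+1)} (1 + 1/(2^k - 1))
            return 2 ** (N - (k + 1)) * (1 + 1 / (2 ** k - 1))

        print("independent mixture, q=2, N=4:", m_independent(2, 4))
        print("k-interaction mixture, N=7, k=3:", m_k_interaction(7, 3))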