On the quantitative analysis of Deep Belief Networks

By Ruslan Salakhutdinov and Iain Murray


Deep Belief Networks (DBNs) are generative models that contain many layers of hidden variables. Efficient greedy algorithms for learning and approximate inference have allowed these models to be applied successfully in many application domains. The main building block of a DBN is a bipartite undirected graphical model called a restricted Boltzmann machine (RBM). Due to the presence of the partition function, model selection, complexity control, and exact maximum likelihood learning in RBMs are intractable. We show that Annealed Importance Sampling (AIS) can be used to efficiently estimate the partition function of an RBM, and we present a novel AIS scheme for comparing RBMs with different architectures. We further show how an AIS estimator, along with approximate inference, can be used to estimate a lower bound on the log-probability that a DBN model with multiple hidden layers assigns to the test data. This is, to our knowledge, the first step towards obtaining quantitative results that would allow us to directly assess the performance of Deep Belief Networks as generative models of data.

Year: 2008
DOI identifier: 10.1145/1390156.1390266
OAI identifier: oai:www.era.lib.ed.ac.uk:1842/4588
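
The abstract's central claim, that AIS can estimate the partition function of an RBM, can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a binary RBM with energy E(v, h) = -v·W·h - b·v - a·h, a uniform base distribution, and intermediate distributions obtained by scaling all parameters by an inverse temperature beta. The paper's own base distribution and annealing schedule may differ. The function names (`log_p_star`, `ais_log_z`, `exact_log_z`) and all default settings are hypothetical, chosen only for illustration.

```python
import numpy as np
from itertools import product


def log_p_star(v, W, b, a, beta):
    """Log unnormalised marginal over visibles for the RBM with parameters scaled by beta."""
    # Hidden units are summed out analytically: sum_j log(1 + exp(beta * input_j)).
    return beta * (v @ b) + np.logaddexp(0.0, beta * (v @ W + a)).sum(axis=1)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def ais_log_z(W, b, a, n_runs=500, n_betas=1000, seed=0):
    """AIS estimate of log Z for a binary RBM (assumed setup: uniform base, linear schedule)."""
    rng = np.random.default_rng(seed)
    n_v, n_h = W.shape
    betas = np.linspace(0.0, 1.0, n_betas)

    # Exact samples from the beta = 0 distribution, which is uniform over visibles.
    v = (rng.random((n_runs, n_v)) < 0.5).astype(float)
    log_w = np.zeros(n_runs)

    for prev, cur in zip(betas[:-1], betas[1:]):
        # Accumulate the importance weight for moving from beta = prev to beta = cur.
        log_w += log_p_star(v, W, b, a, cur) - log_p_star(v, W, b, a, prev)
        # One Gibbs sweep that leaves the beta = cur distribution invariant.
        h = (rng.random((n_runs, n_h)) < sigmoid(cur * (v @ W + a))).astype(float)
        v = (rng.random((n_runs, n_v)) < sigmoid(cur * (h @ W.T + b))).astype(float)

    # At beta = 0 every (v, h) configuration has weight 1, so Z_0 = 2^(n_v + n_h).
    log_z0 = (n_v + n_h) * np.log(2.0)
    # Log of the mean importance weight, computed stably.
    m = log_w.max()
    return log_z0 + m + np.log(np.mean(np.exp(log_w - m)))


def exact_log_z(W, b, a):
    """Brute-force log Z by enumerating all visible states (tiny models only)."""
    vs = np.array(list(product([0.0, 1.0], repeat=W.shape[0])))
    lp = log_p_star(vs, W, b, a, 1.0)
    m = lp.max()
    return m + np.log(np.exp(lp - m).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = 0.5 * rng.standard_normal((6, 4))
    b = 0.1 * rng.standard_normal(6)
    a = 0.1 * rng.standard_normal(4)
    print("AIS estimate of log Z:", ais_log_z(W, b, a))
    print("Exact log Z:          ", exact_log_z(W, b, a))
```

On a model this small the exact value is available by enumeration, which makes it possible to check the estimator. For realistically sized RBMs only the AIS estimate is feasible, and its accuracy depends on the number of intermediate distributions and runs.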

