9,227 research outputs found
A Deep and Autoregressive Approach for Topic Modeling of Multimodal Data
Topic modeling based on latent Dirichlet allocation (LDA) has been a
framework of choice to deal with multimodal data, such as in image annotation
tasks. Another popular approach to model the multimodal data is through deep
neural networks, such as the deep Boltzmann machine (DBM). Recently, a new type
of topic model called the Document Neural Autoregressive Distribution Estimator
(DocNADE) was proposed and demonstrated state-of-the-art performance for text
document modeling. In this work, we show how to successfully apply and extend
this model to multimodal data, such as simultaneous image classification and
annotation. First, we propose SupDocNADE, a supervised extension of DocNADE,
that increases the discriminative power of the learned hidden topic features
and show how to employ it to learn a joint representation from image visual
words, annotation words and class label information. We test our model on the
LabelMe and UIUC-Sports data sets and show that it compares favorably to other
topic models. Second, we propose a deep extension of our model and provide an
efficient way of training the deep model. Experimental results show that our
deep model outperforms its shallow version and reaches state-of-the-art
performance on the Multimedia Information Retrieval (MIR) Flickr data set.Comment: 24 pages, 10 figures. A version has been accepted by TPAMI on Aug
4th, 2015. Add footnote about how to train the model in practice in Section
5.1. arXiv admin note: substantial text overlap with arXiv:1305.530
Learning Generative ConvNets via Multi-grid Modeling and Sampling
This paper proposes a multi-grid method for learning energy-based generative
ConvNet models of images. For each grid, we learn an energy-based probabilistic
model where the energy function is defined by a bottom-up convolutional neural
network (ConvNet or CNN). Learning such a model requires generating synthesized
examples from the model. Within each iteration of our learning algorithm, for
each observed training image, we generate synthesized images at multiple grids
by initializing the finite-step MCMC sampling from a minimal 1 x 1 version of
the training image. The synthesized image at each subsequent grid is obtained
by a finite-step MCMC initialized from the synthesized image generated at the
previous coarser grid. After obtaining the synthesized examples, the parameters
of the models at multiple grids are updated separately and simultaneously based
on the differences between synthesized and observed examples. We show that this
multi-grid method can learn realistic energy-based generative ConvNet models,
and it outperforms the original contrastive divergence (CD) and persistent CD.Comment: CVPR 201
SERKET: An Architecture for Connecting Stochastic Models to Realize a Large-Scale Cognitive Model
To realize human-like robot intelligence, a large-scale cognitive
architecture is required for robots to understand the environment through a
variety of sensors with which they are equipped. In this paper, we propose a
novel framework named Serket that enables the construction of a large-scale
generative model and its inference easily by connecting sub-modules to allow
the robots to acquire various capabilities through interaction with their
environments and others. We consider that large-scale cognitive models can be
constructed by connecting smaller fundamental models hierarchically while
maintaining their programmatic independence. Moreover, connected modules are
dependent on each other, and parameters are required to be optimized as a
whole. Conventionally, the equations for parameter estimation have to be
derived and implemented depending on the models. However, it becomes harder to
derive and implement those of a larger scale model. To solve these problems, in
this paper, we propose a method for parameter estimation by communicating the
minimal parameters between various modules while maintaining their programmatic
independence. Therefore, Serket makes it easy to construct large-scale models
and estimate their parameters via the connection of modules. Experimental
results demonstrated that the model can be constructed by connecting modules,
the parameters can be optimized as a whole, and they are comparable with the
original models that we have proposed
Boosting Monte Carlo simulations of spin glasses using autoregressive neural networks
The autoregressive neural networks are emerging as a powerful computational
tool to solve relevant problems in classical and quantum mechanics. One of
their appealing functionalities is that, after they have learned a probability
distribution from a dataset, they allow exact and efficient sampling of typical
system configurations. Here we employ a neural autoregressive distribution
estimator (NADE) to boost Markov chain Monte Carlo (MCMC) simulations of a
paradigmatic classical model of spin-glass theory, namely the two-dimensional
Edwards-Anderson Hamiltonian. We show that a NADE can be trained to accurately
mimic the Boltzmann distribution using unsupervised learning from system
configurations generated using standard MCMC algorithms. The trained NADE is
then employed as smart proposal distribution for the Metropolis-Hastings
algorithm. This allows us to perform efficient MCMC simulations, which provide
unbiased results even if the expectation value corresponding to the probability
distribution learned by the NADE is not exact. Notably, we implement a
sequential tempering procedure, whereby a NADE trained at a higher temperature
is iteratively employed as proposal distribution in a MCMC simulation run at a
slightly lower temperature. This allows one to efficiently simulate the
spin-glass model even in the low-temperature regime, avoiding the divergent
correlation times that plague MCMC simulations driven by local-update
algorithms. Furthermore, we show that the NADE-driven simulations quickly
sample ground-state configurations, paving the way to their future utilization
to tackle binary optimization problems.Comment: 13 pages, 14 figure
- …