7,138 research outputs found
Syntactic Topic Models
The syntactic topic model (STM) is a Bayesian nonparametric model of language
that discovers latent distributions of words (topics) that are both
semantically and syntactically coherent. The STM models dependency parsed
corpora where sentences are grouped into documents. It assumes that each word
is drawn from a latent topic chosen by combining document-level features and
the local syntactic context. Each document has a distribution over latent
topics, as in topic models, which provides the semantic consistency. Each
element in the dependency parse tree also has a distribution over the topics of
its children, as in latent-state syntax models, which provides the syntactic
consistency. These distributions are convolved so that the topic of each word
is likely under both its document and syntactic context. We derive a fast
posterior inference algorithm based on variational methods. We report
qualitative and quantitative studies on both synthetic data and hand-parsed
documents. We show that the STM is a more predictive model of language than
current models based only on syntax or only on topics
Hierarchical relational models for document networks
We develop the relational topic model (RTM), a hierarchical model of both
network structure and node attributes. We focus on document networks, where the
attributes of each document are its words, that is, discrete observations taken
from a fixed vocabulary. For each pair of documents, the RTM models their link
as a binary random variable that is conditioned on their contents. The model
can be used to summarize a network of documents, predict links between them,
and predict words within them. We derive efficient inference and estimation
algorithms based on variational methods that take advantage of sparsity and
scale with the number of links. We evaluate the predictive performance of the
RTM for large networks of scientific abstracts, web documents, and
geographically tagged news.Comment: Published in at http://dx.doi.org/10.1214/09-AOAS309 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
High-Order Stochastic Gradient Thermostats for Bayesian Learning of Deep Models
Learning in deep models using Bayesian methods has generated significant
attention recently. This is largely because of the feasibility of modern
Bayesian methods to yield scalable learning and inference, while maintaining a
measure of uncertainty in the model parameters. Stochastic gradient MCMC
algorithms (SG-MCMC) are a family of diffusion-based sampling methods for
large-scale Bayesian learning. In SG-MCMC, multivariate stochastic gradient
thermostats (mSGNHT) augment each parameter of interest, with a momentum and a
thermostat variable to maintain stationary distributions as target posterior
distributions. As the number of variables in a continuous-time diffusion
increases, its numerical approximation error becomes a practical bottleneck, so
better use of a numerical integrator is desirable. To this end, we propose use
of an efficient symmetric splitting integrator in mSGNHT, instead of the
traditional Euler integrator. We demonstrate that the proposed scheme is more
accurate, robust, and converges faster. These properties are demonstrated to be
desirable in Bayesian deep learning. Extensive experiments on two canonical
models and their deep extensions demonstrate that the proposed scheme improves
general Bayesian posterior sampling, particularly for deep models.Comment: AAAI 201
Bayesian Policy Gradients via Alpha Divergence Dropout Inference
Policy gradient methods have had great success in solving continuous control
tasks, yet the stochastic nature of such problems makes deterministic value
estimation difficult. We propose an approach which instead estimates a
distribution by fitting the value function with a Bayesian Neural Network. We
optimize an -divergence objective with Bayesian dropout approximation
to learn and estimate this distribution. We show that using the Monte Carlo
posterior mean of the Bayesian value function distribution, rather than a
deterministic network, improves stability and performance of policy gradient
methods in continuous control MuJoCo simulations.Comment: Accepted to Bayesian Deep Learning Workshop at NIPS 201
- …