Discovering Discrete Latent Topics with Neural Variational Inference
Topic models have been widely explored as probabilistic generative models of
documents. Traditional inference methods have sought closed-form derivations
for updating the models; however, as the expressiveness of these models grows,
so does the difficulty of performing fast and accurate inference over their
parameters. This paper presents alternative neural approaches to topic
modelling by providing parameterisable distributions over topics which permit
training by backpropagation in the framework of neural variational inference.
In addition, with the help of a stick-breaking construction, we propose a
recurrent network that is able to discover a notionally unbounded number of
topics, analogous to Bayesian non-parametric topic models. Experimental results
on the MXM Song Lyrics, 20NewsGroups and Reuters News datasets demonstrate the
effectiveness and efficiency of these neural topic models.
Comment: ICML 201
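The stick-breaking construction mentioned in this abstract can be illustrated with a minimal sketch. This is the generic construction only, not the paper's recurrent network: the function name `stick_breaking` and the fixed list of break fractions are assumptions made for illustration, whereas in the paper the fractions would be produced by the model.

```python
from typing import List


def stick_breaking(fractions: List[float]) -> List[float]:
    """Turn break fractions in (0, 1) into topic proportions that sum to 1.

    Hypothetical helper for illustration: each fraction takes a share of
    whatever is left of a unit-length stick.
    """
    remaining = 1.0
    proportions = []
    for eta in fractions:
        proportions.append(eta * remaining)  # share taken by this topic
        remaining *= 1.0 - eta               # stick left for later topics
    proportions.append(remaining)            # leftover stick is the last topic
    return proportions


weights = stick_breaking([0.5, 0.5, 0.5])
print(weights)
```

Because each new break only subdivides the leftover stick, further topics can be added without renormalising the earlier ones, which is what makes the number of topics notionally unbounded.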
MedLDA: A General Framework of Maximum Margin Supervised Topic Models
Supervised topic models utilize documents' side information for discovering
predictive low-dimensional representations of documents. Existing models apply
likelihood-based estimation. In this paper, we present a general framework
of max-margin supervised topic models for both continuous and categorical
response variables. Our approach, the maximum entropy discrimination latent
Dirichlet allocation (MedLDA), utilizes the max-margin principle to train
supervised topic models and estimate predictive topic representations that are
arguably more suitable for prediction tasks. The general principle of MedLDA
can be applied to perform joint max-margin learning and maximum likelihood
estimation for arbitrary topic models, directed or undirected, and supervised
or unsupervised, when the supervised side information is available. We develop
efficient variational methods for posterior inference and parameter estimation,
and demonstrate qualitatively and quantitatively the advantages of MedLDA over
likelihood-based topic models on movie review and 20 Newsgroups data sets.
Comment: 27 pages
Unsupervised Dialog Structure Learning
Learning a shared dialog structure from a set of task-oriented dialogs is an
important challenge in computational linguistics. The learned dialog structure
can shed light on how to analyze human dialogs, and more importantly contribute
to the design and evaluation of dialog systems. We propose to extract dialog
structures using a modified VRNN model with discrete latent vectors. Unlike
existing HMM-based models, our model is based on a variational autoencoder
(VAE). Such a model is able to capture more dynamics in dialogs beyond the
surface forms of the language. We find that qualitatively, our method extracts
meaningful dialog structure, and quantitatively, outperforms previous models on
the ability to predict unseen data. We further evaluate the model's
effectiveness in a downstream task, the dialog system building task.
Experiments show that, by integrating the learned dialog structure into the
reward function design, the model converges faster and to a better outcome in a
reinforcement learning setting.
Comment: Long paper accepted by NAACL 201
Variational Autoencoders for Sparse and Overdispersed Discrete Data
Many applications, such as text modelling, high-throughput sequencing, and
recommender systems, require analysing sparse, high-dimensional, and
overdispersed discrete (count-valued or binary) data. Although probabilistic
matrix factorisation and linear/nonlinear latent factor models have enjoyed
great success in modelling such data, many existing models may have inferior
modelling performance due to the insufficient capability of modelling
overdispersion in count-valued data and model misspecification in general. In
this paper, we comprehensively study these issues and propose a variational
autoencoder-based framework that generates discrete data via a negative-binomial
distribution. We also examine the model's ability to capture properties, such
as self- and cross-excitations in discrete data, which are critical for
modelling overdispersion. We conduct extensive experiments on three important
problems from discrete data analysis: text analysis, collaborative filtering,
and multi-label learning. Compared with several state-of-the-art baselines, the
proposed models achieve significantly better performance on the above problems.
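To make the notion of overdispersion concrete, here is a small sketch comparing the variance of a Poisson distribution with that of a negative binomial at the same mean. It uses the common mean/dispersion parameterisation, which is an assumption for illustration and not necessarily the one used in the paper; the function names are hypothetical.

```python
def poisson_variance(mean: float) -> float:
    """A Poisson distribution has variance equal to its mean."""
    return mean


def negative_binomial_variance(mean: float, dispersion: float) -> float:
    """Variance of a negative binomial with the given mean and dispersion r.

    Assumed parameterisation: Var = mean + mean**2 / r, so the variance
    always exceeds the mean and approaches the Poisson case as r grows.
    """
    return mean + mean ** 2 / dispersion


mean = 4.0
print(poisson_variance(mean))                  # variance equals the mean
print(negative_binomial_variance(mean, 2.0))   # variance exceeds the mean
```

The extra `mean**2 / dispersion` term is what lets a negative-binomial likelihood fit count data whose variance is larger than a Poisson model allows.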
Topic Memory Networks for Short Text Classification
Many classification models work poorly on short texts due to data sparsity.
To address this issue, we propose topic memory networks for short text
classification with a novel topic memory mechanism to encode latent topic
representations indicative of class labels. Different from most prior work that
focuses on extending features with external knowledge or pre-trained topics,
our model jointly explores topic inference and text classification with memory
networks in an end-to-end manner. Experimental results on four benchmark
datasets show that our model outperforms state-of-the-art models on short text
classification while also generating coherent topics.
Comment: EMNLP 201
Familia: A Configurable Topic Modeling Framework for Industrial Text Engineering
In the last decade, a variety of topic models have been proposed for text
engineering. However, except for Probabilistic Latent Semantic Analysis (PLSA) and
Latent Dirichlet Allocation (LDA), most existing topic models are seldom
applied or considered in industrial scenarios. This phenomenon is caused by the
fact that there are very few convenient tools to support these topic models so
far. Intimidated by the demanding expertise and labor of designing and
implementing parameter inference algorithms, software engineers are prone to
simply resort to PLSA/LDA, without considering whether it is proper for their
problem at hand or not. In this paper, we propose a configurable topic modeling
framework named Familia, in order to bridge the huge gap between academic
research fruits and current industrial practice. Familia supports an important
line of topic models that are widely applicable in text engineering scenarios.
In order to relieve the burden on software engineers without knowledge of Bayesian
networks, Familia is able to conduct automatic parameter inference for a
variety of topic models. Simply through changing the data organization of
Familia, software engineers are able to easily explore a broad spectrum of
existing topic models or even design their own topic models, and find the one
that best suits the problem at hand. With its superior extendability, Familia
has a novel sampling mechanism that strikes a balance between effectiveness and
efficiency of parameter inference. Furthermore, Familia is essentially a big
topic modeling framework that supports parallel parameter inference and
distributed parameter storage. The utilities and necessity of Familia are
demonstrated in real-life industrial applications. Familia would significantly
enlarge software engineers' arsenal of topic models and pave the way for
utilizing highly customized topic models in real-life problems.
Comment: 21 pages, 15 figures
Structured Neural Topic Models for Reviews
We present Variational Aspect-based Latent Topic Allocation (VALTA), a family
of autoencoding topic models that learn aspect-based representations of
reviews. VALTA defines a user-item encoder that maps bag-of-words vectors for
combined reviews associated with each paired user and item onto structured
embeddings, which in turn define per-aspect topic weights. We model individual
reviews in a structured manner by inferring an aspect assignment for each
sentence in a given review, where the per-aspect topic weights obtained by the
user-item encoder serve to define a mixture over topics, conditioned on the
aspect. The result is an autoencoding neural topic model for reviews, which can
be trained in a fully unsupervised manner to learn topics that are structured
into aspects. Experimental evaluation on a large number of datasets demonstrates
that aspects are interpretable, yield higher coherence scores than
non-structured autoencoding topic model variants, and can be utilized to
perform aspect-based comparison and genre discovery.
Unsupervised and interpretable scene discovery with Discrete-Attend-Infer-Repeat
In this work we present Discrete Attend Infer Repeat (Discrete-AIR), a
Recurrent Auto-Encoder with structured latent distributions containing discrete
categorical distributions, continuous attribute distributions, and factorised
spatial attention. While inspired by the original AIR model and retaining the AIR
model's capability to identify objects in an image, Discrete-AIR provides
direct interpretability of the latent codes. We show that for Multi-MNIST and a
multiple-objects version of the dSprites dataset, the Discrete-AIR model needs just
one categorical latent variable, one attribute variable (for Multi-MNIST only),
together with spatial attention variables, for efficient inference. We perform
analysis to show that the learnt categorical distributions effectively capture
the categories of objects in the scene for Multi-MNIST and for Multi-Sprites.
A Tutorial on Deep Latent Variable Models of Natural Language
There has been much recent, exciting work on combining the complementary
strengths of latent variable models and deep learning. Latent variable modeling
makes it easy to explicitly specify model constraints through conditional
independence properties, while deep learning makes it possible to parameterize
these conditional likelihoods with powerful function approximators. While these
"deep latent variable" models provide a rich, flexible framework for modeling
many real-world phenomena, difficulties exist: deep parameterizations of
conditional likelihoods usually make posterior inference intractable, and
latent variable objectives often complicate backpropagation by introducing
points of non-differentiability. This tutorial explores these issues in depth
through the lens of variational inference.
Comment: EMNLP 2018 Tutorial
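For orientation, the standard objective behind variational inference is the evidence lower bound (ELBO). The notation below is chosen here for illustration and is not taken from the tutorial: $x$ is the observed data, $z$ the latent variable, $p_\theta$ the model, and $q_\phi$ the approximate posterior.

```latex
\log p_\theta(x)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x, z) - \log q_\phi(z \mid x) \right]
  \;=\;
  \log p_\theta(x) - \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p_\theta(z \mid x) \right)
```

The bound is tight exactly when $q_\phi(z \mid x)$ matches the true posterior, which is why an intractable posterior turns inference into an optimisation problem over $\phi$.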
What You Say and How You Say it: Joint Modeling of Topics and Discourse in Microblog Conversations
This paper presents an unsupervised framework for jointly modeling topic
content and discourse behavior in microblog conversations. Concretely, we
propose a neural model to discover word clusters indicating what a conversation
concerns (i.e., topics) and those reflecting how participants voice their
opinions (i.e., discourse). Extensive experiments show that our model can yield
both coherent topics and meaningful discourse behavior. Further study shows
that our topic and discourse representations can benefit the classification of
microblog messages, especially when they are jointly trained with the
classifier.
Comment: Accepted in Transactions of the Association for Computational
Linguistics