Nested Hierarchical Dirichlet Processes
We develop a nested hierarchical Dirichlet process (nHDP) for hierarchical
topic modeling. The nHDP is a generalization of the nested Chinese restaurant
process (nCRP) that allows each word to follow its own path to a topic node
according to a document-specific distribution on a shared tree. This alleviates
the rigid, single-path formulation of the nCRP, allowing a document to more
easily express thematic borrowings as a random effect. We derive a stochastic
variational inference algorithm for the model, in addition to a greedy subtree
selection method for each document, which allows for efficient inference using
massive collections of text documents. We demonstrate our algorithm on 1.8
million documents from The New York Times and 3.3 million documents from
Wikipedia.
Comment: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence, Special Issue on Bayesian Nonparametrics.
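The per-word path mechanism can be sketched in a few lines. Below is a minimal, illustrative simulation (not the paper's inference algorithm): a shared topic tree, and document-specific stopping and branching probabilities under which each word samples its own root-to-topic path. The tree layout and probability values are made-up toy numbers.

```python
import random

# Toy shared topic tree: node id -> list of child ids (hypothetical structure).
TREE = {0: [1, 2], 1: [3, 4], 2: [5], 3: [], 4: [], 5: []}

def sample_word_path(stop_prob, child_weights, rng):
    """Sample one word's root-to-topic path: each word follows its own
    path down the shared tree according to document-specific stopping
    and branching probabilities (the nHDP's relaxation of the nCRP's
    single path per document)."""
    node, path = 0, [0]
    while TREE[node]:                            # descend until a leaf
        if rng.random() < stop_prob[node]:       # word stops at this node's topic
            break
        children = TREE[node]
        weights = [child_weights[(node, c)] for c in children]
        node = rng.choices(children, weights=weights)[0]
        path.append(node)
    return path

rng = random.Random(0)
stop = {n: 0.3 for n in TREE}                    # toy per-document stop probabilities
branch = {(n, c): 1.0 for n in TREE for c in TREE[n]}
paths = [sample_word_path(stop, branch, rng) for _ in range(8)]
```

Because `stop` and `branch` are document-specific while `TREE` is shared, two documents can emphasize different subtrees of the same hierarchy, which is the "thematic borrowing" effect the abstract describes.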
Expressive recommender systems through normalized nonnegative models
We introduce normalized nonnegative models (NNMs) for explorative data
analysis. NNMs are partial convexifications of models from probability theory.
We demonstrate their value using item recommendation as an example. We show that
NNM-based recommender systems satisfy three criteria that all recommender
systems should ideally satisfy: high predictive power, computational
tractability, and expressive representations of users and items. Expressive
user and item representations are important in practice to succinctly summarize
the pool of customers and the pool of items. In NNMs, user representations are
expressive because each user's preference can be regarded as a normalized mixture
of the preferences of stereotypical users. The interpretability of item and user
representations allows us to arrange properties of items (e.g., genres of movies
or topics of documents) or users (e.g., personality traits) hierarchically.
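The mixture view above has a compact numerical illustration. The sketch below uses made-up numbers (3 hypothetical stereotypes, 4 items); the key NNM property it shows is that a convex mixture of normalized stereotype preferences is itself a normalized preference distribution.

```python
import numpy as np

# Hypothetical numbers: 3 stereotypical users, 4 items. Each stereotype
# row is a probability distribution over items (nonnegative entries,
# rows summing to one), in the spirit of a normalized nonnegative model.
P = np.array([[0.70, 0.10, 0.10, 0.10],   # stereotype A: strongly prefers item 0
              [0.10, 0.70, 0.10, 0.10],   # stereotype B: strongly prefers item 1
              [0.25, 0.25, 0.25, 0.25]])  # stereotype C: indifferent

# One user's representation: normalized mixture weights over stereotypes.
w = np.array([0.5, 0.3, 0.2])

pred = w @ P   # predicted preference distribution over the 4 items
```

Since `w` lies on the probability simplex and each row of `P` is normalized, `pred` sums to one; reading off `w` directly tells us which stereotypes this user resembles, which is the expressiveness the abstract emphasizes.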
Unsupervised Terminological Ontology Learning based on Hierarchical Topic Modeling
In this paper, we present hierarchical relation-based latent Dirichlet
allocation (hrLDA), a data-driven hierarchical topic model for extracting
terminological ontologies from a large number of heterogeneous documents. In
contrast to traditional topic models, hrLDA relies on noun phrases instead of
unigrams, considers syntax and document structures, and enriches topic
hierarchies with topic relations. Through a series of experiments, we
demonstrate the superiority of hrLDA over existing topic models, especially for
building hierarchies. Furthermore, we illustrate the robustness of hrLDA in the
settings of noisy data sets, which are likely to occur in many practical
scenarios. Our ontology evaluation results show that the ontologies extracted
by hrLDA are highly competitive with those created by domain experts.
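The switch from unigrams to noun phrases can be illustrated with a tiny chunker. The rule and tag set below are illustrative stand-ins (hrLDA itself would rely on a real parser and richer document structure): a candidate noun phrase is a maximal run of adjectives and nouns, trimmed to end in a noun.

```python
# Minimal stand-in chunker for noun-phrase extraction. The tag set and
# the ADJ/NOUN run rule are illustrative assumptions, not hrLDA's parser.
def noun_phrases(tagged):
    phrases, run = [], []
    for word, tag in tagged + [("", "END")]:     # sentinel flushes the last run
        if tag in ("ADJ", "NOUN"):
            run.append((word, tag))
        else:
            while run and run[-1][1] != "NOUN":  # drop trailing adjectives
                run.pop()
            if run:
                phrases.append(" ".join(w for w, _ in run))
            run = []
    return phrases

tagged = [("hierarchical", "ADJ"), ("topic", "NOUN"), ("model", "NOUN"),
          ("extracts", "VERB"), ("terminological", "ADJ"),
          ("ontologies", "NOUN")]
candidates = noun_phrases(tagged)  # multi-word terms instead of unigrams
```

Feeding such multi-word terms into the topic model is what lets the extracted topics read as ontology concepts ("terminological ontologies") rather than bags of isolated words.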
CINet: A Learning Based Approach to Incremental Context Modeling in Robots
There have been several attempts at modeling context in robots. However,
these attempts either assume a fixed number of contexts or use a rule-based
approach to determine when to increment the number of contexts. In this paper,
we pose the task of when to increment as a learning problem, which we solve
using a Recurrent Neural Network. We show that the network successfully (with
98% testing accuracy) learns to predict when to increment, and demonstrate, in
a scene modeling problem (where the correct number of contexts is not known),
that the robot increments the number of contexts in an expected manner (i.e.,
the entropy of the system is reduced). We also present how the incremental
model can be used for various scene reasoning tasks.
Comment: The first two authors contributed equally; 6 pages, 8 figures; International Conference on Intelligent Robots and Systems (IROS 2018).
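To make the "increment as a learning problem" framing concrete, here is a minimal recurrent sketch: an untrained Elman RNN with random weights that reads a sequence of per-scene feature vectors and emits a probability of adding a new context. The architecture, sizes, and features are assumptions for illustration, not the paper's network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal Elman RNN (untrained, random weights) sketching the decision
# component: it consumes a sequence of per-scene feature vectors (e.g.,
# how poorly the current contexts explain each scene) and outputs a
# probability of incrementing the number of contexts. Sizes are made up.
H, D = 8, 3
Wx = rng.normal(scale=0.1, size=(H, D))   # input-to-hidden weights
Wh = rng.normal(scale=0.1, size=(H, H))   # hidden-to-hidden recurrence
Wo = rng.normal(scale=0.1, size=(1, H))   # hidden-to-output weights

def increment_prob(scene_features):
    h = np.zeros(H)
    for x in scene_features:              # one recurrent step per observed scene
        h = np.tanh(Wx @ x + Wh @ h)
    return float(1.0 / (1.0 + np.exp(-(Wo @ h)[0])))  # sigmoid: P(add a context)

p = increment_prob(rng.normal(size=(5, D)))  # five scenes' toy features
```

In the paper this decision is learned from data (hence the 98% test accuracy), replacing the hand-tuned rules of earlier approaches; the sketch only shows the input/output shape of such a learner.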
A Deep Incremental Boltzmann Machine for Modeling Context in Robots
Context is an essential capability for robots that are to be as adaptive as
possible in challenging environments. Although there are many context modeling
efforts, they assume a fixed structure and number of contexts. In this paper,
we propose an incremental deep model that extends Restricted Boltzmann
Machines. Our model gets one scene at a time, and gradually extends the
contextual model when necessary, either by adding a new context or a new
context layer to form a hierarchy. We show on a scene classification benchmark
that our method converges to a good estimate of the contexts of the scenes, and
performs better or on-par on several tasks compared to other incremental models
or non-incremental models.
Comment: 6 pages, 5 figures; International Conference on Robotics and Automation (ICRA 2018).
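The growth mechanism can be caricatured in a few lines. The sketch below keeps one hidden unit per context in an RBM-like model and appends a new unit whenever a scene is reconstructed poorly; the growth criterion, threshold, and initialization are toy stand-ins, not the paper's actual rule, and layer growth is omitted.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy sketch of incremental structure growth: one hidden unit per known
# context; append a new unit (a "new context") when a scene's
# reconstruction error exceeds a threshold. All details are illustrative.
class GrowingRBM:
    def __init__(self, n_visible, n_hidden=1):
        self.W = rng.normal(scale=0.1, size=(n_visible, n_hidden))

    def reconstruct(self, v):
        h = 1.0 / (1.0 + np.exp(-(v @ self.W)))        # hidden activations
        return 1.0 / (1.0 + np.exp(-(h @ self.W.T)))   # visible probabilities

    def observe(self, v, threshold=0.2):
        err = np.mean((v - self.reconstruct(v)) ** 2)
        if err > threshold:                            # scene poorly explained:
            self.W = np.hstack([self.W, 0.1 * (v - 0.5).reshape(-1, 1)])
        return self.W.shape[1]                         # current number of contexts

rbm = GrowingRBM(n_visible=6)
scenes = (rng.random((10, 6)) > 0.5).astype(float)     # toy binary scenes
sizes = [rbm.observe(v) for v in scenes]               # contexts over time
```

The point of the sketch is the control flow: the model sees one scene at a time and only grows when the current contexts fail to explain it, rather than fixing the number of contexts in advance.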
A Novel Document Generation Process for Topic Detection based on Hierarchical Latent Tree Models
We propose a novel document generation process based on hierarchical latent
tree models (HLTMs) learned from data. An HLTM has a layer of observed word
variables at the bottom and multiple layers of latent variables on top. For
each document, we first sample values for the latent variables layer by layer
via logic sampling, then draw relative frequencies for the words conditioned on
the values of the latent variables, and finally generate words for the document
using the relative word frequencies. The motivation for the work is to take
word counts into consideration with HLTMs. In comparison with LDA-based
hierarchical document generation processes, the new process achieves
drastically better model fit with far fewer parameters. It also yields more
meaningful topics and topic hierarchies, establishing a new state of the art
for hierarchical topic detection.
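The three-stage generative process described above (top-down latent sampling, then drawing relative word frequencies conditioned on the latents, then generating word counts) can be sketched on a toy model. The structure and all probabilities below are made-up values for a two-layer HLTM with binary latents, not a model learned from data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-layer HLTM: a binary root latent, two binary child latents,
# and four observed word variables. CPTs and rates are illustrative.
def sample_document(n_words=50):
    # 1) Sample latents top-down, layer by layer (logic/forward sampling).
    z = rng.random() < 0.5                                  # root latent
    kids = [rng.random() < (0.8 if z else 0.2) for _ in range(2)]
    # 2) Draw relative word frequencies conditioned on the latent states:
    #    each child latent biases the rates of its two words.
    rates = np.array([2.0 if kids[0] else 0.5] * 2 +
                     [2.0 if kids[1] else 0.5] * 2)
    freqs = rng.dirichlet(rates)            # relative word frequencies
    # 3) Generate the document's word counts from those frequencies.
    return rng.multinomial(n_words, freqs)

counts = sample_document()  # word counts for one sampled document
```

Step 3 is what lets the model account for word counts rather than only word occurrence, which the abstract names as the motivation for extending HLTMs this way.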