3,164 research outputs found
Feature discovery and visualization of robot mission data using convolutional autoencoders and Bayesian nonparametric topic models
The gap between our ability to collect interesting data and our ability to
analyze these data is growing at an unprecedented rate. Recent algorithmic
attempts to fill this gap have employed unsupervised tools to discover
structure in data. Some of the most successful approaches have used
probabilistic models to uncover latent thematic structure in discrete data.
Despite the success of these models on textual data, they have not generalized
as well to image data, in part because of the spatial and temporal structure
that may exist in an image stream.
We introduce a novel unsupervised machine learning framework that
incorporates the ability of convolutional autoencoders to discover features
from images that directly encode spatial information, within a Bayesian
nonparametric topic model that discovers meaningful latent patterns within
discrete data. By using this hybrid framework, we overcome the fundamental
dependency of traditional topic models on rigidly hand-coded data
representations, while simultaneously encoding spatial dependency in our topics
without adding model complexity. We apply this model to the motivating
application of high-level scene understanding and mission summarization for
exploratory marine robots. Our experiments on a seafloor dataset collected by a
marine robot show that the proposed hybrid framework outperforms current
state-of-the-art approaches on the task of unsupervised seafloor terrain
characterization.Comment: 8 page
Multimodal Hierarchical Dirichlet Process-based Active Perception
In this paper, we propose an active perception method for recognizing object
categories based on the multimodal hierarchical Dirichlet process (MHDP). The
MHDP enables a robot to form object categories using multimodal information,
e.g., visual, auditory, and haptic information, which can be observed by
performing actions on an object. However, performing many actions on a target
object requires a long time. In a real-time scenario, i.e., when the time is
limited, the robot has to determine the set of actions that is most effective
for recognizing a target object. We propose an MHDP-based active perception
method that uses the information gain (IG) maximization criterion and lazy
greedy algorithm. We show that the IG maximization criterion is optimal in the
sense that the criterion is equivalent to a minimization of the expected
Kullback--Leibler divergence between a final recognition state and the
recognition state after the next set of actions. However, a straightforward
calculation of IG is practically impossible. Therefore, we derive an efficient
Monte Carlo approximation method for IG by making use of a property of the
MHDP. We also show that the IG has submodular and non-decreasing properties as
a set function because of the structure of the graphical model of the MHDP.
Therefore, the IG maximization problem is reduced to a submodular maximization
problem. This means that greedy and lazy greedy algorithms are effective and
have a theoretical justification for their performance. We conducted an
experiment using an upper-torso humanoid robot and a second one using synthetic
data. The experimental results show that the method enables the robot to select
a set of actions that allow it to recognize target objects quickly and
accurately. The results support our theoretical outcomes.Comment: submitte
A Novel Document Generation Process for Topic Detection based on Hierarchical Latent Tree Models
We propose a novel document generation process based on hierarchical latent
tree models (HLTMs) learned from data. An HLTM has a layer of observed word
variables at the bottom and multiple layers of latent variables on top. For
each document, we first sample values for the latent variables layer by layer
via logic sampling, then draw relative frequencies for the words conditioned on
the values of the latent variables, and finally generate words for the document
using the relative word frequencies. The motivation for the work is to take
word counts into consideration with HLTMs. In comparison with LDA-based
hierarchical document generation processes, the new process achieves
drastically better model fit with much fewer parameters. It also yields more
meaningful topics and topic hierarchies. It is the new state-of-the-art for the
hierarchical topic detection
Hierarchically Clustered Representation Learning
The joint optimization of representation learning and clustering in the
embedding space has experienced a breakthrough in recent years. In spite of the
advance, clustering with representation learning has been limited to flat-level
categories, which often involves cohesive clustering with a focus on instance
relations. To overcome the limitations of flat clustering, we introduce
hierarchically-clustered representation learning (HCRL), which simultaneously
optimizes representation learning and hierarchical clustering in the embedding
space. Compared with a few prior works, HCRL firstly attempts to consider a
generation of deep embeddings from every component of the hierarchy, not just
leaf components. In addition to obtaining hierarchically clustered embeddings,
we can reconstruct data by the various abstraction levels, infer the intrinsic
hierarchical structure, and learn the level-proportion features. We conducted
evaluations with image and text domains, and our quantitative analyses showed
competent likelihoods and the best accuracies compared with the baselines.Comment: 10 pages, 7 figures, Under review as a conference pape
Factorized Topic Models
In this paper we present a modification to a latent topic model, which makes
the model exploit supervision to produce a factorized representation of the
observed data. The structured parameterization separately encodes variance that
is shared between classes from variance that is private to each class by the
introduction of a new prior over the topic space. The approach allows for a
more eff{}icient inference and provides an intuitive interpretation of the data
in terms of an informative signal together with structured noise. The
factorized representation is shown to enhance inference performance for image,
text, and video classification.Comment: ICLR 201
- …