1,054 research outputs found
Multimodal Hierarchical Dirichlet Process-based Active Perception
In this paper, we propose an active perception method for recognizing object
categories based on the multimodal hierarchical Dirichlet process (MHDP). The
MHDP enables a robot to form object categories using multimodal information,
e.g., visual, auditory, and haptic information, which can be observed by
performing actions on an object. However, performing many actions on a target
object requires a long time. In a real-time scenario, i.e., when the time is
limited, the robot has to determine the set of actions that is most effective
for recognizing a target object. We propose an MHDP-based active perception
method that uses the information gain (IG) maximization criterion and lazy
greedy algorithm. We show that the IG maximization criterion is optimal in the
sense that the criterion is equivalent to a minimization of the expected
Kullback--Leibler divergence between a final recognition state and the
recognition state after the next set of actions. However, a straightforward
calculation of IG is practically impossible. Therefore, we derive an efficient
Monte Carlo approximation method for IG by making use of a property of the
MHDP. We also show that the IG has submodular and non-decreasing properties as
a set function because of the structure of the graphical model of the MHDP.
Therefore, the IG maximization problem is reduced to a submodular maximization
problem. This means that greedy and lazy greedy algorithms are effective and
have a theoretical justification for their performance. We conducted an
experiment using an upper-torso humanoid robot and a second one using synthetic
data. The experimental results show that the method enables the robot to select
a set of actions that allow it to recognize target objects quickly and
accurately. The results support our theoretical outcomes.Comment: submitte
A Review of Codebook Models in Patch-Based Visual Object Recognition
The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performances on current datasets. The key role of a visual codebook is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step which is usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution and it follows that the resulting codebook need not have discriminant properties. This is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook, to constructing a discriminant codebook in a one-pass design procedure that slightly outperforms more traditional approaches at drastically reduced computing times. In this review we survey several approaches that have been proposed over the last decade with their use of feature detectors, descriptors, codebook construction schemes, choice of classifiers in recognising objects, and datasets that were used in evaluating the proposed methods
Integrating Document Clustering and Topic Modeling
Document clustering and topic modeling are two closely related tasks which
can mutually benefit each other. Topic modeling can project documents into a
topic space which facilitates effective document clustering. Cluster labels
discovered by document clustering can be incorporated into topic models to
extract local topics specific to each cluster and global topics shared by all
clusters. In this paper, we propose a multi-grain clustering topic model
(MGCTM) which integrates document clustering and topic modeling into a unified
framework and jointly performs the two tasks to achieve the overall best
performance. Our model tightly couples two components: a mixture component used
for discovering latent groups in document collection and a topic model
component used for mining multi-grain topics including local topics specific to
each cluster and global topics shared across clusters.We employ variational
inference to approximate the posterior of hidden variables and learn model
parameters. Experiments on two datasets demonstrate the effectiveness of our
model.Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty
in Artificial Intelligence (UAI2013
Dirichlet belief networks for topic structure learning
Recently, considerable research effort has been devoted to developing deep
architectures for topic models to learn topic structures. Although several deep
models have been proposed to learn better topic proportions of documents, how
to leverage the benefits of deep structures for learning word distributions of
topics has not yet been rigorously studied. Here we propose a new multi-layer
generative process on word distributions of topics, where each layer consists
of a set of topics and each topic is drawn from a mixture of the topics of the
layer above. As the topics in all layers can be directly interpreted by words,
the proposed model is able to discover interpretable topic hierarchies. As a
self-contained module, our model can be flexibly adapted to different kinds of
topic models to improve their modelling accuracy and interpretability.
Extensive experiments on text corpora demonstrate the advantages of the
proposed model.Comment: accepted in NIPS 201
- …