Stochastic Divergence Minimization for Biterm Topic Model
With the emergence and thriving development of social networks, a huge
number of short texts are accumulated and need to be processed. Inferring
latent topics of collected short texts is useful for understanding their hidden
structure and predicting new contents. Unlike conventional topic models such as
latent Dirichlet allocation (LDA), a biterm topic model (BTM) was recently
proposed for short texts to overcome the sparseness of document-level word
co-occurrences by directly modeling the generation process of word pairs.
Stochastic inference algorithms based on collapsed Gibbs sampling (CGS) and
collapsed variational inference have been proposed for BTM. However, they
either require large computational complexity, or rely on very crude
estimation. In this work, we develop a stochastic divergence minimization
inference algorithm for BTM to estimate latent topics more accurately in a
scalable way. Experiments demonstrate the superiority of our proposed algorithm
compared with existing inference algorithms. Comment: 19 pages, 4 figures
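The abstract above describes BTM as modeling word pairs directly rather than document-level co-occurrences. A minimal sketch of that preprocessing idea follows, assuming each short text is already tokenized and that every unordered pair of tokens in one text counts as a biterm; the function names and the toy corpus are hypothetical, and the inference algorithm itself is not shown.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) in one short text.

    Assumption: the whole text is treated as a single co-occurrence window.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def count_biterms(corpus):
    """Aggregate biterm counts over the whole corpus, not per document."""
    counts = Counter()
    for tokens in corpus:
        counts.update(extract_biterms(tokens))
    return counts


# Hypothetical toy corpus of two tokenized short texts.
corpus = [["apple", "phone", "launch"], ["phone", "launch"]]
biterm_counts = count_biterms(corpus)
```

Pooling the pairs across the corpus is what sidesteps the sparsity of any single short text: a pair that appears in many texts accumulates evidence even though each text is only a few words long.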
Zero-Shot Hashing via Transferring Supervised Knowledge
Hashing has shown its efficiency and effectiveness in facilitating
large-scale multimedia applications. Supervised knowledge (e.g., semantic labels
or pair-wise relationships) associated with data is capable of significantly
improving the quality of hash codes and hash functions. However, confronted
with the rapid growth of newly-emerging concepts and multimedia data on the
Web, existing supervised hashing approaches may easily suffer from the scarcity
and validity of supervised information due to the expensive cost of manual
labelling. In this paper, we propose a novel hashing scheme, termed
\emph{zero-shot hashing} (ZSH), which compresses images of "unseen" categories
to binary codes with hash functions learned from limited training data of
"seen" categories. Specifically, we project independent data labels (i.e.,
0/1-form label vectors) into semantic embedding space, where semantic
relationships among all the labels can be precisely characterized and thus seen
supervised knowledge can be transferred to unseen classes. Moreover, in order
to cope with the semantic shift problem, we rotate the embedded space to more
suitably align the embedded semantics with the low-level visual feature space,
thereby alleviating the influence of semantic gap. In the meantime, to exert
positive effects on learning high-quality hash functions, we further propose to
preserve local structural property and discrete nature in binary codes.
Besides, we develop an efficient alternating algorithm to solve the ZSH model.
Extensive experiments conducted on various real-life datasets show the superior
zero-shot image retrieval performance of ZSH as compared to several
state-of-the-art hashing methods. Comment: 11 pages
D-Bees: A Novel Method Inspired by Bee Colony Optimization for Solving Word Sense Disambiguation
Word sense disambiguation (WSD) is a problem in the field of computational
linguistics given as finding the intended sense of a word (or a set of words)
when it is activated within a certain context. WSD was recently addressed as a
combinatorial optimization problem in which the goal is to find a sequence of
senses that maximizes the semantic relatedness among the target words. In this
article, a novel algorithm for solving the WSD problem called D-Bees is
proposed, which is inspired by bee colony optimization (BCO), where artificial bee
agents collaborate to solve the problem. The D-Bees algorithm is evaluated on a
standard dataset (SemEval 2007 coarse-grained English all-words task corpus) and
is compared to simulated annealing, genetic algorithms, and two ant colony
optimization techniques (ACO). It will be observed that the BCO and ACO
approaches are on par.
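The combinatorial formulation mentioned above (choose one sense per target word so that total semantic relatedness is maximal) can be illustrated with an exhaustive search. This is only a sketch of the objective, not of D-Bees or any colony-based method; the sense labels, the toy glosses, and the gloss-overlap relatedness measure are all assumptions made for the example.

```python
from itertools import combinations, product


def best_sense_sequence(candidates, relatedness):
    """Exhaustively pick one sense per word, maximizing pairwise relatedness."""
    def score(seq):
        return sum(relatedness(a, b) for a, b in combinations(seq, 2))
    return max(product(*candidates), key=score)


# Hypothetical toy glosses; a real system would draw on a sense inventory.
GLOSSES = {
    "bank#river": {"river", "water", "side"},
    "bank#money": {"money", "deposit", "institution"},
    "deposit#money": {"money", "deposit", "institution", "account"},
    "deposit#sediment": {"river", "sediment", "water"},
}


def gloss_overlap(a, b):
    """Assumed relatedness measure: number of shared gloss words."""
    return len(GLOSSES[a] & GLOSSES[b])


candidates = [
    ["bank#river", "bank#money"],
    ["deposit#money", "deposit#sediment"],
]
best = best_sense_sequence(candidates, gloss_overlap)
```

The search space grows as the product of the number of senses per word, which is why heuristic search methods such as those compared in the abstract are used on full texts.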
Mostly-Unsupervised Statistical Segmentation of Japanese Kanji Sequences
Given the lack of word delimiters in written Japanese, word segmentation is
generally considered a crucial first step in processing Japanese texts. Typical
Japanese segmentation algorithms rely either on a lexicon and syntactic
analysis or on pre-segmented data; but these are labor-intensive, and the
lexico-syntactic techniques are vulnerable to the unknown word problem. In
contrast, we introduce a novel, more robust statistical method utilizing
unsegmented training data. Despite its simplicity, the algorithm yields
performance on long kanji sequences comparable to and sometimes surpassing that
of state-of-the-art morphological analyzers over a variety of error metrics.
The algorithm also outperforms another mostly-unsupervised statistical
algorithm previously proposed for Chinese.
Additionally, we present a two-level annotation scheme for Japanese to
incorporate multiple segmentation granularities, and introduce two novel
evaluation metrics, both based on the notion of a compatible bracket, that can
account for multiple granularities simultaneously. Comment: 22 pages. To appear in Natural Language Engineering
ShotgunWSD: An unsupervised algorithm for global word sense disambiguation inspired by DNA sequencing
In this paper, we present a novel unsupervised algorithm for word sense
disambiguation (WSD) at the document level. Our algorithm is inspired by a
widely-used approach in the field of genetics for whole genome sequencing,
known as the Shotgun sequencing technique. The proposed WSD algorithm is based
on three main steps. First, a brute-force WSD algorithm is applied to short
context windows (up to 10 words) selected from the document in order to
generate a short list of likely sense configurations for each window. In the
second step, these local sense configurations are assembled into longer
composite configurations based on suffix and prefix matching. The resulting
configurations are ranked by their length, and the sense of each word is chosen
based on a voting scheme that considers only the top k configurations in which
the word appears. We compare our algorithm with other state-of-the-art
unsupervised WSD algorithms and demonstrate better performance, sometimes by a
very large margin. We also show that our algorithm can yield better performance
than the Most Common Sense (MCS) baseline on one data set. Moreover, our
algorithm has a very small number of parameters, is robust to parameter tuning,
and, unlike other bio-inspired methods, it gives a deterministic solution (it
does not involve random choices). Comment: In Proceedings of EACL 201
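The assembly and voting steps described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: a configuration is assumed to be a pair `(start, senses)` giving the document position of its first word and a tuple of sense labels, the brute-force step that produces the local configurations is omitted, and the names `merge`, `vote`, `min_overlap`, and `k` are hypothetical.

```python
from collections import Counter


def merge(a, b, min_overlap=1):
    """Join b onto a when a's suffix equals b's prefix on the shared positions.

    Returns the longer composite configuration, or None if they do not match.
    """
    (start_a, senses_a), (start_b, senses_b) = a, b
    overlap = start_a + len(senses_a) - start_b
    if start_b <= start_a or overlap < min_overlap or overlap >= len(senses_b):
        return None
    if senses_a[-overlap:] != senses_b[:overlap]:
        return None
    return (start_a, senses_a + senses_b[overlap:])


def vote(configs, position, k=3):
    """Pick the sense at `position` by majority among the k longest configurations."""
    covering = [(s, x) for s, x in configs if s <= position < s + len(x)]
    covering.sort(key=lambda c: len(c[1]), reverse=True)
    votes = Counter(x[position - s] for s, x in covering[:k])
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical sense labels for a four-word document.
merged = merge((0, ("s1", "s2", "s3")), (2, ("s3", "s4")))
configs = [(0, ("a1", "b1", "c1", "d1")), (1, ("b2", "c1")), (1, ("b1",))]
chosen = vote(configs, position=1)
```

Because neither function involves random choices, repeated runs on the same configurations give the same senses, which matches the deterministic behavior the abstract emphasizes.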