8,512 research outputs found
Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization
Semantic specialization is the process of fine-tuning pre-trained
distributional word vectors using external lexical knowledge (e.g., WordNet) to
accentuate a particular semantic relation in the specialized vector space.
While post-processing specialization methods are applicable to arbitrary
distributional vectors, they are limited to updating only the vectors of words
occurring in external lexicons (i.e., seen words), leaving the vectors of all
other words unchanged. We propose a novel approach to specializing the full
distributional vocabulary. Our adversarial post-specialization method
propagates the external lexical knowledge to the full distributional space. We
exploit words seen in the resources as training examples for learning a global
specialization function. This function is learned by combining a standard
L2-distance loss with an adversarial loss: the adversarial component produces
more realistic output vectors. We show the effectiveness and robustness of the
proposed method across three languages and on three tasks: word similarity,
dialog state tracking, and lexical simplification. We report consistent
improvements over distributional word vectors and vectors specialized by other
state-of-the-art specialization frameworks. Finally, we also propose a
cross-lingual transfer method for zero-shot specialization which successfully
specializes a full target distributional space without any lexical knowledge in
the target language and without any bilingual data.Comment: Accepted at EMNLP 201
Semantic Embedding Space for Zero-Shot Action Recognition
The number of categories for action recognition is growing rapidly. It is
thus becoming increasingly hard to collect sufficient training data to learn
conventional models for each category. This issue may be ameliorated by the
increasingly popular 'zero-shot learning' (ZSL) paradigm. In this framework a
mapping is constructed between visual features and a human interpretable
semantic description of each category, allowing categories to be recognised in
the absence of any training data. Existing ZSL studies focus primarily on image
data, and attribute-based semantic representations. In this paper, we address
zero-shot recognition in contemporary video action recognition tasks, using
semantic word vector space as the common space to embed videos and category
labels. This is more challenging because the mapping between the semantic space
and space-time features of videos containing complex actions is more complex
and harder to learn. We demonstrate that a simple self-training and data
augmentation strategy can significantly improve the efficacy of this mapping.
Experiments on human action datasets including HMDB51 and UCF101 demonstrate
that our approach achieves the state-of-the-art zero-shot action recognition
performance.Comment: 5 page
Zero-Shot Learning by Convex Combination of Semantic Embeddings
Several recent publications have proposed methods for mapping images into
continuous semantic embedding spaces. In some cases the embedding space is
trained jointly with the image transformation. In other cases the semantic
embedding space is established by an independent natural language processing
task, and then the image transformation into that space is learned in a second
stage. Proponents of these image embedding systems have stressed their
advantages over the traditional \nway{} classification framing of image
understanding, particularly in terms of the promise for zero-shot learning --
the ability to correctly annotate images of previously unseen object
categories. In this paper, we propose a simple method for constructing an image
embedding system from any existing \nway{} image classifier and a semantic word
embedding model, which contains the \n class labels in its vocabulary. Our
method maps images into the semantic embedding space via convex combination of
the class label embedding vectors, and requires no additional training. We show
that this simple and direct method confers many of the advantages associated
with more complex image embedding schemes, and indeed outperforms state of the
art methods on the ImageNet zero-shot learning task
- …