Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition
Recently, there has been a lot of interest in automatically generating
descriptions for an image. Most existing language-model based approaches for
this task learn to generate an image description word by word in its original
word order. However, for humans, it is more natural to locate the objects and
their relationships first, and then elaborate on each object, describing
notable attributes. We present a coarse-to-fine method that decomposes the
original image description into a skeleton sentence and its attributes, and
generates the skeleton sentence and attribute phrases separately. By this
decomposition, our method can generate more accurate and novel descriptions
than the previous state of the art. Experimental results on MS-COCO and the
larger-scale Stock3M dataset show that our algorithm yields consistent
improvements across different evaluation metrics, especially on the SPICE
metric, which has much higher correlation with human ratings than the
conventional metrics. Furthermore, our algorithm can generate descriptions with
varied length, benefiting from the separate control of the skeleton and
attributes. This enables image description generation that better accommodates
user preferences.
Comment: Accepted by CVPR 201
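To make the skeleton-attribute decomposition concrete, here is a minimal toy sketch. It is not the authors' model: the abstract describes a learned, image-conditioned generator, whereas this sketch only shows what the two parts of a caption look like and how controlling the attributes changes caption length. The function names `decompose` and `recompose`, the hand-annotated attribute spans, and the assumption that attributes sit directly before their head word are all hypothetical and chosen for illustration.

```python
# Toy illustration only. Attribute spans are hand-annotated here (an assumption);
# the method in the abstract generates skeleton and attributes from an image.

def decompose(tokens, attribute_spans):
    """Split a caption into a skeleton sentence and per-object attribute phrases.

    tokens: list of caption words.
    attribute_spans: dict mapping the index of a head word (an object) to a
        half-open (start, end) range of tokens that describe it.
    """
    attribute_idx = set()
    attributes = {}
    for head, (start, end) in attribute_spans.items():
        attributes[tokens[head]] = tokens[start:end]
        attribute_idx.update(range(start, end))
    # The skeleton is whatever remains once the attribute words are removed.
    skeleton = [t for i, t in enumerate(tokens) if i not in attribute_idx]
    return skeleton, attributes


def recompose(skeleton, attributes, max_attr_words=None):
    """Rebuild a caption, optionally truncating each attribute phrase.

    Assumes (for this toy only) that attributes are placed directly before
    the skeleton word they describe.
    """
    out = []
    for word in skeleton:
        attrs = attributes.get(word, [])
        if max_attr_words is not None:
            attrs = attrs[:max_attr_words]
        out.extend(attrs)
        out.append(word)
    return out


caption = "a young girl rides a brown horse".split()
skeleton, attributes = decompose(caption, {2: (1, 2), 6: (5, 6)})
short_caption = recompose(skeleton, attributes, max_attr_words=0)
full_caption = recompose(skeleton, attributes)
```

Setting `max_attr_words=0` returns the bare skeleton, while leaving it unset restores the full caption, which mirrors the variable-length behaviour the abstract attributes to separate control of skeleton and attributes.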
Integrating Semantic Knowledge to Tackle Zero-shot Text Classification
Insufficient or even unavailable training data for emerging classes is a major
challenge in many classification tasks, including text classification.
Recognising text documents of classes that were never seen during the learning
stage, so-called zero-shot text classification, is therefore difficult, and only
a few previous works have tackled this problem. In this paper, we propose a
two-phase framework together with data augmentation and feature augmentation to
solve this problem. Four kinds of semantic knowledge (word embeddings, class
descriptions, class hierarchy, and a general knowledge graph) are incorporated
into the proposed framework to deal with instances of unseen classes
effectively. Experimental results show that each phase on its own, as well as
the combination of the two phases, achieves the best overall accuracy compared
with baselines and recent approaches in classifying real-world texts under the
zero-shot scenario.
Comment: Accepted NAACL-HLT 201
Semantic Tagging with Deep Residual Networks
We propose a novel semantic tagging task, sem-tagging, tailored for the
purpose of multilingual semantic parsing, and present the first tagger using
deep residual networks (ResNets). Our tagger uses both word and character
representations and includes a novel residual bypass architecture. We evaluate
the tagset both intrinsically, on the new task of semantic tagging, and on
Part-of-Speech (POS) tagging. Our system, consisting of a ResNet and an
auxiliary loss function predicting our semantic tags, significantly outperforms
prior results on English Universal Dependencies POS tagging (95.71% accuracy on
UD v1.2 and 95.67% accuracy on UD v1.3).
Comment: COLING 2016, camera-ready version
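For readers unfamiliar with residual networks, the sketch below shows the generic residual connection, in which a layer's input is added back to its output. It is a dependency-free illustration only: the abstract does not specify the tagger's "residual bypass" architecture, and this code makes no attempt to reproduce it. The helper names `dense`, `relu`, and `residual_block` are hypothetical.

```python
# Generic residual connection, y = x + F(x), in plain Python.
# This is NOT the residual bypass architecture from the paper (an assumption-free
# reading of the abstract gives no detail on it); it only shows the basic idea.

def dense(x, weights, bias):
    """Apply a fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def residual_block(x, weights, bias):
    """Return x + relu(dense(x)); the layer must preserve the input dimension."""
    fx = relu(dense(x, weights, bias))
    return [a + b for a, b in zip(x, fx)]


x = [1.0, -2.0]
# With all-zero weights the block reduces to the identity mapping.
identity_out = residual_block(x, [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
# With identity weights, F(x) = relu(x), so the output is x + relu(x).
shifted_out = residual_block(x, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

Because the input is carried through unchanged, a block whose learned transformation is zero behaves as the identity, which is the usual argument for why residual connections make deep networks easier to train.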