Learning a Recurrent Visual Representation for Image Caption Generation
In this paper we explore the bi-directional mapping between images and their
sentence-based descriptions. We propose learning this mapping using a recurrent
neural network. Unlike previous approaches that map both sentences and images
to a common embedding, we enable the generation of novel sentences given an
image. Using the same model, we can also reconstruct the visual features
associated with an image given its visual description. We use a novel recurrent
visual memory that automatically learns to remember long-term visual concepts
to aid in both sentence generation and visual feature reconstruction. We
evaluate our approach on several tasks. These include sentence generation,
sentence retrieval and image retrieval. State-of-the-art results are shown for
the task of generating novel image descriptions. When compared to
human-generated captions, our automatically generated captions are preferred by
humans over of the time. Results are better than or comparable to
state-of-the-art results on the image and sentence retrieval tasks for methods
using similar visual features.
Towards Zero-Shot Frame Semantic Parsing for Domain Scaling
State-of-the-art slot filling models for goal-oriented human/machine
conversational language understanding systems rely on deep learning methods.
While multi-task training of such models alleviates the need for large
in-domain annotated datasets, bootstrapping a semantic parsing model for a new
domain using only the semantic frame, such as the back-end API or knowledge
graph schema, is still one of the holy grail tasks of language understanding
for dialogue systems. This paper proposes a deep learning based approach that
can utilize only the slot description in context without the need for any
labeled or unlabeled in-domain examples, to quickly bootstrap a new domain. The
main idea of this paper is to leverage the encoding of the slot names and
descriptions within a multi-task deep learned slot filling model, to implicitly
align slots across domains. The proposed approach is promising for solving the
domain scaling problem and eliminating the need for any manually annotated data
or explicit schema alignment. Furthermore, our experiments on multiple domains
show that this approach results in significantly better slot-filling
performance when compared to using only in-domain data, especially in the low
data regime.
Comment: 4 pages + 1 reference
Deep Cascade Multi-task Learning for Slot Filling in Online Shopping Assistant
Slot filling is a critical task in natural language understanding (NLU) for
dialog systems. State-of-the-art approaches treat it as a sequence labeling
problem and adopt such models as BiLSTM-CRF. While these models work relatively
well on standard benchmark datasets, they face challenges in the context of
E-commerce where the slot labels are more informative and carry richer
expressions. In this work, inspired by the unique structure of the E-commerce
knowledge base, we propose a novel multi-task model with cascade and residual
connections, which jointly learns segment tagging, named entity tagging and
slot filling. Experiments show the effectiveness of the proposed cascade and
residual structures. Our model has a 14.6% advantage in F1 score over the
strong baseline methods on a new Chinese E-commerce shopping assistant dataset,
while achieving competitive accuracies on a standard dataset. Furthermore,
an online test deployed on a dominant E-commerce platform shows a 130%
improvement in the accuracy of understanding user utterances. Our model has
already gone into production on the E-commerce platform.
Comment: AAAI 201
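The abstract above frames slot filling as sequence labeling. As a rough illustration only, the sketch below shows what that framing can look like, assuming the common BIO tagging scheme (the abstract does not say which scheme the authors use). The function name bio_to_slots, the example utterance, and the labels "color" and "category" are hypothetical and are not taken from the paper or its dataset.

```python
from typing import List, Tuple


def bio_to_slots(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (slot_label, slot_text) pairs.

    Assumes the BIO scheme: "B-x" opens a slot of type x, "I-x" continues it,
    and "O" marks a token outside any slot.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    slots: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new slot begins; close the open one first, if any.
            if current_label is not None:
                slots.append((current_label, " ".join(current_tokens)))
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the currently open slot.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not match the open slot:
            # close any open slot and skip this token.
            if current_label is not None:
                slots.append((current_label, " ".join(current_tokens)))
            current_label = None
            current_tokens = []

    if current_label is not None:
        slots.append((current_label, " ".join(current_tokens)))
    return slots


# Hypothetical shopping-assistant utterance with made-up slot labels.
tokens = ["show", "me", "red", "running", "shoes"]
tags = ["O", "O", "B-color", "B-category", "I-category"]
print(bio_to_slots(tokens, tags))
# → [('color', 'red'), ('category', 'running shoes')]
```

In this framing, a tagger such as the BiLSTM-CRF mentioned in the abstract would predict the tags; the sketch covers only the decoding of tags into slots, not the model itself.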