78 research outputs found
Combining Word Feature Vector Method with the Convolutional Neural Network for Slot Filling in Spoken Language Understanding
Slot filling is an important problem in Spoken Language Understanding (SLU)
and Natural Language Processing (NLP), which involves identifying a user's
intent and assigning a semantic concept to each word in a sentence. This paper
presents a word feature vector method and combines it with a convolutional
neural network (CNN). We consider 18 word features, each constructed by
merging similar word labels. By introducing the concept of an external
library, we propose a feature set approach that is beneficial for
building the relationship between a word from the training dataset and the
feature. Computational results are reported using the ATIS dataset and
comparisons with a traditional CNN as well as a bi-directional sequential CNN
are also presented.
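The external-library idea above can be sketched as a lexicon lookup: each word's feature vector records which curated word lists it belongs to. The lexicon names and contents below are illustrative stand-ins, not the paper's actual 18 features.

```python
# Hypothetical lexicons ("external libraries") of merged, similar word labels.
LEXICONS = {
    "city": {"boston", "denver", "dallas"},
    "airline": {"delta", "united"},
    "day": {"monday", "tuesday", "sunday"},
}

def word_features(word):
    """Return a binary feature vector: one component per lexicon."""
    w = word.lower()
    return [1 if w in words else 0 for words in LEXICONS.values()]

def sentence_features(sentence):
    """Feature vectors for every word, ready to feed a CNN tagger."""
    return [word_features(w) for w in sentence.split()]
```

In the paper these features are concatenated with word embeddings before the convolutional layers; here only the lookup step is shown.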
End-to-End Optimization of Task-Oriented Dialogue Model with Deep Reinforcement Learning
In this paper, we present a neural network based task-oriented dialogue
system that can be optimized end-to-end with deep reinforcement learning (RL).
The system is able to track dialogue state, interface with knowledge bases, and
incorporate query results into the agent's responses to successfully complete
task-oriented dialogues. Dialogue policy learning is conducted with a hybrid
of supervised learning and deep RL. We first train the dialogue agent in a
supervised manner by learning directly from task-oriented dialogue corpora, and
further optimize it with deep RL during its interaction with users. In the
experiments on two different dialogue task domains, our model demonstrates
robust performance in tracking dialogue state and producing reasonable system
responses. We show that deep RL based optimization leads to significant
improvement in task success rate and reduction in dialogue length compared to
the supervised training model. We further show the benefits of training the
task-oriented dialogue model end-to-end compared to component-wise
optimization, with experimental results on dialogue simulations and human
evaluations.
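The two-stage policy learning described above can be sketched with a tabular softmax policy: a cross-entropy step toward demonstrated actions (supervised phase), then a REINFORCE-style step weighted by a task-success reward (RL phase). The states, actions, and learning rates are toy placeholders, not the paper's neural policy.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

class TabularPolicy:
    def __init__(self, n_states, n_actions):
        self.logits = [[0.0] * n_actions for _ in range(n_states)]

    def probs(self, state):
        return softmax(self.logits[state])

    def supervised_update(self, state, action, lr=0.5):
        # cross-entropy gradient step toward the demonstrated action
        p = self.probs(state)
        for a in range(len(p)):
            target = 1.0 if a == action else 0.0
            self.logits[state][a] += lr * (target - p[a])

    def reinforce_update(self, state, action, reward, lr=0.5):
        # REINFORCE: scale the log-prob gradient of the taken action by reward
        p = self.probs(state)
        for a in range(len(p)):
            grad = (1.0 if a == action else 0.0) - p[a]
            self.logits[state][a] += lr * reward * grad
```

The supervised phase bootstraps a reasonable policy from corpora; the RL phase then refines it from interaction, mirroring the paper's training order.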
Multi-Domain Adversarial Learning for Slot Filling in Spoken Language Understanding
The goal of this paper is to learn cross-domain representations for the slot
filling task in spoken language understanding (SLU). Most of the recently
published SLU models are domain-specific ones that work on individual task
domains. Annotating data for each individual task domain is both financially
costly and non-scalable. In this work, we propose an adversarial training
method for learning common features and representations that can be shared
across multiple domains. A model that produces such shared representations can be
combined with models trained on individual domain SLU data to reduce the amount
of training samples required for developing a new domain. In our experiments
using data sets from multiple domains, we show that adversarial training helps
in learning better domain-general SLU models, leading to improved slot filling
F1 scores. We further show that applying adversarial learning to the
domain-general model also helps in achieving higher slot filling performance
when the model is jointly optimized with domain-specific models.
Neural CRF transducers for sequence labeling
Conditional random fields (CRFs) have been shown to be one of the most
successful approaches to sequence labeling. Various linear-chain neural CRFs
(NCRFs) have been developed to implement non-linear node potentials in CRFs,
while still keeping the linear-chain hidden structure. In this paper, we
propose NCRF transducers, which consist of two RNNs, one extracting features from
observations and the other capturing (theoretically infinite) long-range
dependencies between labels. Different sequence labeling methods are evaluated
over POS tagging, chunking and NER (English, Dutch). Experimental results show
that NCRF transducers achieve consistent improvements over linear-chain NCRFs
and RNN transducers across all four tasks, and can improve state-of-the-art
results.
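For contrast with the transducer's long-range label dependencies, here is plain Viterbi decoding for a linear-chain model, whose hidden structure only scores adjacent label pairs (first-order dependencies). Scores and shapes are illustrative.

```python
def viterbi(emissions, transitions):
    """emissions: [T][K] node scores; transitions: [K][K] edge scores.
    Returns the highest-scoring label sequence under a first-order chain."""
    T, K = len(emissions), len(emissions[0])
    score = list(emissions[0])
    back = []
    for t in range(1, T):
        new, ptr = [], []
        for k in range(K):
            best_prev = max(range(K),
                            key=lambda j: score[j] + transitions[j][k])
            new.append(score[best_prev] + transitions[best_prev][k]
                       + emissions[t][k])
            ptr.append(best_prev)
        score, back = new, back + [ptr]
    path = [max(range(K), key=lambda k: score[k])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

An NCRF transducer replaces this fixed pairwise transition table with an RNN over the label history, which is exactly what lets it capture (theoretically infinite) long-range label dependencies.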
Sequential Convolutional Neural Networks for Slot Filling in Spoken Language Understanding
We investigate the usage of convolutional neural networks (CNNs) for the slot
filling task in spoken language understanding. We propose a novel CNN
architecture for sequence labeling which takes into account the previous
context words with preserved order information and pays special attention to
the current word with its surrounding context. Moreover, it combines the
information from the past and the future words for classification. Our proposed
CNN architecture outperforms even the previous best ensemble recurrent
neural network model and achieves state-of-the-art results with an F1-score of
95.61% on the ATIS benchmark dataset without using any additional linguistic
knowledge and resources.
Comment: Accepted at Interspeech 201
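The windowed input this kind of sequential CNN relies on can be sketched as follows: for each position, gather the previous context with order preserved, the current word with its surrounding window, and the future context. The padding token and window size are illustrative choices, not the paper's exact settings.

```python
PAD = "<pad>"

def context_views(words, i, surr=1):
    """Split a sentence into (past, current-window, future) views for
    position i; the current-word window has fixed width 2*surr + 1."""
    past = words[:i]
    future = words[i + 1:]
    lo, hi = max(0, i - surr), min(len(words), i + surr + 1)
    current = words[lo:hi]
    current = ([PAD] * (surr - (i - lo))
               + current
               + [PAD] * (surr - (hi - 1 - i)))
    return past, current, future
```

In the paper's architecture, separate convolutions over these views are combined before the label is predicted for the current word.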
Speech Model Pre-training for End-to-End Spoken Language Understanding
Whereas conventional spoken language understanding (SLU) systems map speech
to text, and then text to intent, end-to-end SLU systems map speech directly to
intent through a single trainable model. Achieving high accuracy with these
end-to-end models without a large amount of training data is difficult. We
propose a method to reduce the data requirements of end-to-end SLU in which the
model is first pre-trained to predict words and phonemes, thus learning good
features for SLU. We introduce a new SLU dataset, Fluent Speech Commands, and
show that our method improves performance both when the full dataset is used
for training and when only a small subset is used. We also describe preliminary
experiments to gauge the model's ability to generalize to new phrases not heard
during training.
Comment: Accepted to Interspeech 201
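The two-stage recipe can be sketched schematically: pre-train an encoder on word/phoneme prediction, then fit a small intent head on its features. The lookup-table "encoder" and toy data below are hypothetical stand-ins for the learned networks.

```python
class PhonemePretrainedEncoder:
    def __init__(self):
        self.table = {}  # frame -> phoneme, learned during pre-training

    def pretrain(self, frames, phonemes):
        for f, p in zip(frames, phonemes):
            self.table[f] = p
        return self

    def encode(self, frames):
        # features = predicted phoneme sequence (the transferable signal)
        return tuple(self.table.get(f, "?") for f in frames)

def fit_intent_head(encoder, labeled):
    # memorize encoder-feature -> intent; stands in for a small classifier
    return {encoder.encode(frames): intent for frames, intent in labeled}
```

The point the sketch makes is the data efficiency argument: the intent head sees pre-trained features rather than raw speech, so far fewer labeled SLU examples are needed.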
Combining Textual Content and Structure to Improve Dialog Similarity
Chatbots, taking advantage of the success of messaging apps and recent
advances in Artificial Intelligence, have become very popular, from helping
businesses improve customer service to chatting with users for the sake of
conversation and engagement (celebrity or personal bots). However, developing
and improving a chatbot requires understanding the data generated by its
users. Dialog data differs in nature from a simple question-and-answer
interaction: context and temporal properties (turn order) create a different
understanding of such data. In this paper, we propose a novel metric to
compute dialog similarity based not only on the text content but also on
information related to the dialog structure. Our experimental results on the
Switchboard dataset show that using evidence from both textual content and
the dialog structure leads to more accurate results than using each measure
in isolation.
Comment: 5 pages
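A combined similarity in the spirit described above can be sketched as a weighted blend of a textual measure (here, Jaccard over word sets) and a structural measure over the ordered speaker-turn sequence. Both component measures and the blend weight are illustrative choices, not the paper's metric.

```python
def jaccard(a, b):
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def turn_similarity(turns_a, turns_b):
    # fraction of positions with the same speaker, over the longer dialog
    same = sum(1 for x, y in zip(turns_a, turns_b) if x == y)
    return same / max(len(turns_a), len(turns_b))

def dialog_similarity(d1, d2, alpha=0.5):
    """d1, d2: lists of (speaker, utterance) pairs in turn order."""
    text = jaccard(" ".join(t for _, t in d1), " ".join(t for _, t in d2))
    struct = turn_similarity([s for s, _ in d1], [s for s, _ in d2])
    return alpha * text + (1 - alpha) * struct
```

The key design point matches the abstract: neither component alone sees both what was said and how the conversation was shaped.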
FollowNet: Robot Navigation by Following Natural Language Directions with Deep Reinforcement Learning
Understanding and following directions provided by humans can enable robots
to navigate effectively in unknown situations. We present FollowNet, an
end-to-end differentiable neural architecture for learning multi-modal
navigation policies. FollowNet maps natural language instructions as well as
visual and depth inputs to locomotion primitives. FollowNet processes
instructions using an attention mechanism conditioned on its visual and depth
input to focus on the relevant parts of the command while performing the
navigation task. Deep reinforcement learning (RL) with a sparse reward
simultaneously learns the state representation, the attention function, and
the control policies. We evaluate our agent on a dataset of complex natural language
directions that guide the agent through a rich and realistic dataset of
simulated homes. We show that the FollowNet agent learns to execute previously
unseen instructions described with a similar vocabulary, and successfully
navigates along paths not encountered during training. The agent shows a 30%
improvement over a baseline model without the attention mechanism, achieving
a 52% success rate on novel instructions.
Comment: 7 pages, 8 figures
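The conditioned attention at the heart of this architecture can be sketched in a few lines: instruction-token vectors are scored against a context vector derived from the visual/depth input, and the softmax-normalized scores weight the tokens. The dot-product scoring and toy vectors are illustrative; FollowNet's actual attention is learned end-to-end.

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def attend(token_vecs, context_vec):
    """Weight instruction tokens by relevance to the current observation."""
    scores = [sum(t * c for t, c in zip(tok, context_vec))
              for tok in token_vecs]
    weights = softmax(scores)
    attended = [sum(w * tok[d] for w, tok in zip(weights, token_vecs))
                for d in range(len(context_vec))]
    return weights, attended
```

As the agent moves, the context vector changes, so attention shifts to the part of the command relevant to the current step.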
Semi-Supervised Few-Shot Learning for Dual Question-Answer Extraction
This paper addresses the problem of key phrase extraction from sentences.
Existing state-of-the-art supervised methods require large amounts of annotated
data to achieve good performance and generalization. Collecting labeled data
is, however, often expensive. In this paper, we redefine the problem as
question-answer extraction, and present SAMIE: Self-Asking Model for
Information Extraction, a semi-supervised model which dually learns to ask
and to answer questions by itself. Briefly, given a sentence and an answer,
the model needs to choose the most appropriate question; meanwhile, for the
same sentence and the question selected in the previous step, the model will
predict an answer. The model can support few-shot
learning with very limited supervision. It can also be used to perform
clustering analysis when no supervision is provided. Experimental results show
that the proposed method outperforms typical supervised methods especially when
given little labeled data.
Comment: 7 pages, 5 figures, submission to IJCAI1
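The ask-then-answer step can be sketched as scoring candidate questions for a (sentence, answer) pair and picking the best. The keyword-overlap scorer below is a hypothetical stand-in for SAMIE's learned question/answer models.

```python
def choose_question(sentence, answer, candidates, scorer):
    """Pick the candidate question that best fits the (sentence, answer)."""
    return max(candidates, key=lambda q: scorer(sentence, answer, q))

def overlap_scorer(sentence, answer, question):
    # toy relevance score: shared words between question and sentence
    return len(set(question.lower().split())
               & set(sentence.lower().split()))
```

In the full model, the answer predicted for the chosen question is compared against the original answer, and that round-trip consistency is what provides the self-supervision signal.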
Elastic CRFs for Open-ontology Slot Filling
Slot filling is a crucial component in task-oriented dialog systems; it
parses (user) utterances into semantic concepts called slots. An ontology is
defined by the collection of slots and the values that each slot can take. The
widely-used practice of treating slot filling as a sequence labeling task
suffers from two drawbacks. First, the ontology is usually pre-defined and
fixed. Most current methods are unable to predict new labels for unseen slots.
Second, the one-hot encoding of slot labels ignores the semantic meanings and
relations for slots, which are implicit in their natural language descriptions.
These observations motivate us to propose a novel model called elastic
conditional random field (eCRF), for open-ontology slot filling. eCRFs can
leverage the neural features of both the utterance and the slot descriptions,
and are able to model the interactions between different slots. Experimental
results show that eCRFs outperform existing models on both the in-domain and
the cross-domain tasks, especially in predictions of unseen slots and values.
Comment: 5 pages
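The open-ontology idea can be sketched as scoring each slot by the similarity between utterance-span features and an embedding of the slot's natural-language description, so a slot unseen in training can still be predicted if a description is supplied. The bag-of-words "embeddings" and cosine scoring below are illustrative stand-ins for the eCRF's neural features.

```python
import math

def embed(text, vocab):
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def best_slot(span, slot_descriptions, vocab):
    """Score every slot description against the span; new slots can be
    added to slot_descriptions without retraining this scorer."""
    v = embed(span, vocab)
    return max(slot_descriptions,
               key=lambda s: cosine(v, embed(slot_descriptions[s], vocab)))
```

This is the contrast with one-hot slot labels the abstract draws: description embeddings carry the semantic relations between slots that one-hot encoding discards.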