Deep Cascade Multi-task Learning for Slot Filling in Online Shopping Assistant
Slot filling is a critical task in natural language understanding (NLU) for
dialog systems. State-of-the-art approaches treat it as a sequence labeling
problem and adopt such models as BiLSTM-CRF. While these models work relatively
well on standard benchmark datasets, they face challenges in the context of
E-commerce where the slot labels are more informative and carry richer
expressions. In this work, inspired by the unique structure of E-commerce
knowledge base, we propose a novel multi-task model with cascade and residual
connections, which jointly learns segment tagging, named entity tagging and
slot filling. Experiments show the effectiveness of the proposed cascade and
residual structures. Our model has a 14.6% advantage in F1 score over the
strong baseline methods on a new Chinese E-commerce shopping assistant dataset,
while achieving competitive accuracies on a standard dataset. Furthermore, an
online test deployed on a dominant E-commerce platform shows a 130%
improvement in the accuracy of understanding user utterances. Our model has
already gone into production on the E-commerce platform.
Comment: AAAI 201
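As a rough illustration of the cascade idea, the sketch below chains three BiLSTM tagging stages (segment, named entity, slot) with residual connections between stages. All module names and dimensions are hypothetical assumptions; the paper's exact architecture may differ.

```python
import torch
import torch.nn as nn

class CascadeTagger(nn.Module):
    """Illustrative cascade: segment tagging -> NER tagging -> slot filling,
    with residual connections feeding each stage's features into the next."""

    def __init__(self, vocab_size, emb_dim, hidden, n_seg, n_ner, n_slot):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.seg_rnn = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.ner_rnn = nn.LSTM(2 * hidden, hidden, bidirectional=True, batch_first=True)
        self.slot_rnn = nn.LSTM(2 * hidden, hidden, bidirectional=True, batch_first=True)
        self.seg_out = nn.Linear(2 * hidden, n_seg)
        self.ner_out = nn.Linear(2 * hidden, n_ner)
        self.slot_out = nn.Linear(2 * hidden, n_slot)

    def forward(self, tokens):
        x = self.emb(tokens)            # (B, T, emb_dim)
        h_seg, _ = self.seg_rnn(x)      # stage 1: segment tagging features
        h_ner, _ = self.ner_rnn(h_seg)  # stage 2: NER tagging features
        h_ner = h_ner + h_seg           # residual connection
        h_slot, _ = self.slot_rnn(h_ner)  # stage 3: slot filling features
        h_slot = h_slot + h_ner         # residual connection
        return self.seg_out(h_seg), self.ner_out(h_ner), self.slot_out(h_slot)
```

In training, the three tagging losses would typically be summed so that the earlier stages receive direct supervision.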
Multi-Domain Adversarial Learning for Slot Filling in Spoken Language Understanding
The goal of this paper is to learn cross-domain representations for the slot
filling task in spoken language understanding (SLU). Most of the recently
published SLU models are domain-specific ones that work on individual task
domains. Annotating data for each individual task domain is both financially
costly and non-scalable. In this work, we propose an adversarial training
method in learning common features and representations that can be shared
across multiple domains. A model that produces such shared representations can be
combined with models trained on individual domain SLU data to reduce the amount
of training samples required for developing a new domain. In our experiments
using data sets from multiple domains, we show that adversarial training helps
in learning better domain-general SLU models, leading to improved slot filling
F1 scores. We further show that applying adversarial learning to the
domain-general model also helps achieve higher slot filling performance when
the model is jointly optimized with domain-specific models.
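A common way to realize this kind of adversarial training is a gradient reversal layer that pushes a shared encoder toward domain-invariant features. The sketch below shows that mechanism under hypothetical dimensions; it is not necessarily the paper's exact setup.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates gradients on the backward pass,
    so the encoder learns to *confuse* the domain classifier."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

encoder = nn.GRU(300, 128, bidirectional=True, batch_first=True)
domain_clf = nn.Linear(256, 4)  # e.g. 4 task domains (hypothetical)

def domain_adversarial_logits(utterance_embs, lam=1.0):
    h, _ = encoder(utterance_embs)
    pooled = h.mean(dim=1)                   # utterance-level representation
    reversed_h = GradReverse.apply(pooled, lam)
    return domain_clf(reversed_h)            # trained against domain labels
```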
Parsing Coordination for Spoken Language Understanding
Typical spoken language understanding systems provide narrow semantic parses
using a domain-specific ontology. The parses contain intents and slots that are
directly consumed by downstream domain applications. In this work we discuss
expanding such systems to handle compound entities and intents by introducing a
domain-agnostic shallow parser that handles linguistic coordination. We show
that our model for parsing coordination learns domain-independent and
slot-independent features and is able to segment conjunct boundaries of many
different phrasal categories. We also show that using adversarial training can
be effective for improving generalization across different slot types for
coordination parsing.
Comment: The paper was published at the SLT 2018 conference.
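To make the conjunct-segmentation task concrete, here is a hypothetical BIO-style labeling of a coordinated utterance; the tag inventory is illustrative, not the paper's.

```python
# Hypothetical conjunct-boundary tags for a coordinated utterance.
utterance = ["play", "jazz", "and", "classical", "music"]
conjunct_tags = ["O", "B-CONJ", "O", "B-CONJ", "I-CONJ"]
# A domain-agnostic shallow parser would segment "jazz" and
# "classical music" as the two conjuncts joined by "and".
```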
Efficient Large-Scale Domain Classification with Personalized Attention
In this paper, we explore the task of mapping spoken language utterances to
one of thousands of natural language understanding domains in intelligent
personal digital assistants (IPDAs). This scenario is observed for many
mainstream IPDAs in industry that allow third parties to develop thousands of
new domains to augment built-in ones to rapidly increase domain coverage and
overall IPDA capabilities. We propose a scalable neural model architecture with
a shared encoder, a novel attention mechanism that incorporates personalization
information, and domain-specific classifiers, which together solve the problem
efficiently. Our architecture is designed to efficiently accommodate new
domains that appear between full model retraining cycles, using a rapid
bootstrapping mechanism two orders of magnitude faster than retraining. We
account for practical constraints in real-time production systems, and design
to minimize memory footprint and runtime latency. We demonstrate that
incorporating personalization results in significantly more accurate domain
classification in the setting with thousands of overlapping domains.
Comment: Accepted to ACL 201
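A minimal sketch of the overall shape such a model could take: a shared utterance encoder, attention over the embeddings of the domains a given user has enabled (the personalization signal), and a final domain classifier. All names and dimensions here are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PersonalizedDomainClassifier(nn.Module):
    """Sketch: shared encoder + attention over a user's enabled domains."""

    def __init__(self, n_domains, emb_dim=100, hidden=128):
        super().__init__()
        self.encoder = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.domain_emb = nn.Embedding(n_domains, 2 * hidden)
        self.out = nn.Linear(4 * hidden, n_domains)

    def forward(self, word_embs, enabled_domain_ids):
        h, _ = self.encoder(word_embs)
        u = h.mean(dim=1)                                # (B, 2H) utterance vector
        d = self.domain_emb(enabled_domain_ids)          # (B, K, 2H) enabled domains
        scores = torch.bmm(d, u.unsqueeze(2)).squeeze(2) # (B, K) relevance scores
        attn = F.softmax(scores, dim=1)
        personal = torch.bmm(attn.unsqueeze(1), d).squeeze(1)  # (B, 2H)
        return self.out(torch.cat([u, personal], dim=1))       # domain logits
```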
A Survey on Dialog Management: Recent Advances and Challenges
Dialog management (DM) is a crucial component in a task-oriented dialog
system. Given the dialog history, DM predicts the dialog state and decides the
next action that the dialog agent should take. Recently, dialog policy learning
has been widely formulated as a Reinforcement Learning (RL) problem, and a
growing body of work focuses on the applicability of DM. In this paper, we survey recent
advances and challenges within three critical topics for DM: (1) improving
model scalability to facilitate dialog system modeling in new scenarios, (2)
dealing with the data scarcity problem for dialog policy learning, and (3)
enhancing training efficiency to achieve better task-completion performance.
We believe that this survey can shed light on future research in dialog
management.
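As a reminder of what the RL formulation looks like in its simplest form, the sketch below casts dialog policy learning as tabular Q-learning over (dialog state, system action) pairs; the state and action sets are hypothetical.

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning over (dialog state, system action) pairs.
Q = defaultdict(float)
alpha, gamma, epsilon = 0.1, 0.9, 0.1
ACTIONS = ["request_slot", "confirm_slot", "offer_result"]  # hypothetical

def choose_action(state):
    if random.random() < epsilon:                     # explore
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])  # exploit

def update(state, action, reward, next_state):
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    # Standard Q-learning temporal-difference update.
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```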
A Hierarchical Decoding Model For Spoken Language Understanding From Unaligned Data
Spoken language understanding (SLU) systems can be trained on two types of
labelled data: aligned or unaligned. Unaligned data do not require word-by-word
annotation and are easier to obtain. In this paper, we focus on spoken
language understanding from unaligned data, whose annotation is a set of
act-slot-value triples. Previous works usually improve slot-value pair
prediction and estimate dialogue act types separately, which ignores the
hierarchical decoding model which dynamically parses act, slot and value in a
structured way and employs a pointer network to handle out-of-vocabulary (OOV)
values. Experiments on the DSTC2 dataset, a benchmark unaligned dataset, show
that the proposed model not only outperforms the previous state-of-the-art
model, but also generalizes effectively and efficiently to unseen act-slot
type pairs and OOV values.
Comment: Accepted by ICASSP 201
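The pointer idea for OOV values can be sketched as attention over the input tokens, letting the decoder copy a value verbatim from the utterance rather than generate it from a closed vocabulary. This is a generic pointer step, not the paper's exact decoder.

```python
import torch
import torch.nn.functional as F

def pointer_distribution(decoder_state, encoder_states):
    """Sketch of one pointer step: score each input position against the
    current decoder state; the argmax position gives the token to copy.
    decoder_state: (B, H); encoder_states: (B, T, H)."""
    scores = torch.bmm(encoder_states, decoder_state.unsqueeze(2)).squeeze(2)  # (B, T)
    copy_probs = F.softmax(scores, dim=1)  # distribution over input positions
    return copy_probs                      # argmax position -> copied slot value
```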
Coupled Representation Learning for Domains, Intents and Slots in Spoken Language Understanding
Representation learning is an essential problem in a wide range of
applications and it is important for performing downstream tasks successfully.
In this paper, we propose a new model that learns coupled representations of
domains, intents, and slots by taking advantage of their hierarchical
dependency in a Spoken Language Understanding system. Our proposed model learns
the vector representation of intents based on the slots tied to these intents
by aggregating the representations of the slots. Similarly, the vector
representation of a domain is learned by aggregating the representations of the
intents tied to a specific domain. To the best of our knowledge, this is the
first approach to jointly learn the representations of domains, intents, and
slots using their hierarchical relationships. The experimental results
demonstrate the effectiveness of the representations learned by our model, as
evidenced by improved performance on the contextual cross-domain reranking
task.
Comment: IEEE SLT 201
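The hierarchical aggregation can be sketched very simply: an intent vector pooled from the vectors of its slots, and a domain vector pooled from the vectors of its intents. Mean pooling and the example names below stand in for whatever aggregation the paper actually uses.

```python
import torch

def aggregate(child_vectors):
    """Mean-pool child representations; the paper's aggregation may differ."""
    return torch.stack(child_vectors).mean(dim=0)

# slot embeddings -> intent embedding -> domain embedding (hypothetical names)
slot_vecs = {"artist": torch.randn(64), "song": torch.randn(64)}
intent_vec = aggregate([slot_vecs["artist"], slot_vecs["song"]])  # e.g. PlayMusic
domain_vec = aggregate([intent_vec])                              # e.g. Music domain
```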
Enhancing Chinese Intent Classification by Dynamically Integrating Character Features into Word Embeddings with Ensemble Techniques
Intent classification has been widely researched on English data with deep
learning approaches that are based on neural networks and word embeddings. The
challenge for Chinese intent classification stems from the fact that, unlike
English, where words are composed from a 26-letter phonetic alphabet, Chinese
is logographic: a Chinese character is a more basic semantic unit that is
informative on its own, and its meaning does not vary much across contexts. Chinese
word embeddings alone can be inadequate for representing words, and pre-trained
embeddings can suffer from not aligning well with the task at hand. To account
for the inadequacy and leverage Chinese character information, we propose a
low-effort and generic way to dynamically integrate character-embedding-based
feature maps with word-embedding-based inputs; the resulting word-character
embeddings are stacked with a contextual information extraction module to
further incorporate context information for predictions. On top of the proposed
model, we employ an ensemble method to combine single models and obtain the
final result. The approach is data-independent without relying on external
sources like pre-trained word embeddings. The proposed model outperforms
baseline models and existing methods.
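One simple way to combine the two signals is a per-word character CNN whose pooled feature maps are joined with the word embedding. The sketch below uses plain concatenation as a stand-in for the paper's dynamic integration, and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CharWordEmbedding(nn.Module):
    """Sketch: a character-level CNN produces per-word feature maps that are
    concatenated with the word embedding. Dimensions are illustrative."""

    def __init__(self, n_chars, n_words, char_dim=32, word_dim=128, n_filters=64):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.char_cnn = nn.Conv1d(char_dim, n_filters, kernel_size=3, padding=1)

    def forward(self, char_ids, word_ids):
        # char_ids: (B, T, C) character ids per word; word_ids: (B, T)
        B, T, C = char_ids.shape
        c = self.char_emb(char_ids).view(B * T, C, -1).transpose(1, 2)  # (BT, char_dim, C)
        c = torch.relu(self.char_cnn(c)).max(dim=2).values.view(B, T, -1)  # max-pool over chars
        w = self.word_emb(word_ids)                                     # (B, T, word_dim)
        return torch.cat([w, c], dim=2)  # word-character embedding per token
```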
Parallel Intent and Slot Prediction using MLB Fusion
Intent and Slot Identification are two important tasks in Spoken Language
Understanding (SLU). For a natural language utterance, there is a high
correlation between these two tasks. Much work has been done on each of them
using Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs),
and attention-based models. Most of the past work used two separate models for
intent and slot prediction. Some of them also used sequence-to-sequence type
models where slots are predicted after evaluating the utterance-level intent.
In this work, we propose a parallel Intent and Slot Prediction technique where
separate Bidirectional Gated Recurrent Units (GRU) are used for each task. We
propose using MLB (Multimodal Low-rank Bilinear Attention Network) fusion
to improve the performance of intent and slot learning. To the best of our
knowledge, this is the first attempt to use such a technique on text-based
problems. Our proposed methods also outperform the existing state-of-the-art
results for both intent and slot prediction on two benchmark datasets.
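For reference, low-rank bilinear (MLB-style) fusion projects two feature vectors into a shared low-rank space, combines them with an element-wise product, and projects the result onward. The sketch below assumes hypothetical dimensions and is not the paper's exact module.

```python
import torch
import torch.nn as nn

class MLBFusion(nn.Module):
    """Low-rank bilinear fusion: project both inputs into a shared rank-d
    space, combine with an element-wise product, then project to the output
    space. Dimensions are illustrative."""

    def __init__(self, dim_x, dim_y, rank=256, dim_out=128):
        super().__init__()
        self.U = nn.Linear(dim_x, rank)
        self.V = nn.Linear(dim_y, rank)
        self.P = nn.Linear(rank, dim_out)

    def forward(self, x, y):
        # x: features from one GRU (e.g. intent), y: features from the other (e.g. slot)
        z = torch.tanh(self.U(x)) * torch.tanh(self.V(y))  # low-rank bilinear term
        return self.P(z)
```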
Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection
Recurrent neural network architectures combined with an attention mechanism,
i.e., neural attention models, have recently shown promising performance on
tasks including speech recognition, image caption generation, visual question
answering, and machine translation. In this paper, a neural attention model is
applied to two sequence classification tasks: dialogue act detection and key
term extraction. In these tasks, the model input is a sequence, and the output
is the label of the input sequence. The major difficulty of sequence
classification is that when the input sequence is long, it can include many
noisy or irrelevant parts. If the information in the whole sequence is treated
equally, the noisy or irrelevant parts may degrade the classification
performance. The attention mechanism is helpful for sequence classification
because it can highlight the important parts of the entire sequence. The
experimental results show that with the attention mechanism, discernible
improvements were achieved in the sequence classification tasks considered
here. The roles of the attention mechanism in these tasks are further analyzed
and visualized in this paper.
Comment: 5 pages, 2 figures
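A minimal sketch of attention pooling for sequence classification: score each hidden state, softmax the scores, and classify the weighted sum so that informative time steps dominate and noisy ones are down-weighted. Module names and dimensions are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionPooling(nn.Module):
    """Sketch: attention over RNN hidden states for sequence classification."""

    def __init__(self, hidden, n_classes):
        super().__init__()
        self.score = nn.Linear(hidden, 1)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, rnn_states):  # rnn_states: (B, T, H)
        weights = F.softmax(self.score(rnn_states).squeeze(2), dim=1)     # (B, T)
        pooled = torch.bmm(weights.unsqueeze(1), rnn_states).squeeze(1)   # (B, H)
        return self.out(pooled)     # sequence-level label logits
```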