Global-to-local Memory Pointer Networks for Task-Oriented Dialogue
End-to-end task-oriented dialogue is challenging since knowledge bases are
usually large, dynamic and hard to incorporate into a learning framework. We
propose the global-to-local memory pointer (GLMP) networks to address this
issue. In our model, a global memory encoder and a local memory decoder are
proposed to share external knowledge. The encoder encodes dialogue history,
modifies global contextual representation, and generates a global memory
pointer. The decoder first generates a sketch response with unfilled slots.
Next, it passes the global memory pointer to filter the external knowledge for
relevant information, then instantiates the slots via the local memory
pointers. We empirically show that our model can improve copy accuracy and
mitigate the common out-of-vocabulary problem. As a result, GLMP is able to
improve over the previous state-of-the-art models on both the simulated bAbI
Dialogue dataset and the human-human Stanford Multi-domain Dialogue dataset in
automatic and human evaluation.
Comment: ICLR 201
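As an illustration only (not the authors' implementation), the global-to-local pointer idea above can be sketched with toy, hand-set vectors: a global memory pointer gates each external-memory slot, and a local pointer then copies the best-matching entity into each unfilled slot of the sketch response. All embeddings, slot names, and decoder states below are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fill_sketch(sketch, memory, entities, enc_state, dec_states):
    # Global memory pointer: one sigmoid gate per external-memory slot,
    # produced on the encoder side to filter out irrelevant entries.
    gates = [sigmoid(dot(row, enc_state)) for row in memory]
    out = []
    for tok in sketch:
        if tok.startswith("@"):  # unfilled slot in the sketch response
            # Local memory pointer: attend over the gate-filtered memory
            # and copy the best-matching entity into the slot.
            scores = [dot(row, dec_states[tok]) * g
                      for row, g in zip(memory, gates)]
            tok = entities[scores.index(max(scores))]
        out.append(tok)
    return " ".join(out)

memory = [[0.9, 0.1],   # toy embedding for "pizza_hut"
          [0.2, 0.8],   # toy embedding for "3_miles"
          [0.1, 0.2]]   # toy embedding for a distractor entry
entities = ["pizza_hut", "3_miles", "coffee_shop"]
response = fill_sketch(
    ["@restaurant", "is", "@distance", "away"],
    memory, entities,
    enc_state=[0.8, 0.6],
    dec_states={"@restaurant": [1.0, 0.0], "@distance": [0.0, 1.0]},
)
print(response)  # pizza_hut is 3_miles away
```

The two-stage design is the point: the global gates suppress distractor entries once, and each slot only needs a local attention over what remains.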
Mem2Seq: Effectively Incorporating Knowledge Bases into End-to-End Task-Oriented Dialog Systems
End-to-end task-oriented dialog systems usually suffer from the challenge of
incorporating knowledge bases. In this paper, we propose a novel yet simple
end-to-end differentiable model called memory-to-sequence (Mem2Seq) to address
this issue. Mem2Seq is the first neural generative model that combines the
multi-hop attention over memories with the idea of pointer network. We
empirically show how Mem2Seq controls each generation step, and how its
multi-hop attention mechanism helps in learning correlations between memories.
In addition, our model is quite general without complicated task-specific
designs. As a result, we show that Mem2Seq can be trained faster and attain the
state-of-the-art performance on three different task-oriented dialog datasets.
Comment: Accepted by the Association for Computational Linguistics (ACL) 2018.
Andrea Madotto* and Chien-Sheng Wu* contributed equally to this work.
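The multi-hop attention that Mem2Seq builds on follows the MemN2N pattern: at each hop the query attends over the memory rows and absorbs the attention-weighted read vector. A minimal sketch with toy vectors (not the paper's trained model):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def multi_hop_read(query, memory, hops=3):
    """MemN2N-style multi-hop read: at each hop, attend over the memory rows
    and add the attention-weighted read vector back into the query."""
    attn = []
    for _ in range(hops):
        attn = softmax([sum(q * m for q, m in zip(query, row))
                        for row in memory])
        read = [sum(a * row[i] for a, row in zip(attn, memory))
                for i in range(len(query))]
        query = [q + r for q, r in zip(query, read)]
    return query, attn  # the final attention doubles as a pointer distribution

memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query, pointer = multi_hop_read([1.0, 0.2], memory)
```

In a pointer-network setup, that final attention distribution can be used directly to copy a memory entry (e.g. a KB entity) into the output sequence.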
Generative Encoder-Decoder Models for Task-Oriented Spoken Dialog Systems with Chatting Capability
Generative encoder-decoder models offer great promise in developing
domain-general dialog systems. However, they have mainly been applied to
open-domain conversations. This paper presents a practical and novel framework
for building task-oriented dialog systems based on encoder-decoder models. This
framework enables encoder-decoder models to accomplish slot-value independent
decision-making and interact with external databases. Moreover, this paper
shows the flexibility of the proposed method by interleaving chatting
capability with a slot-filling system for better out-of-domain recovery. The
models were trained on both real-user data from a bus information system and
human-human chat data. Results show that the proposed framework achieves good
performance in both offline evaluation metrics and in task success rate with
human users.
Comment: Accepted as a long paper at SIGDIAL 201
An End-to-End Goal-Oriented Dialog System with a Generative Natural Language Response Generation
Recent advancements in deep learning have allowed the development of end-to-end
trained goal-oriented dialog systems. Although these systems already achieve
good performance, some simplifications limit their usage in real-life
scenarios.
In this work, we address two of these limitations: ignoring positional
information and a fixed number of possible response candidates. We propose to
use positional encodings in the input to model the word order of the user
utterances. Furthermore, by using a feedforward neural network, we are able to
generate the output word by word and are no longer restricted to a fixed number
of possible response candidates. Using the positional encoding, we were able to
achieve better accuracies in the Dialog bAbI Tasks and using the feedforward
neural network for generating the response, we were able to save computation
time and space consumption.
Comment: 11 pages, 4 figures, forthcoming in IWSDS 2018; added quantitative
analysis of sensitivity to modified user utterances and minor improvement
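The positional encodings mentioned above can take several forms; one standard formulation is the sinusoidal scheme from the Transformer literature, sketched below. The exact variant used in the paper may differ.

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings: even dimensions use sine, odd
    dimensions use cosine, with geometrically increasing wavelengths,
    so each position gets a unique, order-aware vector."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

pe = positional_encoding(seq_len=10, d_model=16)
```

Adding these vectors to the input embeddings lets an otherwise order-blind model (such as a feedforward network over a bag of words) distinguish word order in user utterances.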
Learning Personalized End-to-End Goal-Oriented Dialog
Most existing works on dialog systems only consider conversation content
while neglecting the personality of the user the bot is interacting with, which
begets several unsolved issues. In this paper, we present a personalized
end-to-end model in an attempt to leverage personalization in goal-oriented
dialogs. We first introduce a Profile Model which encodes user profiles into
distributed embeddings and refers to conversation history from other similar
users. Then a Preference Model captures user preferences over knowledge base
entities to handle the ambiguity in user requests. The two models are combined
into the Personalized MemN2N. Experiments show that the proposed model achieves
qualitative performance improvements over state-of-the-art methods. As for
human evaluation, it also outperforms other approaches in terms of task
completion rate and user satisfaction.
Comment: Accepted by AAAI 201
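One way to picture the Preference Model's role, as a rough sketch rather than the paper's architecture: when a request is ambiguous between several KB entities, a user-profile embedding can bias the choice via a dot product with each entity embedding. All vectors and entity names below are toy, hand-set values.

```python
def preference_scores(profile, kb_entities):
    """Sketch of a preference model: bias the choice among otherwise-ambiguous
    KB entities via the dot product between a user-profile embedding and each
    entity embedding (all vectors here are illustrative)."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return {name: dot(profile, emb) for name, emb in kb_entities.items()}

profile = [1.0, 0.0]  # e.g. a user who prefers cheap options
kb = {"cheap_cafe": [0.9, 0.1], "fancy_bistro": [0.1, 0.9]}
scores = preference_scores(profile, kb)
best = max(scores, key=scores.get)
print(best)  # cheap_cafe
```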
Neural Approaches to Conversational AI
The present paper surveys neural approaches to conversational AI that have
been developed in the last few years. We group conversational systems into
three categories: (1) question answering agents, (2) task-oriented dialogue
agents, and (3) chatbots. For each category, we present a review of
state-of-the-art neural approaches, draw the connection between them and
traditional approaches, and discuss the progress that has been made and
challenges still being faced, using specific systems and models as case
studies.
Comment: Foundations and Trends in Information Retrieval (95 pages)
Learning to Memorize in Neural Task-Oriented Dialogue Systems
In this thesis, we leverage the neural copy mechanism and memory-augmented
neural networks (MANNs) to address existing challenges in neural task-oriented
dialogue learning. We show the effectiveness of our strategy by achieving good
performance in multi-domain dialogue state tracking, retrieval-based dialogue
systems, and generation-based dialogue systems. We first propose a transferable
dialogue state generator (TRADE) that leverages its copy mechanism to get rid
of dialogue ontology and share knowledge between domains. We also evaluate
unseen domain dialogue state tracking and show that TRADE enables zero-shot
dialogue state tracking and can adapt to new few-shot domains without
forgetting the previous domains. Second, we utilize MANNs to improve
retrieval-based dialogue learning. They are able to capture dialogue sequential
dependencies and memorize long-term information. We also propose a recorded
delexicalization copy strategy to replace real entity values with ordered
entity types. Our models are shown to surpass other retrieval baselines,
especially when the conversation has a large number of turns. Lastly, we tackle
generation-based dialogue learning with two proposed models, the
memory-to-sequence (Mem2Seq) and global-to-local memory pointer network (GLMP).
Mem2Seq is the first model to combine multi-hop memory attention with the idea
of the copy mechanism. GLMP further introduces the concept of response
sketching and double pointers copying. We show that GLMP achieves the
state-of-the-art performance on human evaluation.
Comment: HKUST MPhil Thesis. 93 pages
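The "recorded delexicalization" idea above, replacing real entity values with ordered entity types, might look roughly like the following sketch (the token list and type dictionary are invented for illustration; details differ from the thesis):

```python
def delexicalize(tokens, entity_types):
    """Sketch of a recorded delexicalization copy strategy: replace each real
    entity value with its entity type, numbered in order of appearance, and
    keep a record so values can be restored after response selection."""
    counts, record, out = {}, {}, []
    for tok in tokens:
        typ = entity_types.get(tok)
        if typ is None:
            out.append(tok)
        else:
            counts[typ] = counts.get(typ, 0) + 1
            placeholder = f"@{typ}_{counts[typ]}"
            record[placeholder] = tok
            out.append(placeholder)
    return out, record

tokens = "the pizza_hut on main_street is 3_miles away".split()
types = {"pizza_hut": "restaurant", "main_street": "address",
         "3_miles": "distance"}
delex, record = delexicalize(tokens, types)
print(" ".join(delex))  # the @restaurant_1 on @address_1 is @distance_1 away
```

Working over typed placeholders rather than raw values is what lets a model generalize across conversations whose entities never overlap.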
SEntNet: Source-aware Recurrent Entity Network for Dialogue Response Selection
Dialogue response selection is an important part of Task-oriented Dialogue
Systems (TDSs); it aims to predict an appropriate response given a dialogue
context. Obtaining key information from a complex, long dialogue context is
challenging, especially when different sources of information are available,
e.g., the user's utterances, the system's responses, and results retrieved from
a knowledge base (KB). Previous work ignores the type of information source and
merges sources for response selection. However, accounting for the source type
may lead to remarkable differences in the quality of response selection. We
propose the Source-aware Recurrent Entity Network (SEntNet), which is aware of
different information sources for the response selection process. SEntNet
achieves this by employing source-specific memories to exploit differences in
the usage of words and syntactic structure from different information sources
(user, system, and KB). Experimental results show that SEntNet obtains 91.0%
accuracy on the Dialog bAbI dataset, outperforming prior work by 4.7%. On the
DSTC2 dataset, SEntNet obtains an accuracy of 41.2%, beating source-unaware
recurrent entity networks by 2.4%.
Comment: Proceedings of the 2019 IJCAI Workshop SCAI: The 4th International
Workshop on Search-Oriented Conversational AI
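A heavily simplified sketch of the source-aware idea, using bag-of-words overlap in place of learned recurrent memories: keep one memory per information source (user / system / KB) and score a candidate response against each with its own weight. The utterances and weights below are hypothetical stand-ins for learned, source-specific parameters.

```python
def score_response(candidate, context):
    """Sketch of source-aware matching: one bag-of-words memory per
    information source, combined with per-source weights to score a
    candidate response."""
    memories = {"user": set(), "system": set(), "kb": set()}
    for source, utterance in context:
        memories[source].update(utterance.split())
    weights = {"user": 1.0, "system": 0.5, "kb": 2.0}  # hypothetical values
    cand = set(candidate.split())
    return sum(w * len(cand & memories[s]) for s, w in weights.items())

context = [("user", "i want italian food"),
           ("kb", "roma_restaurant italian cheap")]
score = score_response("roma_restaurant serves italian food", context)
print(score)  # 6.0
```

The point of separating the memories is that a word can matter differently depending on who produced it; here, matching a KB token counts twice as much as matching a user token.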
The Rapidly Changing Landscape of Conversational Agents
Conversational agents have become ubiquitous, ranging from goal-oriented
systems for helping with reservations to chit-chat models found in modern
virtual assistants. In this survey paper, we explore this fascinating field. We
look at some of the pioneering work that defined the field and gradually move
to the current state-of-the-art models. We look at statistical, neural,
generative adversarial network based and reinforcement learning based
approaches and how they evolved. Along the way we discuss various challenges
the field faces: the lack of context in utterances, the absence of a good
quantitative metric for comparing models, and the lack of trust in agents that
do not have a consistent persona. We structure this paper in a way that answers
these pertinent questions and discusses competing approaches to solve them.
Comment: 14 pages, 7 figures. arXiv admin note: text overlap with
arXiv:1704.07130, arXiv:1507.04808, arXiv:1603.06155, arXiv:1611.06997,
arXiv:1704.08966 by other authors
Information Seeking in the Spirit of Learning: a Dataset for Conversational Curiosity
Open-ended human learning and information-seeking are increasingly mediated
by digital assistants. However, such systems often ignore the user's
pre-existing knowledge. Assuming a correlation between engagement and user
responses such as "liking" messages or asking followup questions, we design a
Wizard-of-Oz dialog task that tests the hypothesis that engagement increases
when users are presented with facts related to what they know. Through
crowd-sourcing of this experiment, we collect and release 14K dialogs (181K
utterances) where users and assistants converse about geographic topics like
geopolitical entities and locations. This dataset is annotated with
pre-existing user knowledge, message-level dialog acts, grounding to Wikipedia,
and user reactions to messages. Responses using a user's prior knowledge
increase engagement. We incorporate this knowledge into a multi-task model that
reproduces human assistant policies and improves over a BERT content model by
13 mean reciprocal rank points.
Comment: EMNLP 2020: https://www.aclweb.org/anthology/2020.emnlp-main.655
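For reference, the mean reciprocal rank (MRR) metric behind the reported 13-point gain is standard: per query, take one over the rank of the first relevant item, then average. The candidate lists below are invented for illustration.

```python
def mean_reciprocal_rank(ranked_lists, gold_items):
    """MRR: for each query, score 1 / rank of the first relevant item in the
    ranked candidate list (0 if it never appears), then average over queries."""
    total = 0.0
    for preds, gold in zip(ranked_lists, gold_items):
        for rank, item in enumerate(preds, start=1):
            if item == gold:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)

# Two queries: gold answer ranked 2nd, then 1st -> (1/2 + 1) / 2 = 0.75
mrr = mean_reciprocal_rank([["a", "b", "c"], ["b", "a"]], ["b", "b"])
print(mrr)  # 0.75
```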