The Rapidly Changing Landscape of Conversational Agents
Conversational agents have become ubiquitous, ranging from goal-oriented
systems for helping with reservations to chit-chat models found in modern
virtual assistants. In this survey paper, we explore this fascinating field. We
look at some of the pioneering work that defined the field and gradually move
to the current state-of-the-art models. We look at statistical, neural,
generative adversarial network based and reinforcement learning based
approaches and how they evolved. Along the way we discuss various challenges
that the field faces: lack of context in utterances, the absence of a good
quantitative metric for comparing models, and lack of trust in agents because
they do not have a consistent persona. We structure this paper in a way that
addresses these challenges and discusses competing approaches to solving
them.
Comment: 14 pages, 7 figures. arXiv admin note: text overlap with
arXiv:1704.07130, arXiv:1507.04808, arXiv:1603.06155, arXiv:1611.06997,
arXiv:1704.08966 by other authors
Investigation of Language Understanding Impact for Reinforcement Learning Based Dialogue Systems
Language understanding is a key component in a spoken dialogue system. In
this paper, we investigate how the language understanding module influences the
dialogue system performance by conducting a series of systematic experiments on
a task-oriented neural dialogue system in a reinforcement learning based
setting. The empirical study shows that among different types of language
understanding errors, slot-level errors can have more impact on the overall
performance of a dialogue system compared to intent-level errors. In addition,
our experiments demonstrate that the reinforcement learning based dialogue
system is able to learn when and what to confirm in order to achieve better
performance and greater robustness.
Comment: 5 pages, 5 figures
Learning Goal-Oriented Visual Dialog via Tempered Policy Gradient
Learning goal-oriented dialogues by means of deep reinforcement learning has
recently become a popular research topic. However, commonly used policy-based
dialogue agents often end up focusing on simple utterances and suboptimal
policies. To mitigate this problem, we propose a class of novel
temperature-based extensions for policy gradient methods, which are referred to
as Tempered Policy Gradients (TPGs). On a recent AI-testbed, i.e., the
GuessWhat?! game, we achieve significant improvements with two innovations. The
first one is an extension of the state-of-the-art solutions with Seq2Seq and
Memory Network structures that leads to an improvement of 7%. The second one is
the application of our newly developed TPG methods, which improves the
performance additionally by around 5% and, even more importantly, helps produce
more convincing utterances.
Comment: Published in IEEE Spoken Language Technology (SLT 2018), Athens,
Greece
Neural Approaches to Conversational AI
The present paper surveys neural approaches to conversational AI that have
been developed in the last few years. We group conversational systems into
three categories: (1) question answering agents, (2) task-oriented dialogue
agents, and (3) chatbots. For each category, we present a review of
state-of-the-art neural approaches, draw the connection between them and
traditional approaches, and discuss the progress that has been made and
challenges still being faced, using specific systems and models as case
studies.
Comment: Foundations and Trends in Information Retrieval (95 pages)
Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning
This paper presents a new method, adversarial advantage actor-critic
(Adversarial A2C), which significantly improves the efficiency of dialogue
policy learning in task-completion dialogue systems. Inspired by generative
adversarial networks (GAN), we train a discriminator to differentiate
responses/actions generated by dialogue agents from responses/actions by
experts. Then, we incorporate the discriminator as another critic into the
advantage actor-critic (A2C) framework, to encourage the dialogue agent to
explore the state-action space within the regions where the agent takes actions similar
to those of the experts. Experimental results in a movie-ticket booking domain
show that the proposed Adversarial A2C can accelerate policy exploration
efficiently.
Comment: 5 pages, 3 figures, ICASSP 201
Decoupling Strategy and Generation in Negotiation Dialogues
We consider negotiation settings in which two agents use natural language to
bargain on goods. Agents need to decide on both high-level strategy (e.g.,
proposing $50) and the execution of that strategy (e.g., generating "The bike
is brand new. Selling for just $50."). Recent work on negotiation trains
neural models, but their end-to-end nature makes it hard to control their
strategy, and reinforcement learning tends to lead to degenerate solutions. In
this paper, we propose a modular approach based on coarse dialogue acts
(e.g., propose(price=50)) that decouples strategy and generation. We show that
we can flexibly set the strategy using supervised learning, reinforcement
learning, or domain-specific knowledge without degeneracy, while our
retrieval-based generation can maintain context-awareness and produce diverse
utterances. We test our approach on the recently proposed DEALORNODEAL game,
and we also collect a richer dataset based on real items on Craigslist. Human
evaluation shows that our systems achieve higher task success rate and more
human-like negotiation behavior than previous approaches.
Comment: EMNLP 201
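The decoupling described in this abstract can be illustrated with a minimal sketch. Everything below is hypothetical and not the authors' code: the `DialogueAct` type, the rule-based `choose_act` strategy, its 10% acceptance threshold, and the `TEMPLATES` table are assumptions made for illustration. In particular, a fixed template lookup stands in for the retrieval-based generator mentioned in the abstract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DialogueAct:
    """Coarse dialogue act, e.g. propose(price=50). Hypothetical type."""
    intent: str
    price: Optional[int] = None


def choose_act(listing_price: int, last_offer: Optional[int]) -> DialogueAct:
    """Strategy step: decide WHAT to do, with no text involved.

    Assumed rule (not from the paper): open at the listing price, accept
    any offer within 10% of it, otherwise counter at the midpoint.
    """
    if last_offer is None:
        return DialogueAct("propose", listing_price)
    if last_offer >= 0.9 * listing_price:
        return DialogueAct("accept")
    return DialogueAct("propose", (listing_price + last_offer) // 2)


# Generation step: decide HOW to say it. A template table is a stand-in
# for the retrieval-based generator described in the abstract.
TEMPLATES = {
    "propose": "The bike is brand new. Selling for just ${price}.",
    "accept": "Deal.",
}


def generate(act: DialogueAct) -> str:
    """Render a dialogue act as an utterance."""
    return TEMPLATES[act.intent].format(price=act.price)


if __name__ == "__main__":
    act = choose_act(listing_price=50, last_offer=None)
    print(act, "->", generate(act))
```

Because `choose_act` only emits dialogue acts, it could be replaced by a supervised or reinforcement-learned policy without touching `generate`, which is the kind of flexibility the abstract attributes to the modular design.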
Discriminative Deep Dyna-Q: Robust Planning for Dialogue Policy Learning
This paper presents a Discriminative Deep Dyna-Q (D3Q) approach to improving
the effectiveness and robustness of Deep Dyna-Q (DDQ), a recently proposed
framework that extends the Dyna-Q algorithm to integrate planning for
task-completion dialogue policy learning. To obviate DDQ's high dependency on
the quality of simulated experiences, we incorporate an RNN-based discriminator
in D3Q to differentiate simulated experience from real user experience in order
to control the quality of training data. Experiments show that D3Q
significantly outperforms DDQ by controlling the quality of simulated
experience used for planning. The effectiveness and robustness of D3Q are
further demonstrated in a domain extension setting, where the agent's
capability of adapting to a changing environment is tested.
Comment: 11 pages, 10 figures, EMNLP 2018 long paper
Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning
Building a dialogue agent to fulfill complex tasks, such as travel planning,
is challenging because the agent has to learn to collectively complete multiple
subtasks. For example, the agent needs to reserve a hotel and book a flight so
that enough time is left for the commute between arrival and hotel check-in.
This paper addresses this challenge by formulating the task in the mathematical
framework of options over Markov Decision Processes (MDPs), and proposing a
hierarchical deep reinforcement learning approach to learning a dialogue
manager that operates at different temporal scales. The dialogue manager
consists of: (1) a top-level dialogue policy that selects among subtasks or
options, (2) a low-level dialogue policy that selects primitive actions to
complete the subtask given by the top-level policy, and (3) a global state
tracker that helps ensure all cross-subtask constraints are satisfied.
Experiments on a travel planning task with simulated and real users show that
our approach leads to significant improvements over three baselines, two based
on handcrafted rules and the other based on flat deep reinforcement learning.
Comment: 12 pages, 8 figures
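The three-part dialogue manager described above can be sketched structurally as follows. This is an illustrative assumption, not the paper's implementation: the paper learns both policies with hierarchical deep reinforcement learning, whereas the hand-written rules here are placeholders that only show how the top-level policy, low-level policy, and global state tracker interact. All names (`SUBTASK_SLOTS`, `GlobalStateTracker`, `run_episode`, the action strings) are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical subtasks and the slots each one must fill.
SUBTASK_SLOTS: Dict[str, List[str]] = {
    "book_flight": ["arrival_time"],
    "reserve_hotel": ["checkin_time"],
}


class GlobalStateTracker:
    """Tracks slots across subtasks and checks cross-subtask constraints."""

    def __init__(self) -> None:
        self.slots: Dict[str, int] = {}

    def update(self, slot: str, value: int) -> None:
        self.slots[slot] = value

    def subtask_done(self, subtask: str) -> bool:
        return all(s in self.slots for s in SUBTASK_SLOTS[subtask])

    def constraints_ok(self) -> bool:
        # Assumed constraint: hotel check-in must come after flight arrival.
        arrival = self.slots.get("arrival_time")
        checkin = self.slots.get("checkin_time")
        return arrival is None or checkin is None or checkin > arrival


def top_level_policy(tracker: GlobalStateTracker) -> Optional[str]:
    """Select the next unfinished subtask (placeholder for a learned policy)."""
    for subtask in SUBTASK_SLOTS:
        if not tracker.subtask_done(subtask):
            return subtask
    return None


def low_level_policy(subtask: str, tracker: GlobalStateTracker) -> str:
    """Select a primitive action for the subtask (placeholder rule)."""
    for slot in SUBTASK_SLOTS[subtask]:
        if slot not in tracker.slots:
            return slot
    raise ValueError(f"subtask {subtask} is already complete")


def run_episode(user_answers: Dict[str, int]) -> List[str]:
    """Run one dialogue against a scripted user; return the agent's actions."""
    tracker = GlobalStateTracker()
    actions: List[str] = []
    subtask = top_level_policy(tracker)
    while subtask is not None:
        # The low-level policy acts until the chosen subtask terminates.
        while not tracker.subtask_done(subtask):
            slot = low_level_policy(subtask, tracker)
            actions.append(f"request({slot})")
            tracker.update(slot, user_answers[slot])
        subtask = top_level_policy(tracker)
    if tracker.constraints_ok():
        actions.append("inform(itinerary)")
    else:
        actions.append("report_conflict(checkin_time)")
    return actions


if __name__ == "__main__":
    print(run_episode({"arrival_time": 14, "checkin_time": 16}))
```

The two nested loops mirror the two temporal scales: the outer loop picks a subtask, and the inner loop issues primitive actions until that subtask is complete.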
End-to-End Task-Completion Neural Dialogue Systems
One of the major drawbacks of modularized task-completion dialogue systems is
that each module is trained individually, which presents several challenges.
For example, downstream modules are affected by earlier modules, and the
performance of the entire system is not robust to the accumulated errors. This
paper presents a novel end-to-end learning framework for task-completion
dialogue systems to tackle such issues. Our neural dialogue system can directly
interact with a structured database to assist users in accessing information
and accomplishing certain tasks. The reinforcement learning based dialogue
manager offers robust capabilities to handle noise caused by other components
of the dialogue system. Our experiments in a movie-ticket booking domain show
that our end-to-end system not only outperforms modularized dialogue system
baselines in both objective and subjective evaluation, but is also robust to
noise, as demonstrated by several systematic experiments with different error
granularity and rates specific to the language understanding module.
Comment: 11 pages, IJCNLP 2017, arXiv admin note: substantial text overlap
with arXiv:1703.0705
Chat More If You Like: Dynamic Cue Words Planning to Flow Longer Conversations
Building an open-domain multi-turn conversation system is one of the most
interesting and challenging tasks in Artificial Intelligence. Many research
efforts have been dedicated to building such dialogue systems, yet few shed
light on modeling the conversation flow in an ongoing dialogue. Moreover,
people commonly talk about closely related aspects during a conversation,
and topics remain coherent while drifting naturally, which demonstrates the
necessity of dialogue flow modeling. To this end, we present a multi-turn,
cue-word-driven conversation system trained with reinforcement learning (RLCw),
which strives to select an adaptive cue word with the greatest future credit,
and therefore improve the quality of generated responses. We introduce a new
reward to measure the quality of cue words in terms of effectiveness and
relevance. To further optimize the model for long-term conversations, a
reinforcement learning approach is adopted in this paper. Experiments on a
real-life dataset demonstrate that our model consistently outperforms a set of
competitive baselines in terms of simulated turns, diversity and human
evaluation.