A Survey on Dialogue Systems: Recent Advances and New Frontiers
Dialogue systems have attracted more and more attention. Recent advances in
dialogue systems are overwhelmingly driven by deep learning techniques,
which have been employed to enhance a wide range of big data applications such
as computer vision, natural language processing, and recommender systems. For
dialogue systems, deep learning can leverage a massive amount of data to learn
meaningful feature representations and response generation strategies, while
requiring a minimum amount of hand-crafting. In this article, we give an
overview of these recent advances in dialogue systems from various perspectives
and discuss some possible research directions. In particular, we generally
divide existing dialogue systems into task-oriented and non-task-oriented
models, then detail how deep learning techniques help them with representative
algorithms and finally discuss some appealing research directions that can
bring the dialogue system research into a new frontier.
Comment: 13 pages. arXiv admin note: text overlap with arXiv:1703.01008 by
other author
The Design and Implementation of XiaoIce, an Empathetic Social Chatbot
This paper describes the development of Microsoft XiaoIce, the most popular
social chatbot in the world. XiaoIce is uniquely designed as an AI companion
with an emotional connection to satisfy the human need for communication,
affection, and social belonging. We take into account both intelligence quotient
(IQ) and emotional quotient (EQ) in system design, cast human-machine social
chat as decision-making over Markov Decision Processes (MDPs), and optimize
XiaoIce for long-term user engagement, measured in expected Conversation-turns
Per Session (CPS). We detail the system architecture and key components
including dialogue manager, core chat, skills, and an empathetic computing
module. We show how XiaoIce dynamically recognizes human feelings and states,
understands user intent, and responds to user needs throughout long
conversations. Since her launch in 2014, XiaoIce has communicated with over 660
million active users and succeeded in establishing long-term relationships with
many of them. Analysis of large scale online logs shows that XiaoIce has
achieved an average CPS of 23, which is significantly higher than that of other
chatbots and even human conversations.
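The engagement metric named in this abstract, Conversation-turns Per Session (CPS), can be illustrated with a minimal Python sketch. The log format (one turn count per session) and the function name `average_cps` are assumptions made for illustration; they are not taken from the XiaoIce system, and the numbers below are made up.

```python
from statistics import mean


def average_cps(turns_per_session):
    """Return the mean number of conversation turns per session.

    turns_per_session: list of integers, one turn count per logged session
    (hypothetical format, chosen only for this sketch).
    """
    if not turns_per_session:
        raise ValueError("need at least one session")
    return mean(turns_per_session)


# Hypothetical log of three sessions; not real XiaoIce data.
sessions = [4, 12, 20]
print(average_cps(sessions))
```

Averaging over logged sessions gives an empirical estimate of the expected CPS; what counts as one turn or one session is left to whoever prepares the log.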
Neural Matching Models for Question Retrieval and Next Question Prediction in Conversation
The recent boom of AI has seen the emergence of many human-computer
conversation systems such as Google Assistant, Microsoft Cortana, Amazon Echo
and Apple Siri. We introduce and formalize the task of predicting questions in
conversations, where the goal is to predict the new question that the user will
ask, given the past conversational context. This task can be modeled as a
"sequence matching" problem, where two sequences are given and the aim is to
learn a model that maps any pair of sequences to a matching probability. Neural
matching models, which adopt deep neural networks to learn sequence
representations and matching scores, have attracted immense research interest
from the information retrieval and natural language processing communities. In this
paper, we first study neural matching models for the question retrieval task,
which has been widely explored in the literature, although the effectiveness of
neural models for this task is relatively unstudied. We further evaluate the
neural matching models in the next question prediction task in conversations.
We have used the publicly available Quora data and Ubuntu chat logs in our
experiments. Our evaluations investigate the potential of neural matching
models with representation learning for question retrieval and next question
prediction in conversations. Experimental results show that neural matching
models perform well for both tasks.
Comment: Neu-IR 2017: The SIGIR 2017 Workshop on Neural Information Retrieval
(SIGIR Neu-IR 2017), Tokyo, Japan, August 7-11, 201
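The "sequence matching" formulation in this abstract, a model that maps any pair of sequences to a matching probability, can be sketched in Python as follows. This is a non-neural stand-in, not the paper's models: it scores a pair by bag-of-words cosine similarity and squashes the result with a logistic function. The names `cosine` and `match_probability` and the `weight` and `bias` values are hypothetical; a neural matching model would learn its representations and parameters from data.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def match_probability(seq_a, seq_b, weight=4.0, bias=-2.0):
    """Map a pair of token sequences to a score in (0, 1).

    weight and bias are hypothetical constants for this sketch.
    """
    sim = cosine(Counter(seq_a), Counter(seq_b))
    return 1.0 / (1.0 + math.exp(-(weight * sim + bias)))


# Rank hypothetical candidate next questions against a context question.
context = "how do i install python on ubuntu".split()
candidates = [
    "how do i upgrade python on ubuntu".split(),
    "what is the best pizza topping".split(),
]
ranked = sorted(candidates, key=lambda c: match_probability(context, c),
                reverse=True)
print(" ".join(ranked[0]))
```

The same interface serves both tasks described above: for question retrieval the second sequence is a stored question, and for next question prediction it is a candidate follow-up.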
A Hybrid Retrieval-Generation Neural Conversation Model
Intelligent personal assistant systems that are able to have multi-turn
conversations with human users are becoming increasingly popular. Most previous
research has been focused on using either retrieval-based or generation-based
methods to develop such systems. Retrieval-based methods have the advantage of
returning fluent and informative responses with great diversity. However, the
performance of the methods is limited by the size of the response repository.
On the other hand, generation-based methods can produce highly coherent
responses on any topic. But the generated responses are often generic and not
informative due to the lack of grounding knowledge. In this paper, we propose a
hybrid neural conversation model that combines the merits of both response
retrieval and generation methods. Experimental results on Twitter and
Foursquare data show that the proposed model outperforms both retrieval-based
methods and generation-based methods (including a recently proposed
knowledge-grounded neural conversation model) under both automatic evaluation
metrics and human evaluation. We hope that the findings in this study provide
new insights on how to integrate text retrieval and text generation models for
building conversation systems.
Comment: Accepted as a Full Paper in CIKM 2019. 10 page
Modeling Multi-turn Conversation with Deep Utterance Aggregation
Multi-turn conversation understanding is a major challenge for building
intelligent dialogue systems. This work focuses on retrieval-based response
matching for multi-turn conversation, where related work simply concatenates the
conversation utterances, ignoring the interactions among previous utterances
for context modeling. In this paper, we formulate previous utterances into
context using a proposed deep utterance aggregation model to form a
fine-grained context representation. In detail, a self-matching attention is
first introduced to route the vital information in each utterance. Then the
model matches a response with each refined utterance and the final matching
score is obtained after attentive turns aggregation. Experimental results show
our model outperforms the state-of-the-art methods on three multi-turn
conversation benchmarks, including a newly introduced e-commerce dialogue
corpus.
Comment: Proceedings of the 27th International Conference on Computational
Linguistics (COLING 2018
Neural Approaches to Conversational AI
The present paper surveys neural approaches to conversational AI that have
been developed in the last few years. We group conversational systems into
three categories: (1) question answering agents, (2) task-oriented dialogue
agents, and (3) chatbots. For each category, we present a review of
state-of-the-art neural approaches, draw the connection between them and
traditional approaches, and discuss the progress that has been made and
challenges still being faced, using specific systems and models as case
studies.
Comment: Foundations and Trends in Information Retrieval (95 pages
AI-Powered Text Generation for Harmonious Human-Machine Interaction: Current State and Future Directions
In the last two decades, the landscape of text generation has undergone
tremendous changes and is being reshaped by the success of deep learning. New
technologies for text generation ranging from template-based methods to neural
network-based methods emerged. Meanwhile, the research objectives have also
changed from generating smooth and coherent sentences to infusing personalized
traits to enrich the diversification of newly generated content. With the rapid
development of text generation solutions, a comprehensive survey is urgently
needed to summarize the achievements and track the state of the art. In this
survey paper, we present a general systematic framework, illustrate the widely
utilized models and summarize the classic applications of text generation.
Comment: Accepted by IEEE UIC 201
A Sequential Matching Framework for Multi-turn Response Selection in Retrieval-based Chatbots
We study the problem of response selection for multi-turn conversation in
retrieval-based chatbots. The task requires matching a response candidate with
a conversation context, whose challenges include how to recognize important
parts of the context, and how to model the relationships among utterances in
the context. Existing matching methods may lose important information in
contexts as we can interpret them with a unified framework in which contexts
are transformed to fixed-length vectors without any interaction with responses
before matching. The analysis motivates us to propose a new matching framework
that can sufficiently carry the important information in contexts to matching
and model the relationships among utterances at the same time. The new
framework, which we call a sequential matching framework (SMF), lets each
utterance in a context interact with a response candidate at the first step
and transforms the pair to a matching vector. The matching vectors are then
accumulated following the order of the utterances in the context with a
recurrent neural network (RNN) which models the relationships among the
utterances. The context-response matching is finally calculated with the hidden
states of the RNN. Under SMF, we propose a sequential convolutional network and
sequential attention network and conduct experiments on two public data sets to
test their performance. Experimental results show that both models can
significantly outperform the state-of-the-art matching methods. We also show
that the models are interpretable with visualizations that provide us insights
on how they capture and leverage the important information in contexts for
matching.
Comment: Submitted to Computational Linguistic
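The three steps of the sequential matching framework described in this abstract can be sketched in plain Python. This is a toy under stated assumptions, not the paper's sequential convolutional or attention network: the matching vector is a pair of token-overlap ratios, the RNN is a two-unit Elman cell, the weights `W_H`, `W_X`, and `W_OUT` are hand-picked rather than trained, and only the last hidden state is used for the final score. All function names are hypothetical.

```python
import math


def matching_vector(utterance, response):
    """Step 1: let one utterance interact with the response candidate.

    Toy features: share of utterance tokens found in the response,
    and share of response tokens found in the utterance.
    """
    u, r = set(utterance), set(response)
    shared = len(u & r)
    return [shared / len(u) if u else 0.0, shared / len(r) if r else 0.0]


def rnn_step(h, x, w_h, w_x):
    """One Elman RNN step: h' = tanh(W_h h + W_x x)."""
    return [
        math.tanh(sum(w_h[i][j] * h[j] for j in range(len(h)))
                  + sum(w_x[i][j] * x[j] for j in range(len(x))))
        for i in range(len(h))
    ]


def smf_score(context, response, w_h, w_x, w_out):
    """Score a response candidate against a multi-turn context."""
    h = [0.0] * len(w_h)
    # Step 2: accumulate matching vectors in utterance order.
    for utterance in context:
        h = rnn_step(h, matching_vector(utterance, response), w_h, w_x)
    # Step 3: compute the matching score from the (last) hidden state.
    logit = sum(w * v for w, v in zip(w_out, h))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical fixed weights; a real model would learn these.
W_H = [[0.5, 0.0], [0.0, 0.5]]
W_X = [[1.0, 0.0], [0.0, 1.0]]
W_OUT = [1.0, 1.0]

context = [["my", "wifi", "is", "down"], ["which", "ubuntu", "version"]]
good = ["restart", "the", "wifi", "service", "on", "ubuntu"]
bad = ["i", "like", "pizza"]
print(smf_score(context, good, W_H, W_X, W_OUT)
      > smf_score(context, bad, W_H, W_X, W_OUT))
```

The point of the design is that the response meets every utterance before any compression happens, so the recurrent state accumulates evidence of matching rather than a response-independent summary of the context.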
Improving Retrieval Modeling Using Cross Convolution Networks And Multi Frequency Word Embedding
To build a satisfying chatbot that is able to manage a
goal-oriented multi-turn dialogue, accurate modeling of human conversation is
crucial. In this paper we concentrate on the task of response selection for
multi-turn human-computer conversation with a given context. Previous
approaches show weakness in capturing information of rare keywords that appear
in either or both context and correct response, and struggle with long input
sequences. We propose Cross Convolution Network (CCN) and Multi Frequency word
embedding to address both problems. We train several models using the Ubuntu
Dialogue dataset, which is the largest freely available multi-turn based
dialogue corpus. We further build an ensemble model by averaging predictions of
multiple models. We achieve a new state-of-the-art on this dataset with
considerable improvements compared to previous best results.
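The ensembling step mentioned in this abstract, averaging the predictions of multiple models, can be sketched in Python as follows. The abstract does not say how the average is weighted, so uniform weighting is an assumption here; the function names `ensemble_scores` and `select_response` and the example scores are hypothetical.

```python
def ensemble_scores(model_scores):
    """Average per-candidate scores across models (uniform weights).

    model_scores: list of equal-length score lists, one list per model,
    where position k holds that model's score for response candidate k.
    """
    if not model_scores:
        raise ValueError("need at least one model")
    n = len(model_scores[0])
    if any(len(scores) != n for scores in model_scores):
        raise ValueError("all models must score the same candidates")
    return [sum(column) / len(model_scores) for column in zip(*model_scores)]


def select_response(model_scores):
    """Return the index of the candidate with the highest averaged score."""
    averaged = ensemble_scores(model_scores)
    return max(range(len(averaged)), key=averaged.__getitem__)


# Hypothetical scores from three models over three response candidates.
scores = [
    [0.2, 0.7, 0.1],
    [0.4, 0.5, 0.1],
    [0.9, 0.0, 0.1],
]
print(select_response(scores))
```

In this made-up example two of the three models prefer the second candidate, yet the averaged scores favor the first, which shows that averaging predictions is not the same as majority voting.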
RubyStar: A Non-Task-Oriented Mixture Model Dialog System
RubyStar is a dialog system designed to create "human-like" conversation by
combining different response generation strategies. RubyStar conducts a
non-task-oriented conversation on general topics by using an ensemble of
rule-based, retrieval-based and generative methods. Topic detection, engagement
monitoring, and context tracking are used for managing interaction. Predictable
elements of conversation, such as the bot's backstory and simple question
answering are handled by separate modules. We describe a rating scheme we
developed for evaluating response generation. We find that character-level RNN
is an effective generation model for general responses, with proper parameter
settings; however other kinds of conversation topics might benefit from using
other models.