Context-Aware Dialog Re-Ranking for Task-Oriented Dialog Systems
Dialog response ranking is used to rank response candidates by considering
their relation to the dialog history. Although researchers have addressed this
concept for open-domain dialogs, little attention has been focused on
task-oriented dialogs. Furthermore, no previous studies have analyzed whether
response ranking can improve the performance of existing dialog systems in real
human-computer dialogs with speech recognition errors. In this paper, we
propose a context-aware dialog response re-ranking system. Our system reranks
responses in two steps: (1) it calculates matching scores for each candidate
response and the current dialog context; (2) it combines the matching scores
and a probability distribution of the candidates from an existing dialog system
for response re-ranking. By using neural word embedding-based models and
handcrafted or logistic regression-based ensemble models, we have improved the
performance of a recently proposed end-to-end task-oriented dialog system on
real dialogs with speech recognition errors.
Comment: Accepted in IEEE SLT 2018. 8 pages, 3 figures
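The two-step re-ranking described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the softmax normalization of matching scores, and the fixed interpolation `weight` are all assumptions standing in for the handcrafted ensemble the abstract mentions.

```python
import math


def softmax(scores):
    """Turn raw matching scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rerank(candidates, system_probs, matching_scores, weight=0.5):
    """Return candidates sorted best-first by an interpolated score.

    system_probs: candidate distribution from the existing dialog system.
    matching_scores: context-response matching scores (step 1).
    weight: hypothetical hand-set interpolation weight (step 2).
    """
    match_probs = softmax(matching_scores)
    combined = [
        weight * p + (1.0 - weight) * m
        for p, m in zip(system_probs, match_probs)
    ]
    order = sorted(range(len(candidates)), key=lambda i: combined[i], reverse=True)
    return [candidates[i] for i in order]
```

Setting `weight=1.0` reproduces the original system's ranking, so the weight controls how much the matching scores are allowed to override the base system.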
ConveRT: Efficient and Accurate Conversational Representations from Transformers
General-purpose pretrained sentence encoders such as BERT are not ideal for
real-world conversational AI applications; they are computationally heavy,
slow, and expensive to train. We propose ConveRT (Conversational
Representations from Transformers), a pretraining framework for conversational
tasks satisfying all the following requirements: it is effective, affordable,
and quick to train. We pretrain using a retrieval-based response selection
task, effectively leveraging quantization and subword-level parameterization in
the dual encoder to build a lightweight memory- and energy-efficient model. We
show that ConveRT achieves state-of-the-art performance across widely
established response selection tasks. We also demonstrate that the use of
extended dialog history as context yields further performance gains. Finally,
we show that pretrained representations from the proposed encoder can be
transferred to the intent classification task, yielding strong results across
three diverse data sets. ConveRT trains substantially faster than standard
sentence encoders or previous state-of-the-art dual encoders. With its reduced
size and superior performance, we believe this model promises wider portability
and scalability for Conversational AI applications.
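Retrieval-based response selection with a dual encoder can be illustrated with a small sketch. It assumes the context and each candidate response have already been encoded into fixed-size vectors; the toy vectors, the function names, and the choice of cosine similarity as the scoring function are illustrative assumptions, not details taken from ConveRT.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_response(context_vec, response_vecs):
    """Score every candidate against the context and return (best_index, scores)."""
    scores = [cosine(context_vec, r) for r in response_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

Because the two sides are encoded independently, candidate response vectors can be computed once and cached, which is one reason dual encoders suit large-scale retrieval.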
A Survey on Dialogue Systems: Recent Advances and New Frontiers
Dialogue systems have attracted more and more attention. Recent advances on
dialogue systems are overwhelmingly contributed by deep learning techniques,
which have been employed to enhance a wide range of big data applications such
as computer vision, natural language processing, and recommender systems. For
dialogue systems, deep learning can leverage a massive amount of data to learn
meaningful feature representations and response generation strategies, while
requiring a minimum amount of hand-crafting. In this article, we give an
overview of these recent advances in dialogue systems from various perspectives
and discuss some possible research directions. In particular, we generally
divide existing dialogue systems into task-oriented and non-task-oriented
models, then detail how deep learning techniques help them with representative
algorithms and finally discuss some appealing research directions that can
bring the dialogue system research into a new frontier.
Comment: 13 pages. arXiv admin note: text overlap with arXiv:1703.01008 by
other authors
Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog
This paper presents a new model for visual dialog, Recurrent Dual Attention
Network (ReDAN), using multi-step reasoning to answer a series of questions
about an image. In each question-answering turn of a dialog, ReDAN infers the
answer progressively through multiple reasoning steps. In each step of the
reasoning process, the semantic representation of the question is updated based
on the image and the previous dialog history, and the recurrently-refined
representation is used for further reasoning in the subsequent step. On the
VisDial v1.0 dataset, the proposed ReDAN model achieves a new state-of-the-art
of 64.47% NDCG score. Visualization of the reasoning process further
demonstrates that ReDAN can locate context-relevant visual and textual clues
via iterative refinement, which can lead to the correct answer step-by-step.
Comment: Accepted to ACL 201
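For readers unfamiliar with the NDCG metric cited above, a generic version can be sketched as below. This uses the common log2 discount and an optional cutoff `k`; the VisDial v1.0 evaluation has its own protocol details that this sketch does not attempt to reproduce, and the function names are illustrative.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance values given in ranked order."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(ranked_relevances, k=None):
    """DCG of the predicted ranking divided by the DCG of the ideal ranking."""
    k = len(ranked_relevances) if k is None else k
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg(ideal[:k])
    if ideal_dcg <= 0:
        return 0.0
    return dcg(ranked_relevances[:k]) / ideal_dcg
```

A ranking that already lists candidates in decreasing relevance scores 1.0, and pushing a relevant candidate lower reduces the score by the logarithmic discount.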
DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset
We develop a high-quality multi-turn dialog dataset, DailyDialog, which is
intriguing in several aspects. The language is human-written and less noisy.
The dialogues in the dataset reflect the way we communicate daily and cover
various topics about our daily life. We also manually label the developed
dataset with communication intention and emotion information. Then, we evaluate
existing approaches on the DailyDialog dataset and hope it benefits the research
field of dialog systems.
Comment: accepted by IJCNLP 201
Learning Personalized End-to-End Goal-Oriented Dialog
Most existing works on dialog systems only consider conversation content
while neglecting the personality of the user the bot is interacting with, which
begets several unsolved issues. In this paper, we present a personalized
end-to-end model in an attempt to leverage personalization in goal-oriented
dialogs. We first introduce a Profile Model which encodes user profiles into
distributed embeddings and refers to conversation history from other similar
users. Then a Preference Model captures user preferences over knowledge base
entities to handle the ambiguity in user requests. The two models are combined
into the Personalized MemN2N. Experiments show that the proposed model achieves
qualitative performance improvements over state-of-the-art methods. As for
human evaluation, it also outperforms other approaches in terms of task
completion rate and user satisfaction.
Comment: Accepted by AAAI 201
Incorporating Loose-Structured Knowledge into Conversation Modeling via Recall-Gate LSTM
Modeling human conversations is essential for building satisfying chat-bots
with multi-turn dialog ability. Conversation modeling will notably benefit from
domain knowledge since the relationships between sentences can be clarified due
to semantic hints introduced by knowledge. In this paper, a deep neural network
is proposed to incorporate background knowledge for conversation modeling.
Through a specially designed Recall gate, domain knowledge can be transformed
into the extra global memory of Long Short-Term Memory (LSTM), so as to enhance
LSTM by cooperating with its local memory to capture the implicit semantic
relevance between sentences within conversations. In addition, this paper
introduces a loosely structured domain knowledge base, which can be built with
a small amount of manual work and easily adopted by the Recall gate. Our model
is evaluated on the context-oriented response selection task, and experimental
results on two datasets show that our approach is promising for
modeling human conversations and building key components of automatic chatting
systems.
Comment: under review of IJCNN 2017; 10 pages, 5 figures
A Survey of Document Grounded Dialogue Systems (DGDS)
Dialogue systems (DS) attract great attention from industry and academia
because of their wide application prospects. Researchers usually divide the DS
according to the function. However, many conversations require the DS to switch
between different functions. For example, movie discussion can change from
chit-chat to QA, the conversational recommendation can transform from chit-chat
to recommendation, etc. Therefore, classification according to functions may
not be enough to help us appreciate the current development trend. We classify
the DS based on background knowledge. Specifically, we study the latest DS based
on unstructured document(s). We define a Document Grounded Dialogue System
(DGDS) as a DS whose dialogues center on the given document(s). The
DGDS can be used in scenarios such as discussing merchandise against a product
manual, commenting on news reports, etc. We believe that extracting
unstructured document(s) information is the future trend of the DS because a
great amount of human knowledge lies in these document(s). The research of the
DGDS not only possesses a broad application prospect but also facilitates AI to
better understand human knowledge and natural language. We analyze the
classification, architecture, datasets, models, and future development trends
of the DGDS, hoping to help researchers in this field.
Comment: 30 pages, 4 figures, 13 tables
Towards Coherent and Engaging Spoken Dialog Response Generation Using Automatic Conversation Evaluators
Encoder-decoder based neural architectures serve as the basis of
state-of-the-art approaches in end-to-end open domain dialog systems. Since
most of such systems are trained with a maximum likelihood (MLE) objective, they
suffer from issues such as lack of generalizability and the generic response
problem, i.e., a system response that can be an answer to a large number of
user utterances, e.g., "Maybe, I don't know." Having explicit feedback on the
relevance and interestingness of a system response at each turn can be a useful
signal for mitigating such issues and improving system quality by selecting
responses from different approaches. Towards this goal, we present a system
that evaluates chatbot responses at each dialog turn for coherence and
engagement. Our system provides explicit turn-level dialog quality feedback,
which we show to be highly correlated with human evaluation. To show that
incorporating this feedback in the neural response generation models improves
dialog quality, we present two different and complementary mechanisms to
incorporate explicit feedback into a neural response generation model:
reranking and direct modification of the loss function during training. Our
studies show that a response generation model that incorporates these combined
feedback mechanisms produces more engaging and coherent responses in an
open-domain spoken dialog setting, significantly improving the response quality
using both automatic and human evaluation.
Dialog-based Interactive Image Retrieval
Existing methods for interactive image retrieval have demonstrated the merit
of integrating user feedback, improving retrieval results. However, most
current systems rely on restricted forms of user feedback, such as binary
relevance responses, or feedback based on a fixed set of relative attributes,
which limits their impact. In this paper, we introduce a new approach to
interactive image search that enables users to provide feedback via natural
language, allowing for more natural and effective interaction. We formulate the
task of dialog-based interactive image retrieval as a reinforcement learning
problem, and reward the dialog system for improving the rank of the target
image during each dialog turn. To mitigate the cumbersome and costly process of
collecting human-machine conversations as the dialog system learns, we train
our system with a user simulator, which is itself trained to describe the
differences between target and candidate images. The efficacy of our approach
is demonstrated in a footwear retrieval application. Experiments on both
simulated and real-world data show that 1) our proposed learning framework
achieves better accuracy than other supervised and reinforcement learning
baselines and 2) user feedback based on natural language rather than
pre-specified attributes leads to more effective retrieval results, and a more
natural and expressive communication interface.
Comment: accepted at NeurIPS 201