7,765 research outputs found
End-to-end optimization of goal-driven and visually grounded dialogue systems
End-to-end design of dialogue systems has recently become a popular research
topic thanks to powerful tools such as encoder-decoder architectures for
sequence-to-sequence learning. Yet, most current approaches cast human-machine
dialogue management as a supervised learning problem, aiming at predicting the
next utterance of a participant given the full history of the dialogue. This
vision is too simplistic to render the intrinsic planning problem inherent to
dialogue as well as its grounded nature, making the context of a dialogue
larger than the sole history. This is why only chit-chat and question answering
tasks have been addressed so far using end-to-end architectures. In this paper,
we introduce a Deep Reinforcement Learning method to optimize visually grounded
task-oriented dialogues, based on the policy gradient algorithm. This approach
is tested on a dataset of 120k dialogues collected through Mechanical Turk and
provides encouraging results at solving both the problem of generating natural
dialogues and the task of discovering a specific object in a complex picture
Individual and Domain Adaptation in Sentence Planning for Dialogue
One of the biggest challenges in the development and deployment of spoken
dialogue systems is the design of the spoken language generation module. This
challenge arises from the need for the generator to adapt to many features of
the dialogue domain, user population, and dialogue context. A promising
approach is trainable generation, which uses general-purpose linguistic
knowledge that is automatically adapted to the features of interest, such as
the application domain, individual user, or user group. In this paper we
present and evaluate a trainable sentence planner for providing restaurant
information in the MATCH dialogue system. We show that trainable sentence
planning can produce complex information presentations whose quality is
comparable to the output of a template-based generator tuned to this domain. We
also show that our method easily supports adapting the sentence planner to
individuals, and that the individualized sentence planners generally perform
better than models trained and tested on a population of individuals. Previous
work has documented and utilized individual preferences for content selection,
but to our knowledge, these results provide the first demonstration of
individual preferences for sentence planning operations, affecting the content
order, discourse structure and sentence structure of system responses. Finally,
we evaluate the contribution of different feature sets, and show that, in our
application, n-gram features often do as well as features based on higher-level
linguistic representations
Incremental LSTM-based Dialog State Tracker
A dialog state tracker is an important component in modern spoken dialog
systems. We present an incremental dialog state tracker, based on LSTM
networks. It directly uses automatic speech recognition hypotheses to track the
state. We also present the key non-standard aspects of the model that bring its
performance close to the state-of-the-art and experimentally analyze their
contribution: including the ASR confidence scores, abstracting scarcely
represented values, including transcriptions in the training data, and model
averaging
- …