80,500 research outputs found
Deep Active Learning for Dialogue Generation
We propose an online, end-to-end, neural generative conversational model for
open-domain dialogue. It is trained using a unique combination of offline
two-phase supervised learning and online human-in-the-loop active learning.
While most existing research proposes offline supervision or hand-crafted
reward functions for online reinforcement, we devise a novel interactive
learning mechanism based on hamming-diverse beam search for response generation
and one-character user-feedback at each step. Experiments show that our model
inherently promotes the generation of semantically relevant and interesting
responses, and can be used to train agents with customized personas, moods and
conversational styles.
Comment: Accepted at 6th Joint Conference on Lexical and Computational
Semantics (*SEM) 2017 (Previously titled "Online Sequence-to-Sequence Active
Learning for Open-Domain Dialogue Generation" on ArXiv
Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning
Training a task-completion dialogue agent via reinforcement learning (RL) is
costly because it requires many interactions with real users. One common
alternative is to use a user simulator. However, a user simulator usually lacks
the language complexity of human interlocutors and the biases in its design may
tend to degrade the agent. To address these issues, we present Deep Dyna-Q,
which to our knowledge is the first deep RL framework that integrates planning
for task-completion dialogue policy learning. We incorporate into the dialogue
agent a model of the environment, referred to as the world model, to mimic real
user response and generate simulated experience. During dialogue policy
learning, the world model is constantly updated with real user experience to
approach real user behavior, and in turn, the dialogue agent is optimized using
both real experience and simulated experience. The effectiveness of our
approach is demonstrated on a movie-ticket booking task in both simulated and
human-in-the-loop settings.
Comment: 11 pages, 8 figures, Accepted in ACL 201
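The interplay of real experience, world-model learning and planning described in this abstract can be illustrated with a minimal tabular Dyna-Q sketch. This is not the paper's method: Deep Dyna-Q uses neural networks for both the policy and the world model, whereas the sketch below assumes a lookup-table Q-function, a deterministic memorised model, and a hypothetical five-state chain environment (`chain_step`) standing in for the user.

```python
import random

def dyna_q(step, n_states, n_actions, start, episodes=50, planning_steps=5,
           alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Dyna-Q: learn from real steps, then replay simulated ones."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    model = {}  # (state, action) -> (reward, next_state, done), filled from real steps

    def update(s, a, r, s2, done):
        # One-step Q-learning backup, used for real and simulated experience alike.
        target = r if done else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])

    for _ in range(episodes):
        s, done = start, False
        while not done:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                best = max(Q[s])
                a = rng.choice([i for i, q in enumerate(Q[s]) if q == best])
            r, s2, done = step(s, a)           # real experience
            update(s, a, r, s2, done)          # direct reinforcement learning
            model[(s, a)] = (r, s2, done)      # world-model learning
            for _ in range(planning_steps):    # planning with simulated experience
                ps, pa = rng.choice(list(model))
                pr, ps2, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps2, pdone)
            s = s2
    return Q

def chain_step(s, a):
    """Hypothetical toy environment: action 1 moves right, action 0 moves left;
    reaching state 4 ends the episode with reward 1."""
    s2 = min(s + 1, 4) if a == 1 else max(s - 1, 0)
    done = s2 == 4
    return (1.0 if done else 0.0), s2, done

Q = dyna_q(chain_step, n_states=5, n_actions=2, start=0)
```

Raising `planning_steps` trades extra computation for fewer real interactions, which is the motivation the abstract gives for adding a world model.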
Learning functional sections in medical conversations: iterative pseudo-labeling and human-in-the-loop approach
Medical conversations between patients and medical professionals have
implicit functional sections, such as "history taking", "summarization",
"education", and "care plan." In this work, we are interested in learning to
automatically extract these sections. A direct approach would require
collecting large amounts of expert annotations for this task, which is
inherently costly due to the contextual inter-and-intra variability between
these sections. This paper presents an approach that tackles the problem of
learning to classify medical dialogue into functional sections without
requiring a large number of annotations. Our approach combines pseudo-labeling
and human-in-the-loop. First, we bootstrap using weak supervision with
pseudo-labeling to generate dialogue turn-level pseudo-labels and train a
transformer-based model, which is then applied to individual sentences to
create noisy sentence-level labels. Second, we iteratively refine
sentence-level labels using a cluster-based human-in-the-loop approach. Each
iteration requires only a few dozen annotator decisions. We evaluate the
results on an expert-annotated dataset of 100 dialogues and find that while our
models start with 69.5% accuracy, we can iteratively improve it to 82.5%. The
code used to perform all experiments described in this paper can be found here:
https://github.com/curai/curai-research/tree/main/functional-sections.
Comment: Changed the github link as it was invali
A Differentiable Generative Adversarial Network for Open Domain Dialogue
Paper presented at the IWSDS 2019: International Workshop on Spoken Dialogue Systems Technology, Siracusa, Italy, April 24-26, 2019.
This work presents a novel methodology to train open domain neural dialogue systems within the framework of Generative Adversarial Networks with gradient-based optimization methods. We avoid the non-differentiability of text-generating networks by approximating the word vector corresponding to each generated token via a top-k softmax. We show that a weighted average of the word vectors of the most probable tokens, computed from the probabilities resulting from the top-k softmax, is a good approximation of the word vector of the generated token. Finally, we demonstrate through a human evaluation process that training a neural dialogue system via adversarial learning with this method successfully discourages it from producing generic responses; instead, it tends to produce more informative and varied ones.
This work has been partially funded by the Basque Government under grant PRE_2017_1_0357, by the University of the Basque Country UPV/EHU under grant PIF17/310, and by the H2020 RIA EMPATHIC (Grant N: 769872).
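The top-k softmax approximation mentioned in this abstract can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the authors' implementation: the function name `soft_topk_embedding`, the choice of k, and the use of a plain softmax restricted to the k largest logits are all hypothetical details filled in for the example.

```python
import numpy as np

def soft_topk_embedding(logits, embeddings, k=3):
    """Approximate the generated token's word vector by a probability-weighted
    average of the word vectors of the k most probable tokens."""
    logits = np.asarray(logits, dtype=float)
    embeddings = np.asarray(embeddings, dtype=float)
    top = np.argsort(logits)[-k:]            # indices of the k largest logits
    z = logits[top] - logits[top].max()      # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()      # softmax over the top-k logits only
    return probs @ embeddings[top]           # weighted average, shape (dim,)

# Toy vocabulary of 4 tokens with one-hot "word vectors".
toy_embeddings = np.eye(4)
peaked = soft_topk_embedding([0.0, 10.0, 0.0, 0.0], toy_embeddings, k=2)
flat = soft_topk_embedding([1.0, 1.0, 1.0, 1.0], toy_embeddings, k=2)
```

When one token dominates, the result is close to that token's vector; when the distribution is flat, the result is a blend. Because every step is a smooth function of the logits for a fixed top-k set, gradients from a discriminator can flow back into the generator.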
PRESENCE: A human-inspired architecture for speech-based human-machine interaction
Recent years have seen steady improvements in the quality and performance of speech-based human-machine interaction driven by a significant convergence in the methods and techniques employed. However, the quantity of training data required to improve state-of-the-art systems seems to be growing exponentially and performance appears to be asymptotic to a level that may be inadequate for many real-world applications. This suggests that there may be a fundamental flaw in the underlying architecture of contemporary systems, as well as a failure to capitalize on the combinatorial properties of human spoken language. This paper addresses these issues and presents a novel architecture for speech-based human-machine interaction inspired by recent findings in the neurobiology of living systems. Called PRESENCE ("PREdictive SENsorimotor Control and Emulation"), this new architecture blurs the distinction between the core components of a traditional spoken language dialogue system and instead focuses on a recursive hierarchical feedback control structure. Cooperative and communicative behavior emerges as a by-product of an architecture that is founded on a model of interaction in which the system has in mind the needs and intentions of a user and a user has in mind the needs and intentions of the system.
Mapping wisdom as a complex adaptive system
This is the second of two papers concerning wisdom as an ecosystem appearing in sequential editions of Management & Marketing journal. The notion of wisdom as an ecosystem, or "the wisdom ecology", builds on work by Hays (2007), who first identified wisdom as an organisational construct and proposed a dynamic model of it. The centrepiece of this and its former companion paper is a relationship map of the Wisdom Ecosystem (the Causal Loop Diagram at Figure 1). The first paper, "The Ecology of Wisdom", introduced readers to the topics of wisdom and complex adaptive systems, and presented a dynamic model of the Wisdom Ecosystem. This second paper discusses systems dynamics modelling (mapping systems) and covers the Wisdom Ecosystem model in detail. It describes the four domains, or subsystems, of the Wisdom Ecosystem (Dialogue, Communal Mind, Collective Intelligence, and Wisdom) and walks readers through the model, exploring each of its 25 elements in turn. It examines the relationships amongst system elements and illuminates important aspects of systems function, providing a rare tutorial on developing and using Causal Loop Diagrams.
Keywords: Causal Loop Diagramming, Complexity, Dialogue, Organisational Learning, Systems Dynamics, Wisdom.
Personalizing Dialogue Agents via Meta-Learning
Existing personalized dialogue models use human designed persona descriptions
to improve dialogue consistency. Collecting such descriptions from existing
dialogues is expensive and requires hand-crafted feature designs. In this
paper, we propose to extend Model-Agnostic Meta-Learning (MAML) (Finn et al.,
2017) to personalized dialogue learning without using any persona descriptions.
Our model learns to quickly adapt to new personas by leveraging only a few
dialogue samples collected from the same user, which is fundamentally different
from conditioning the response on the persona descriptions. Empirical results
on the Persona-chat dataset (Zhang et al., 2018) indicate that our solution
outperforms non-meta-learning baselines using automatic evaluation metrics, and
in terms of human-evaluated fluency and consistency.
Comment: Accepted in ACL 2019. Zhaojiang Lin* and Andrea Madotto* contributed
equally to this wor