Learning Health-Bots from Training Data that was Automatically Created using Paraphrase Detection and Expert Knowledge

Authors: Durand-Salmon Alexandre, Gardent Claire, Jolivet Philippe, Liednikova Anna
Publication venue: HAL CCSD
Publication date: 08/12/2020

A key bottleneck for developing dialog models is the lack of adequate training data. Due to privacy issues, dialog data is even scarcer in the health domain. We propose a novel method for creating dialog corpora, which we apply to create doctor-patient interaction data. We use this data to learn both a generation model and a hybrid classification/retrieval model, and find that the generation model consistently outperforms the hybrid model. We show that our data creation method has several advantages. Not only does it allow for the semi-automatic creation of large quantities of training data, it also provides a natural way of guiding learning and a novel method for assessing the quality of human-machine interactions.

INRIA a CCSD electronic archive server

Generating Challenge Datasets for Task-Oriented Conversational Agents through Self-Play

Authors: Guerini M., Majumdar S., Tekiroglu S. S.

End-to-end neural approaches are becoming increasingly common in conversational scenarios due to their promising performance when provided with a sufficient amount of data. In this paper, we present a novel methodology to address the interpretability of neural approaches in such scenarios by creating challenge datasets using dialogue self-play over multiple tasks/intents. Dialogue self-play allows generating large amounts of synthetic data; by taking advantage of the complete control over the generation process, we show how neural approaches can be evaluated in terms of unseen dialogue patterns. We propose several out-of-pattern test cases, each of which introduces a natural and unexpected user utterance phenomenon. As a proof of concept, we built a single and a multiple memory network, and show that these two architectures perform differently depending on the particular dialogue patterns.

Archivio della ricerca - Fondazione Bruno Kessler