4,828 research outputs found

User Simulation in Dialogue Systems using Inverse Reinforcement Learning

Author: Chandramohan Senthilkumar
Geist Matthieu
Lefèvre Fabrice
Pietquin Olivier
Publication venue: HAL CCSD
Publication date: 27/08/2011

International audience. Spoken Dialogue Systems (SDS) are man-machine interfaces which use natural language as the medium of interaction. Dialogue corpora collection for the purpose of training and evaluating dialogue systems is an expensive process. User simulators aim at simulating human users in order to generate synthetic data. Existing methods for user simulation mainly focus on generating data with the same statistical consistency as in some reference dialogue corpus. This paper outlines a novel approach for user simulation based on Inverse Reinforcement Learning (IRL). The task of building the user simulator is perceived as a task of imitation learning

HAL-CentraleSupelec

CiteSeerX

HAL-Rennes 1

Improving Search through A3C Reinforcement Learning based Conversational Agent

Author: EL Deci
G Shani
H Cuayhuitl
H Cuayáhuitl
J Wei
JS Bridle
RS Sutton
S Hochreiter
Publication date: 19/08/2018

We develop a reinforcement-learning-based search assistant that guides users through a set of actions and a sequence of interactions to help them realize their intent. Our approach caters to subjective search, where the user is seeking digital assets such as images, which is fundamentally different from tasks that have objective and limited search modalities. Labeled conversational data is generally not available for such search tasks, and training the agent through human interactions can be time-consuming. We propose a stochastic virtual user that impersonates a real user and can be used to sample user behavior efficiently to train the agent, which accelerates the bootstrapping of the agent. We develop an A3C-based, context-preserving architecture that enables the agent to provide contextual assistance to the user. We compare the A3C agent with Q-learning and evaluate its performance on the average rewards and state values it obtains with the virtual user in validation episodes. Our experiments show that the agent learns to achieve higher rewards and better states. Comment: 17 pages, 7 figures

arXiv.org e-Print Archive

Crossref

Apprentissage par Renforcement Inverse pour la Simulation d'Utilisateurs dans les Systèmes de Dialogue [Inverse Reinforcement Learning for User Simulation in Dialogue Systems]

Author: Chandramohan Senthilkumar
Geist Matthieu
Pietquin Olivier
Publication venue: HAL CCSD
Publication date: 23/06/2011

National audience. Dialogue systems are human-machine interfaces that use natural language as the medium of interaction. User simulation aims to simulate the behavior of a human user in order to generate dialogues artificially. This step is often essential because collecting and annotating dialogue corpora is a costly process, yet one that is necessary for applying machine learning methods (such as reinforcement learning, which can be used to learn the dialogue manager's policy). Existing user simulators mainly seek to produce user behaviors that are statistically consistent with the dialogue corpus. The contribution of this article is to use inverse reinforcement learning to build a new user simulator. This new approach is illustrated by simulating the behavior of an (artificial) user model on a three-attribute problem for a tourist information system. The behavior of the new user simulator is evaluated according to several metrics (from the interaction level to the dialogue level)

HAL-CentraleSupelec

HAL-Rennes 1