Training Dialogue Systems With Human Advice

Abstract

One major drawback of Reinforcement Learning (RL) Spoken Dialogue Systems is that they inherit the general exploration requirements of RL, which makes them hard to deploy from an industry perspective. Industrial systems, on the other hand, rely on human expertise and hand-written rules to prevent irrelevant behavior and maintain an acceptable experience from the user's point of view. In this paper, we attempt to bridge the gap between these two worlds by providing an easy way to incorporate all kinds of human expertise into the training phase of a Reinforcement Learning Dialogue System. Our approach, based on the TAMER framework, enables safe and efficient policy learning by combining the traditional Reinforcement Learning reward signal with an additional reward encoding expert advice. Experimental results show that our method leads to substantial improvements over more traditional Reinforcement Learning methods.
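The core idea of blending an environment reward with an expert-advice reward can be sketched as follows. This is a minimal illustration, assuming a simple additive combination with a weighting factor `beta`; the paper's exact TAMER-based formulation may differ, and all function and variable names here are hypothetical.

```python
def combined_reward(r_env, r_advice, beta=0.5):
    """Blend the environment reward with an expert-advice reward.

    Assumed additive shaping: r = r_env + beta * r_advice, where beta
    controls how strongly human advice influences learning.
    """
    return r_env + beta * r_advice


def q_update(Q, s, a, r_env, r_advice, s_next,
             alpha=0.1, gamma=0.99, beta=0.5):
    """Tabular Q-learning update driven by the combined reward signal.

    Q is a dict of dicts: Q[state][action] -> value.
    """
    r = combined_reward(r_env, r_advice, beta)
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q[s][a]


# Usage: one update step on a two-state toy problem.
Q = {"s0": {"a": 0.0}, "s1": {}}
q_update(Q, "s0", "a", r_env=1.0, r_advice=2.0, s_next="s1")
```

With `beta = 0` this reduces to standard Q-learning; larger values of `beta` bias exploration toward behavior the expert approves of, which is the mechanism the abstract credits for safer policy learning.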