21,350 research outputs found
BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems
We present a new algorithm that significantly improves the efficiency of
exploration for deep Q-learning agents in dialogue systems. Our agents explore
via Thompson sampling, drawing Monte Carlo samples from a Bayes-by-Backprop
neural network. Our algorithm learns much faster than common exploration
strategies such as -greedy, Boltzmann, bootstrapping, and
intrinsic-reward-based ones. Additionally, we show that spiking the replay
buffer with experiences from just a few successful episodes can make Q-learning
feasible when it might otherwise fail.Comment: 13 pages, 9 figure
Survey on Evaluation Methods for Dialogue Systems
In this paper we survey the methods and concepts developed for the evaluation
of dialogue systems. Evaluation is a crucial part during the development
process. Often, dialogue systems are evaluated by means of human evaluations
and questionnaires. However, this tends to be very cost and time intensive.
Thus, much work has been put into finding methods, which allow to reduce the
involvement of human labour. In this survey, we present the main concepts and
methods. For this, we differentiate between the various classes of dialogue
systems (task-oriented dialogue systems, conversational dialogue systems, and
question-answering dialogue systems). We cover each class by introducing the
main technologies developed for the dialogue systems and then by presenting the
evaluation methods regarding this class
Generic dialogue modeling for multi-application dialogue systems
We present a novel approach to developing interfaces for multi-application dialogue systems. The targeted interfaces allow transparent switching between a large number of applications within one system. The approach, based on the Rapid Dialogue Prototyping Methodology (RDPM) and the Vector Space model techniques from Information Retrieval, is composed of three main steps: (1) producing finalized dia
logue models for applications using the RDPM, (2) designing an application interaction hierarchy, and (3) navigating between the applications based on the user's application of interest
- …