2,727 research outputs found
Survey on Evaluation Methods for Dialogue Systems
In this paper, we survey the methods and concepts developed for the evaluation
of dialogue systems. Evaluation is a crucial part of the development process.
Often, dialogue systems are evaluated by means of human evaluations and
questionnaires; however, this tends to be very cost- and time-intensive. Thus,
much work has gone into finding methods that reduce the need for human labour.
In this survey, we present the main concepts and methods. For this, we
differentiate between the various classes of dialogue systems (task-oriented
dialogue systems, conversational dialogue systems, and question-answering
dialogue systems). We cover each class by introducing the main technologies
developed for those dialogue systems and then presenting the evaluation methods
for that class.
ProDial – an annotated proactive dialogue act corpus for conversational assistants using crowdsourcing
Proactive behaviour is an integral interaction concept in both human-human and human-computer cooperation. However, modelling proactive systems and appropriate interaction strategies remains an open question. In this work, a parameterised and annotated dialogue corpus has been created. The corpus is based on human interactions with an autonomous agent embedded in a serious game setting. For modelling proactive dialogue behaviour, the agent was capable of selecting from four different proactive actions (None, Notification, Suggestion, Intervention) in order to serve as the user’s personal advisor in a sequential planning task. Data was collected online using crowdsourcing (308 participants), resulting in a total of 3696 system-user exchanges. Data was annotated with objective features as well as subjectively self-reported features for capturing the interplay between proactive behaviour and situational as well as user-dependent characteristics. The corpus is intended for building a user model for developing trustworthy proactive interaction strategies.
An Analysis of Mixed Initiative and Collaboration in Information-Seeking Dialogues
The ability to engage in mixed-initiative interaction is one of the core
requirements for a conversational search system. How to achieve this is poorly
understood. We propose a set of unsupervised metrics, termed ConversationShape,
that highlights the role each of the conversation participants plays by
comparing the distribution of vocabulary and utterance types. Using
ConversationShape as a lens, we take a closer look at several conversational
search datasets and compare them with other dialogue datasets to better
understand the types of dialogue interaction they represent, either driven by
the information seeker or the assistant. We discover that deviations from the
ConversationShape of a human-human dialogue of the same type are predictive of
the quality of a human-machine dialogue.
Comment: SIGIR 2020 short conference paper
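The abstract above describes ConversationShape as comparing the distribution of vocabulary between conversation participants. A minimal sketch of that idea, assuming a simple bag-of-words distribution per participant and Jensen-Shannon divergence as the comparison measure (the exact metrics in the paper may differ):

```python
from collections import Counter
import math

def vocab_distribution(utterances):
    """Normalised token-frequency distribution over one participant's utterances."""
    counts = Counter(tok for u in utterances for tok in u.lower().split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2, bounded in [0, 1]) between two distributions."""
    vocab = set(p) | set(q)
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in vocab}
    def kl(a, b):
        return sum(a.get(t, 0.0) * math.log2(a.get(t, 0.0) / b[t])
                   for t in vocab if a.get(t, 0.0) > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Illustrative data: the larger the divergence, the more the two roles
# differ in vocabulary -- one facet of the "shape" of a conversation.
seeker = ["where can I find cheap flights", "what about hotels nearby"]
assistant = ["here are some flight options", "these hotels are nearby"]
d = jensen_shannon(vocab_distribution(seeker), vocab_distribution(assistant))
```

Comparing such per-role divergences across datasets gives an unsupervised signal of who drives the dialogue, which is the kind of role analysis the abstract describes.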
Becoming JILDA
The difficulty in finding useful dialogic data to train a conversational agent is an open issue even nowadays, when chatbots and spoken dialogue systems are widely used. For this reason we decided to build JILDA, a novel data collection of chat-based dialogues, produced by Italian native speakers and related to the job-offer domain. JILDA is the first dialogue collection related to this domain for the Italian language. Because of its collection modalities, we believe that JILDA can be a useful resource not only for the Italian research community, but also for the international one.
Evaluating Conversational Recommender Systems via User Simulation
Conversational information access is an emerging research area. Currently,
human evaluation is used for end-to-end system evaluation, which is very
time- and resource-intensive at scale, and thus becomes a bottleneck of
progress. As an alternative, we propose automated evaluation by means of
simulating users. Our user simulator aims to generate responses that a real
human would give by considering both individual preferences and the general
flow of interaction with the system. We evaluate our simulation approach on an
item recommendation task by comparing three existing conversational recommender
systems. We show that preference modeling and task-specific interaction models
both contribute to more realistic simulations, and can help achieve high
correlation between automatic evaluation measures and manual human assessments.
Comment: Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery
and Data Mining (KDD '20), 2020
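The abstract above combines preference modelling with an interaction model so a simulated user can answer a recommender in place of a human. A toy sketch of that idea, with all names and behaviours illustrative assumptions rather than the paper's actual simulator:

```python
import random

class PreferenceUserSimulator:
    """Toy user simulator: reacts to recommended items based on a preference
    profile (preference modelling) and a simple patience budget (a stand-in
    for an interaction model). All details here are illustrative."""

    def __init__(self, liked_genres, patience=3, seed=0):
        self.liked = set(liked_genres)
        self.patience = patience            # turns before the user gives up
        self.rng = random.Random(seed)      # seeded for reproducible runs

    def respond(self, recommended_item):
        """Return a simulated user reaction to one recommended item."""
        if self.patience <= 0:
            return "quit"                   # interaction model: user leaves
        self.patience -= 1
        if recommended_item["genre"] in self.liked:
            return "accept"                 # preference model: item matches
        return self.rng.choice(["reject", "ask for something else"])

sim = PreferenceUserSimulator(liked_genres={"sci-fi"})
r1 = sim.respond({"title": "Solaris", "genre": "sci-fi"})   # "accept"
```

Running many such simulated sessions against competing recommenders yields automatic measures (e.g. acceptance rate, turns to success) that can be correlated with human assessments, as the abstract proposes.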