12 research outputs found

    A Comparative Evaluation Methodology for NLG in Interactive Systems

    Get PDF
    Interactive systems have become an increasingly important type of application for deployment of NLG technology over recent years. At present, we do not yet have commonly agreed terminology or methodology for evaluating NLG within interactive systems. In this paper, we take steps towards addressing this gap by presenting a set of principles for designing new evaluations in our comparative evaluation methodology. We start with presenting a categorisation framework, giving an overview of different categories of evaluation measures, in order to provide standard terminology for categorising existing and new evaluation techniques. Background on existing evaluation methodologies for NLG and interactive systems is presented. The comparative evaluation methodology is presented. Finally, a methodology for comparative evaluation of NLG components embedded within interactive systems is presented in terms of the comparative evaluation methodology, using a specific task for illustrative purposes

    Survey on Evaluation Methods for Dialogue Systems

    Get PDF
    In this paper we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation is a crucial part during the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires. However, this tends to be very cost and time intensive. Thus, much work has been put into finding methods, which allow to reduce the involvement of human labour. In this survey, we present the main concepts and methods. For this, we differentiate between the various classes of dialogue systems (task-oriented dialogue systems, conversational dialogue systems, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for the dialogue systems and then by presenting the evaluation methods regarding this class

    Do You Mind? User Perceptions of Machine Consciousness

    Get PDF
    The prospect of machine consciousness cultivates controversy across media, academia, and industry. Assessing whether non-experts perceive technologies as conscious, and exploring the consequences of this perception, are yet unaddressed challenges in Human Computer Interaction (HCI). To address them, we surveyed 100 people, exploring their conceptualisations of consciousness and if and how they perceive consciousness in currently available interactive technologies. We show that many people already perceive a degree of consciousness in GPT-3, a voice chat bot, and a robot vacuum cleaner. Within participant responses we identified dynamic tensions between denial and speculation, thinking and feeling, interaction and experience, control and independence, and rigidity and spontaneity. These tensions can inform future research into perceptions of machine consciousness and the challenges it represents for HCI. With both empirical and theoretical contributions, this paper emphasises the importance of HCI in an era of machine consciousness, real, perceived or denied

    Do You Mind? User Perceptions of Machine Consciousness

    Get PDF
    The prospect of machine consciousness cultivates controversy across media, academia, and industry. Assessing whether non-experts perceive technologies as conscious, and exploring the consequences of this perception, are yet unaddressed challenges in Human Computer Interaction (HCI). To address them, we surveyed 100 people, exploring their conceptualisations of consciousness and if and how they perceive consciousness in currently available interactive technologies. We show that many people already perceive a degree of consciousness in GPT-3, a voice chat bot, and a robot vacuum cleaner. Within participant responses we identified dynamic tensions between denial and speculation, thinking and feeling, interaction and experience, control and independence, and rigidity and spontaneity. These tensions can inform future research into perceptions of machine consciousness and the challenges it represents for HCI. With both empirical and theoretical contributions, this paper emphasises the importance of HCI in an era of machine consciousness, real, perceived or denied

    Real user evaluation of spoken dialogue systems using Amazon Mechanical Turk

    No full text
    This paper describes a framework for evaluation of spoken dialogue systems. Typically, evaluation of dialogue systems is performed in a controlled test environment with carefully selected and instructed users. However, this approach is very demanding. An alternative is to recruit a large group of users who evaluate the dialogue systems in a remote setting under virtually no supervision. Crowdsourcing technology, for example Amazon Mechanical Turk (AMT), provides an efficient way of recruiting subjects. This paper describes an evaluation framework for spoken dialogue systems using AMT users and compares the obtained results with a recent trial in which the systems were tested by locally recruited users. The results suggest that the use of crowdsourcing technology is feasible and it can provide reliable results. Copyright © 2011 ISCA

    Producing Acoustic-Prosodic Entrainment in a Robotic Learning Companion to Build Learner Rapport

    Get PDF
    abstract: With advances in automatic speech recognition, spoken dialogue systems are assuming increasingly social roles. There is a growing need for these systems to be socially responsive, capable of building rapport with users. In human-human interactions, rapport is critical to patient-doctor communication, conflict resolution, educational interactions, and social engagement. Rapport between people promotes successful collaboration, motivation, and task success. Dialogue systems which can build rapport with their user may produce similar effects, personalizing interactions to create better outcomes. This dissertation focuses on how dialogue systems can build rapport utilizing acoustic-prosodic entrainment. Acoustic-prosodic entrainment occurs when individuals adapt their acoustic-prosodic features of speech, such as tone of voice or loudness, to one another over the course of a conversation. Correlated with liking and task success, a dialogue system which entrains may enhance rapport. Entrainment, however, is very challenging to model. People entrain on different features in many ways and how to design entrainment to build rapport is unclear. The first goal of this dissertation is to explore how acoustic-prosodic entrainment can be modeled to build rapport. Towards this goal, this work presents a series of studies comparing, evaluating, and iterating on the design of entrainment, motivated and informed by human-human dialogue. These models of entrainment are implemented in the dialogue system of a robotic learning companion. Learning companions are educational agents that engage students socially to increase motivation and facilitate learning. As a learning companion’s ability to be socially responsive increases, so do vital learning outcomes. A second goal of this dissertation is to explore the effects of entrainment on concrete outcomes such as learning in interactions with robotic learning companions. This dissertation results in contributions both technical and theoretical. Technical contributions include a robust and modular dialogue system capable of producing prosodic entrainment and other socially-responsive behavior. One of the first systems of its kind, the results demonstrate that an entraining, social learning companion can positively build rapport and increase learning. This dissertation provides support for exploring phenomena like entrainment to enhance factors such as rapport and learning and provides a platform with which to explore these phenomena in future work.Dissertation/ThesisDoctoral Dissertation Computer Science 201
    corecore