
Recognizing Uncertainty in Speech

Author: Pon-Barry Heather
Shieber Stuart M.
Publication venue: 'Hindawi Limited'
Publication date: 01/12/2010

We address the problem of inferring a speaker's level of certainty based on prosodic information in the speech signal, which has application in speech-based dialogue systems. We show that using phrase-level prosodic features centered around the phrases causing uncertainty, in addition to utterance-level prosodic features, improves our model's level of certainty classification. In addition, our models can be used to predict which phrase a person is uncertain about. These results rely on a novel method for eliciting utterances of varying levels of certainty that allows us to compare the utility of contextually-based feature sets. We elicit level of certainty ratings from both the speakers themselves and a panel of listeners, finding that there is often a mismatch between speakers' internal states and their perceived states, and highlighting the importance of this distinction.

arXiv.org e-Print Archive

Crossref

Harvard University - DASH

Springer - Publisher Connector

Directory of Open Access Journals

When Does Disengagement Correlate with Performance in Spoken Dialog Computer Tutoring?

Author: Forbes-Riley Kate
Litman Diane J
Publication venue: 'IOS Press'
Publication date: 01/01/2012

In this paper we investigate how student disengagement relates to two performance metrics in a spoken dialog computer tutoring corpus, both when disengagement is measured through manual annotation by a trained human judge, and also when disengagement is measured through automatic annotation by the system based on a machine learning model. First, we investigate whether manually labeled overall disengagement and six different disengagement types are predictive of learning and user satisfaction in the corpus. Our results show that although students’ percentage of overall disengaged turns negatively correlates both with the amount they learn and their user satisfaction, the individual types of disengagement correlate differently: some negatively correlate with learning and user satisfaction, while others don’t correlate with either metric at all. Moreover, these relationships change somewhat depending on student prerequisite knowledge level. Furthermore, using multiple disengagement types to predict learning improves predictive power. Overall, these manual label-based results suggest that although adapting to disengagement should improve both student learning and user satisfaction in computer tutoring, maximizing performance requires the system to detect and respond differently based on disengagement type. Next, we present an approach to automatically detecting and responding to user disengagement types based on their differing correlations with correctness. Investigation of our machine learning model of user disengagement shows that its automatic labels negatively correlate with both performance metrics in the same way as the manual labels. The similarity of the correlations across the manual and automatic labels suggests that the automatic labels are a reasonable substitute for the manual labels.
Moreover, the significant negative correlations themselves suggest that redesigning ITSPOKE to automatically detect and respond to disengagement has the potential to remediate disengagement and thereby improve performance, even in the presence of noise introduced by the automatic detection process.

D-Scholarship@Pitt

Low-level grounding in a multimodal mobile service robot conversational system using graphical models

Author: Alexander Anil
Drygajlo Andrzej
Prodanov Plamen
Richiardi Jonas
Publication date: 18/06/2018

The main task of a service robot with a voice-enabled communication interface is to engage a user in dialogue providing access to the services it is designed for. In managing such interaction, inferring the user goal (intention) from the request for a service at each dialogue turn is the key issue. In service robot deployment conditions, speech recognition limitations with noisy speech input and inexperienced users may jeopardize user goal identification. In this paper, we introduce a grounding state-based model motivated by reducing the risk of communication failure due to incorrect user goal identification. The model exploits the multiple modalities available in the service robot system to provide evidence for reaching grounding states. In order to handle the speech input as sufficiently grounded (correctly understood) by the robot, four proposed states have to be reached. Bayesian networks combining speech and non-speech modalities during user goal identification are used to estimate the probability that each grounding state has been reached. These probabilities serve as a base for detecting whether the user is attending to the conversation, as well as for deciding on an alternative input modality (e.g., buttons) when the speech modality is unreliable. The Bayesian networks used in the grounding model are specially designed for modularity and computationally efficient inference. The potential of the proposed model is demonstrated by comparing a conversational system for the mobile service robot RoboX employing only speech recognition for user goal identification, and a system equipped with multimodal grounding. The evaluation experiments use component and system level metrics for technical (objective) and user-based (subjective) evaluation with multimodal data collected during the conversations of the robot RoboX with user

RERO DOC Digital Library

The Importance of Sub-Utterance Prosody in Predicting Level of Certainty

Author: Pon-Barry Heather Roberta
Shieber Stuart M.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 22/02/2011

We present an experiment aimed at understanding how to optimally use acoustic and prosodic information to predict a speaker's level of certainty. With a corpus of utterances where we can isolate a single word or phrase that is responsible for the speaker's level of certainty, we use different sets of sub-utterance prosodic features to train models for predicting an utterance's perceived level of certainty. Our results suggest that using prosodic features of the word or phrase responsible for the level of certainty and of its surrounding context improves the prediction accuracy without increasing the total number of features when compared to using only features taken from the utterance as a whole.
Engineering and Applied Science

Harvard University - DASH

Identifying Uncertain Words within an Utterance via Prosodic Features

Author: Pon-Barry Heather Roberta
Shieber Stuart M.
Publication venue: 'International Speech Communication Association'
Publication date: 22/02/2011

We describe an experiment that investigates whether sub-utterance prosodic features can be used to detect uncertainty at the word level. That is, given an utterance that is classified as uncertain, we want to determine which word or phrase the speaker is uncertain about. We have a corpus of utterances spoken under varying degrees of certainty. Using combinations of sub-utterance prosodic features we train models to predict the level of certainty of an utterance. On a set of utterances that were perceived to be uncertain, we compare the predictions of our models for two candidate target word segmentations: (a) one with the actual word causing uncertainty as the proposed target word, and (b) one with a control word as the proposed target word. Our best model correctly identifies the word causing the uncertainty rather than the control word 91% of the time.
Engineering and Applied Science

Harvard University - DASH

Exploring affect-context dependencies for adaptive system development

Author: Diane J. Litman
Joel Tetreault
Kate Forbes-Riley
Mihai Rotaru
Publication date: 01/01/2007

We use χ² to investigate the context dependency of student affect in our computer tutoring dialogues, targeting uncertainty in student answers in 3 automatically monitorable contexts. Our results show significant dependencies between uncertain answers and specific contexts. Identification and analysis of these dependencies is our first step in developing an adaptive version of our dialogue system.

CiteSeerX

Crossref

D-Scholarship@Pitt

Content, Social, and Metacognitive Statements: An Empirical Study Comparing Human-Human and Human-Computer Tutorial Dialogue

Author: Campbell Gwendolyn E.
Dzikovska Myroslava O.
Harrison Katherine M.
Moore Johanna D.
Steinhauser Natalie B.
Taylor Leanne S.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

We present a study which compares human-human computer-mediated tutoring with two computer tutoring systems based on the same materials but differing in the type of feedback they provide. Our results show that there are significant differences in interaction style between human-human and human-computer tutoring, as well as between the two computer tutors, and that different dialogue characteristics predict learning gain in different conditions. We show that there are significant differences in the non-content statements that students make to human and computer tutors, but also to different types of computer tutors. These differences also affect which factors are correlated with learning gain and user satisfaction. We argue that ITS designers should pay particular attention to strategies for dealing with negative social and metacognitive statements, and also conduct further research on how interaction style affects human-computer tutoring.

Edinburgh Research Explorer

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Towards the Use of Dialog Systems to Facilitate Inclusive Education

Author: Callejas Zoraida
Griol Barres David
Molina López José Manuel
Sanchis de Miguel María Araceli
Publication venue: 'IGI Global'
Publication date: 01/01/2013

Continuous advances in the development of information technologies have led to the possibility of accessing learning contents from anywhere, at any time, and almost instantaneously. However, accessibility is not always the main objective in the design of educative applications, specifically to facilitate their adoption by disabled people. Different technologies have recently emerged to foster the accessibility of computers and new mobile devices, favoring a more natural communication between the student and the developed educative systems. This chapter describes innovative uses of multimodal dialog systems in education, with special emphasis on the advantages that they provide for creating inclusive applications and learning activities.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Adapting the use of attributes to the task environment in joint action: results and a model

Author: Bard Ellen
Guhe Markus
Publication date: 01/06/2008

Edinburgh Research Explorer