Automatic Speech Recognition in Air Traffic Control: a Human Factors Perspective

Author: Karlsson Joakim
Publication date: 01/12/1990

The introduction of Automatic Speech Recognition (ASR) technology into the Air Traffic Control (ATC) system has the potential to improve overall safety and efficiency. However, because ASR technology is inherently a part of the man-machine interface between the user and the system, the human factors issues involved must be addressed. Here, some of the human factors problems are identified and related methods of investigation are presented. Research at M.I.T.'s Flight Transportation Laboratory is being conducted from a human factors perspective, focusing on intelligent parser design, presentation of feedback, error correction strategy design, and optimal choice of input modalities.

NASA Technical Reports Server

New Technique to Enhance the Performance of Spoken Dialogue Systems by Means of Implicit Recovery of ASR Errors

Author: Griol David
López Cózar Ramón
Quesada Moreno José Francisco
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2010

This paper proposes a new technique to implicitly correct some ASR errors made by spoken dialogue systems, which is implemented at two levels: statistical and linguistic. The statistical level corrects errors using knowledge extracted from the analysis of a training corpus comprising utterances and their corresponding ASR results. The outcome of the analysis is a set of syntactic-semantic models and a set of lexical models, which are optimally selected during the correction. The goal of the correction at the linguistic level is to repair errors that were not detected at the statistical level and that affect the semantics of the sentences. Experiments carried out with a previously developed spoken dialogue system for the fast food domain indicate that the technique improves word accuracy, spoken language understanding and task completion by 8.5%, 16.54% and 44.17% absolute, respectively. Funding: Ministerio de Ciencia y Tecnología, TIN2007-64718 HAD.

idUS. Depósito de Investigación Universidad de Sevilla

Exploring miscommunication and collaborative behaviour in human-robot interaction

Author: Koulouri T
Lauria S
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2009

This paper presents the first step in designing a speech-enabled robot that is capable of natural management of miscommunication. It describes the methods and results of two Wizard-of-Oz (WOz) studies, in which dyads of naïve participants interacted in a collaborative task. The first WOz study explored human miscommunication management. The second study investigated how shared visual space and monitoring shape the processes of feedback and communication in task-oriented interactions. The results provide insights for the development of human-inspired and robust natural language interfaces in robots.

CiteSeerX

Brunel University Research Archive

Self-imitating Feedback Generation Using GAN for Computer-Assisted Pronunciation Training

Author: Chung Minhwa
Yang Seung Hee
Publication date: 20/04/2019

Self-imitating feedback is an effective and learner-friendly method for non-native learners in Computer-Assisted Pronunciation Training. Acoustic characteristics of native utterances are extracted, transplanted onto the learner's own speech input, and given back to the learner as corrective feedback. Previous work focused on speech conversion using prosodic transplantation techniques based on the PSOLA algorithm. Motivated by the visual differences found in spectrograms of native and non-native speech, we investigated applying a GAN to generate self-imitating feedback by exploiting the generator's ability acquired through adversarial training. Because this mapping is highly under-constrained, we also adopt a cycle consistency loss to encourage the output to preserve the global structure, which is shared by native and non-native utterances. Trained on 97,200 spectrogram images of short utterances produced by native and non-native speakers of Korean, the generator successfully transforms the non-native spectrogram input into a spectrogram with the properties of self-imitating feedback. Furthermore, the transformed spectrogram shows segmental corrections that cannot be obtained by prosodic transplantation. A perceptual test comparing the self-imitating and correcting abilities of our method with the baseline PSOLA method shows that the generative approach with cycle consistency loss is promising.

arXiv.org e-Print Archive

Crossref

SNU Open Repository and Archive

How the agent’s gender influence users’ evaluation of a QA system

Author: Dijk Betsy van
Hofs Dennis
Niculescu Andreea
Nijholt Anton
Publication venue: IEEE
Publication date: 01/01/2010

In this paper we present the results of a pilot study investigating the effects of agents' gender-ambiguous vs. gender-marked look on the perceived interaction quality of a multimodal question answering system. Eight test subjects interacted with three system agents, each having a feminine, masculine or gender-ambiguous look. The subjects were told that each agent represented a differently configured system. In fact, they were interacting with the same system. Afterwards, the subjects filled in an evaluation questionnaire and participated in an in-depth qualitative interview. The results showed that the user evaluation seemed to be influenced by the agent's gender look: the system represented by the feminine agent achieved, on average, the highest evaluation scores. The system represented by the gender-ambiguous agent, by contrast, was systematically rated lower. This outcome might be relevant for choosing an appropriate agent look, especially since many designers tend to develop gender-ambiguous characters for interactive interfaces to match various users' preferences. However, additional empirical evidence is needed to confirm our findings.

CiteSeerX

Crossref

University of Twente Research Information

Recovering from non-understanding errors in a conversational system

Author: Matheson Colin
Henderson Matthew
Oberlander Jon
Publication date: 01/01/2012

Edinburgh Research Explorer

Commanding and re-dictation: Developing eyes-free voice-based interaction for editing dictated text

Author: GHOSH Debjyoti
HARA Kotaro
LIU Can
ZHAO Shengdong
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/08/2020

Institutional Knowledge at Singapore Management University