40,385 research outputs found
Automatic Speech Recognition in Air Traffic Control: a Human Factors Perspective
The introduction of Automatic Speech Recognition (ASR) technology into the Air Traffic Control (ATC) system has the potential to improve overall safety and efficiency. However, because ASR technology is inherently a part of the man-machine interface between the user and the system, the human factors issues involved must be addressed. Here, some of the human factors problems are identified and related methods of investigation are presented. Research at M.I.T.'s Flight Transportation Laboratory is being conducted from a human factors perspective, focusing on intelligent parser design, presentation of feedback, error correction strategy design, and optimal choice of input modalities.
New Technique to Enhance the Performance of Spoken Dialogue Systems by Means of Implicit Recovery of ASR Errors
This paper proposes a new technique to implicitly correct some ASR errors made by spoken dialogue systems, which is implemented at two levels: statistical and linguistic. The goal of the former level is to apply to the correction knowledge extracted from the analysis of a training corpus comprising utterances and their corresponding ASR results. The outcome of the analysis is a set of syntactic-semantic models and a set of lexical models, which are optimally selected during the correction. The goal of the correction at the linguistic level is to repair errors that were not detected at the statistical level and that affect the semantics of the sentences. Experiments carried out with a previously developed spoken dialogue system for the fast food domain indicate that the technique enhances word accuracy, spoken language understanding and task completion by 8.5%, 16.54% and 44.17% absolute, respectively. Ministerio de Ciencia y Tecnología TIN2007-64718 HAD
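As a rough illustration of the word accuracy figure reported above, the sketch below computes word accuracy as one minus the word-level edit distance divided by the reference length. This definition is an assumption (the abstract does not say how the metric was computed), and the function names and the fast-food utterances are hypothetical.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Assumed definition: 1 - (edit distance / number of reference words)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


# Hypothetical utterance, raw ASR output, and output after error correction.
reference = "i want two large burgers"
raw_asr = "i want to large burger"
corrected = "i want two large burger"

before = word_accuracy(reference, raw_asr)
after = word_accuracy(reference, corrected)
# An "absolute" gain is the plain difference between the two scores.
absolute_gain = after - before
```

An absolute improvement is the difference between the two scores in percentage points, not a percentage of the earlier score.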
Exploring miscommunication and collaborative behaviour in human-robot interaction
This paper presents the first step in designing a speech-enabled robot that is capable of natural management of miscommunication. It describes the methods and results of two WOz studies, in which dyads of naïve participants interacted in a collaborative task. The first WOz study explored human miscommunication management. The second study investigated how shared visual space and monitoring shape the processes of feedback and communication in task-oriented interactions. The results provide insights for the development of human-inspired and robust natural language interfaces in robots.
Self-imitating Feedback Generation Using GAN for Computer-Assisted Pronunciation Training
Self-imitating feedback is an effective and learner-friendly method for
non-native learners in Computer-Assisted Pronunciation Training. Acoustic
characteristics in native utterances are extracted and transplanted onto the
learner's own speech input, and given back to the learner as corrective
feedback. Previous works focused on speech conversion using prosodic
transplantation techniques based on the PSOLA algorithm. Motivated by the visual
differences found in spectrograms of native and non-native speech, we
investigated applying a GAN to generate self-imitating feedback by utilizing
the generator's ability through adversarial training. Because this mapping is
highly under-constrained, we also adopt cycle consistency loss to encourage the
output to preserve the global structure, which is shared by native and
non-native utterances. Trained on 97,200 spectrogram images of short utterances
produced by native and non-native speakers of Korean, the generator is able to
successfully transform the non-native spectrogram input to a spectrogram with
properties of self-imitating feedback. Furthermore, the transformed spectrogram
shows segmental corrections that cannot be obtained by prosodic
transplantation. A perceptual test comparing the self-imitating and correcting
abilities of our method with the baseline PSOLA method shows that the
generative approach with cycle consistency loss is promising.
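The cycle consistency loss mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common L1 formulation with two mappings, and the names `g` (non-native to native), `f` (native to non-native), and the toy "spectrogram" vectors are all hypothetical.

```python
def l1_mean(a, b):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(x, y, g, f):
    """Assumed L1 cycle loss: penalize f(g(x)) != x and g(f(y)) != y.

    x: sample from the non-native domain, y: sample from the native domain,
    g: non-native -> native mapping, f: native -> non-native mapping.
    """
    forward = l1_mean(f(g(x)), x)   # non-native -> native -> non-native
    backward = l1_mean(g(f(y)), y)  # native -> non-native -> native
    return forward + backward


# Toy stand-ins for generators: plain functions on flat lists of floats.
def g(v):
    return [2.0 * t for t in v]

def f_inverse(v):
    return [t / 2.0 for t in v]        # exactly undoes g

def f_biased(v):
    return [t / 2.0 + 1.0 for t in v]  # does not undo g

x = [0.0, 1.0, 2.0]  # hypothetical non-native features
y = [4.0, 6.0]       # hypothetical native features

loss_consistent = cycle_consistency_loss(x, y, g, f_inverse)
loss_inconsistent = cycle_consistency_loss(x, y, g, f_biased)
```

When the two mappings undo each other the loss is zero; any structure lost on the round trip raises it, which is what pushes the generator to preserve the global structure of the input.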
How the agent’s gender influence users’ evaluation of a QA system
In this paper we present the results of a pilot study investigating the effects of agents’ gender-ambiguous vs. gender-marked look on the perceived interaction quality of a multimodal question answering system. Eight test subjects interacted with three system agents, each having a feminine, masculine or gender-ambiguous look. The subjects were told each agent was representing a differently configured system. In fact, they were interacting with the same system. In the end, the subjects filled in an evaluation questionnaire and participated in an in-depth qualitative interview. The results showed that the user evaluation seemed to be influenced by the agent’s gender look: the system represented by the feminine agent achieved on average the highest evaluation scores. On the other hand, the system represented by the gender-ambiguous agent was systematically rated lower. This outcome might be relevant for choosing an appropriate agent look, especially since many designers tend to develop gender-ambiguous characters for interactive interfaces to match various users’ preferences. However, additional empirical evidence is needed in the future to confirm our findings.