180 research outputs found

Spoken Language Interaction with Robots: Recommendations for Future Research

Author: Kennington Casey
Publication venue: 'IUScholarWorks'
Publication date: 01/01/2022

With robotics rapidly advancing, more effective human–robot interaction is increasingly needed to realize the full potential of robots for society. While spoken language must be part of the solution, our ability to provide spoken language interaction capabilities is still very limited. In this article, based on the report of an interdisciplinary workshop convened by the National Science Foundation, we identify key scientific and engineering advances needed to enable effective spoken language interaction with robotics. We make 25 recommendations, involving eight general themes: putting human needs first, better modeling the social and interactive aspects of language, improving robustness, creating new methods for rapid adaptation, better integrating speech and language with other communication modalities, giving speech and language components access to rich representations of the robot’s current knowledge and state, making all components operate in real time, and improving research infrastructure and resources. Research and development that prioritizes these topics will, we believe, provide a solid foundation for the creation of speech-capable robots that are easy and effective for humans to work with.

Spoken language interaction with robots: Recommendations for future research

Authors: Alwan A.; Amant R.S.; Artzi Y.; Bansal M.; Blankenship G.; Chai J.; Daumé H.; Dey D.; Espy-Wilson C.; Harper M.; Howard T.; Kennington C.; Kruijff-Korbayová I.; Manocha D.; Marge M.; Matuszek C.; Mead R.; Mooney R.; Moore R.K.; Ostendorf M.; Pon-Barry H.; Rudnicky A.I.; Scheutz M.; Sun T.; Tellex S.; Traum D.; Ward N.G.; Yu Z.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2022

Multimodal interaction with mobile devices : fusing a broad spectrum of modality combinations

Author: Wasinger Rainer
Publication venue: Fakultät 6 - Naturwissenschaftlich-Technische Fakultät I. Fachrichtung 6.2 - Informatik
Publication date: 01/01/2006

This dissertation presents a multimodal architecture for use in mobile scenarios such as shopping and navigation. It also analyses a wide range of feasible modality input combinations for these contexts. For this purpose, two interlinked demonstrators were designed for stand-alone use on mobile devices. Of particular importance was the design and implementation of a modality fusion module capable of combining input from a range of communication modes like speech, handwriting, and gesture. The implementation is able to account for confidence value biases arising within and between modalities and also provides a method for resolving semantically overlapped input. Tangible interaction with real-world objects and symmetric multimodality are two further themes addressed in this work. The work concludes with the results from two usability field studies that provide insight on user preference and modality intuition for different modality combinations, as well as user acceptance for anthropomorphized objects.

Automatic Understanding of ATC Speech: Study of Prospectives and Field Experiments for Several Controller Positions

Authors: Córdoba Herralde Ricardo de; D'haro Enríquez Luis Fernando; Fernández Martínez Fernando; Ferreiros López Javier; González Germán; Macías Guarasa Javier; Montero Martínez Juan Manuel; Pardo Muñoz José Manuel; Sama Valentin; San Segundo Hernández Rubén
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

Although there has been a lot of interest in recognizing and understanding air traffic control (ATC) speech, none of the published works have obtained detailed field data results. We have developed a system able to identify the language spoken and recognize and understand sentences in both Spanish and English. We also present field results for several in-tower controller positions. To the best of our knowledge, this is the first time that field ATC speech (not simulated) has been captured, processed, and analyzed. The use of stochastic grammars allows for the variations in the standard phraseology that appear in field data. The robust understanding algorithm developed has 95% concept accuracy from ATC text input. It also allows changes in the presentation order of the concepts and corrects errors created by the speech recognition engine, improving the percentage of fully correctly understood sentences by 17% and 25% absolute for English and Spanish, respectively, relative to the percentages of fully correctly recognized sentences. Errors due to the spontaneity of the speech are also analyzed and compared with read speech. A 96% word accuracy for read speech is reduced to 86% word accuracy for field ATC data for Spanish for the "clearances" task, confirming that field data is needed to estimate the performance of a system. A literature review and a critical discussion of the possibilities of speech recognition and understanding technology applied to ATC speech are also given.

Multimodal interaction with mobile devices : fusing a broad spectrum of modality combinations

Author: Wasinger Rainer
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 17/01/2008

This dissertation presents a multimodal architecture for use in mobile scenarios such as shopping and navigation. It also analyses a wide range of feasible modality input combinations for these contexts. For this purpose, two interlinked demonstrators were designed for stand-alone use on mobile devices. Of particular importance was the design and implementation of a modality fusion module capable of combining input from a range of communication modes like speech, handwriting, and gesture. The implementation is able to account for confidence value biases arising within and between modalities and also provides a method for resolving semantically overlapped input. Tangible interaction with real-world objects and symmetric multimodality are two further themes addressed in this work. The work concludes with the results from two usability field studies that provide insight on user preference and modality intuition for different modality combinations, as well as user acceptance for anthropomorphized objects.

Evaluating performance for procurement: A structured method for assessing the usability of future speech interfaces

Author: Martin Andrew Cruickshank Life
Publication venue: UCL (University College London)
Publication date: 01/01/1991

Procurement is a process by which organizations acquire equipment to enhance the effectiveness of their operations. Equipment will only enhance effectiveness if it is usable for its purpose in the work environment, i.e. if it enables tasks to be performed to the desired quality with acceptable costs to those who operate it. Procurement presents a requirement, then, for evaluations of the performance of human-machine work systems. This thesis is concerned with the provision of information to support procurers in performing such evaluations. The Ministry of Defence (an equipment procurer) has presented a particular requirement for a means of assessing the usability of speech interfaces in the establishment of the feasibility of computerized battlefield work systems. A structured method was developed to meet this requirement, the scope, notation and process of which sought to be explicit and proceduralized. The scope was specified in terms of a conceptualization of human-computer interaction: the method supported the development of representations of the task, device and user, which could be implemented as simulations and used in empirical evaluations of system performance. Notations for representations were proposed, and procedures enabling the use of the notations. The specification and implementation of the four sub-methods is described, and subsequent enhancement in the context of evaluations of speech interfaces for battlefield observation tasks. The complete method is presented. An evaluation of the method was finally performed with respect to the quality of the assessment output and costs to the assessor. The results suggested that the method facilitated systematic assessment, although some inadequacies were identified in the expression of diagnostic information which was recruited by the procedures, and in some of the procedures themselves. The research offers support for the use of structured human factors evaluation methods in procurement. Qualifications relate to the appropriate expression of knowledge of device-user interaction, and to the conflict between requirements for flexibility and low-level proceduralization.

Proceedings of QG2010: The Third Workshop on Question Generation

Authors: Boyer Kristy Elizabeth; Piwek Paul
Publication venue: questiongeneration.org
Publication date: 18/06/2010

These are the peer-reviewed proceedings of "QG2010, The Third Workshop on Question Generation". The workshop included a special track for "QGSTEC2010: The First Question Generation Shared Task and Evaluation Challenge". QG2010 was held as part of The Tenth International Conference on Intelligent Tutoring Systems (ITS2010).