Object Referring in Visual Scene with Spoken Language
Object referring has important applications, especially for human-machine interaction. While the task has received great attention, it is mainly attacked with written language (text) as input rather than spoken language (speech), which is more natural. This paper investigates Object Referring with Spoken Language (ORSpoken) by presenting two datasets and one novel approach. Objects are annotated with their locations in images, text descriptions, and speech descriptions, making the datasets ideal for multi-modality learning. The approach is developed by carefully decomposing the ORSpoken problem into three sub-problems and introducing task-specific vision-language interactions at the corresponding levels. Experiments show that our method consistently and significantly outperforms competing methods. The approach is also evaluated in the presence of audio noise, demonstrating the efficacy of the proposed vision-language interaction methods in counteracting background noise.
Comment: 10 pages, Submitted to WACV 201
Advances in Human-Robot Interaction
Rapid advances in the field of robotics have made it possible to use robots not only in industrial automation but also in entertainment, rehabilitation, and home service. Since robots will likely affect many aspects of human existence, fundamental questions of human-robot interaction must be formulated and, if at all possible, resolved. Some of these questions are addressed in this collection of papers by leading HRI researchers.
A Proposal for Semantic Map Representation and Evaluation
Semantic mapping is the incremental process of "mapping" relevant information about the world (i.e., spatial information, temporal events, agents, and actions) to a formal description supported by a reasoning engine. Current research focuses on learning the semantics of environments based on their spatial location, geometry, and appearance. Many methods to tackle this problem have been proposed, but the lack of a uniform representation, as well as of standard benchmarking suites, prevents their direct comparison. In this paper, we propose a standardization of the representation of semantic maps by defining an easily extensible formalism to be used on top of metric maps of the environments. Based on this, we describe the procedure to build a dataset (based on real sensor data) for benchmarking semantic mapping techniques, also proposing some possible evaluation metrics. Finally, by providing a tool for the construction of semantic map ground truth, we aim to encourage the scientific community to contribute data for populating the dataset.
A Developmental Neuro-Robotics Approach for Boosting the Recognition of Handwritten Digits
Developmental psychology and neuroimaging research have identified a close link between numbers and fingers, which can boost initial number knowledge in children. Recent evidence shows that simulating children's embodied strategies can improve machine intelligence too. This article explores the application of embodied strategies to convolutional neural network models in the context of developmental neuro-robotics, where training information is likely to be acquired gradually during operation rather than being abundant and fully available as in classical machine learning scenarios. The experimental analyses show that proprioceptive information from the robot's fingers can improve network accuracy in the recognition of handwritten Arabic digits when training examples and epochs are few. This result is comparable to findings from brain imaging and longitudinal studies with young children. In conclusion, these findings also support the relevance of embodiment in the training of artificial agents and show a possible way toward the humanization of the learning process, in which the robotic body can express the internal processes of artificial intelligence, making them more understandable to humans.
Exploring miscommunication and collaborative behaviour in human-robot interaction
This paper presents the first step in designing a speech-enabled robot capable of naturally managing miscommunication. It describes the methods and results of two Wizard-of-Oz (WOz) studies in which dyads of naïve participants interacted in a collaborative task. The first WOz study explored human miscommunication management. The second study investigated how shared visual space and monitoring shape the processes of feedback and communication in task-oriented interactions. The results provide insights for the development of human-inspired, robust natural language interfaces in robots.