A Review of Verbal and Non-Verbal Human-Robot Interactive Communication
In this paper, an overview of human-robot interactive communication is
presented, covering verbal as well as non-verbal aspects of human-robot
interaction. Following a historical introduction and a motivation toward fluid
human-robot communication, ten desiderata are proposed, providing an organizing
axis for both recent and future research on human-robot communication. The ten
desiderata are then examined in detail, culminating in a unifying discussion
and a forward-looking conclusion.
Audio-Motor Integration for Robot Audition
In the context of robotics, audio signal processing in the wild means dealing with sounds recorded by a system that moves and whose actuators produce noise. This creates additional challenges in sound source localization, signal enhancement and recognition. But the specificity of such platforms also brings interesting opportunities: can information about the states of the robot's actuators be meaningfully integrated into the audio processing pipeline to improve performance and efficiency? While robot audition has grown into an established field, methods that explicitly use motor-state information as a modality complementary to audio remain scarce. This chapter proposes a unified view of this endeavour, referred to as audio-motor integration. A literature review and two learning-based methods for audio-motor integration in robot audition are presented, with applications to single-microphone sound source localization and ego-noise reduction on real data.
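As a concrete illustration of audio-motor integration, the simplest learning-based scheme is to append the robot's motor state to the per-frame audio features before feeding a regressor or classifier. The sketch below assumes nothing beyond that idea; the feature dimensions and motor-state contents are invented for illustration and are not the chapter's actual methods.

```python
import numpy as np

def fuse_audio_motor(audio_feat: np.ndarray, motor_state: np.ndarray) -> np.ndarray:
    """Concatenate per-frame audio features with a motor-state vector
    (e.g. head pan/tilt angles, joint velocities) so that a downstream
    learner can exploit both modalities jointly."""
    # Repeat the motor state for every audio frame, then stack along features.
    motor = np.tile(motor_state, (audio_feat.shape[0], 1))
    return np.concatenate([audio_feat, motor], axis=1)

# Hypothetical input: 100 frames of 40-dim audio features, 3 motor-state values.
fused = fuse_audio_motor(np.zeros((100, 40)), np.array([0.1, -0.2, 0.0]))
print(fused.shape)  # (100, 43)
```

The learning-based methods in the chapter are more elaborate; this only shows where motor-state information enters the pipeline.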
Vision-Guided Robot Hearing
Natural human-robot interaction (HRI) in complex and unpredictable environments is important and has many potential applications. While vision-based HRI has been thoroughly investigated, robot hearing and audio-based HRI are emerging research topics in robotics. In typical real-world scenarios, humans are at some distance from the robot, and hence the sensory (microphone) data are strongly impaired by background noise, reverberation and competing auditory sources. In this context, the detection and localization of speakers plays a key role that enables several tasks, such as improving the signal-to-noise ratio for speech recognition, speaker recognition, speaker tracking, etc. In this paper we address the problem of detecting and localizing people that are both seen and heard. We introduce a hybrid deterministic/probabilistic model. The deterministic component allows us to map 3D visual data onto a 1D auditory space. The probabilistic component of the model enables the visual features to guide the grouping of the auditory features in order to form audiovisual (AV) objects. The proposed model and the associated algorithms are implemented in real time (17 FPS) using a stereoscopic camera pair and two microphones embedded in the head of the humanoid robot NAO. We perform experiments with (i) synthetic data, (ii) publicly available data gathered with an audiovisual robotic head, and (iii) data acquired using the NAO robot. The results validate the approach and are an encouragement to investigate how vision and hearing could be further combined for robust HRI.
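A mapping from 3D visual space to a 1D auditory space, in the spirit of the deterministic component above, can be sketched with simple geometry: given a 3D source position, compute the interaural time difference (ITD) a microphone pair would observe. The positions, the 20 cm baseline and the use of ITD as the 1D auditory coordinate are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, at roughly room temperature

def expected_itd(source_xyz, mic_left, mic_right):
    """Map a 3D source position (e.g. from stereo vision) onto a 1D
    auditory coordinate: the interaural time difference (ITD), i.e. the
    difference in sound travel time to the two microphones."""
    d_left = np.linalg.norm(np.asarray(source_xyz) - np.asarray(mic_left))
    d_right = np.linalg.norm(np.asarray(source_xyz) - np.asarray(mic_right))
    return (d_left - d_right) / SPEED_OF_SOUND  # seconds; positive => source on the right

# Source 1 m to the right and 2 m ahead of a 20 cm microphone baseline.
itd = expected_itd([1.0, 2.0, 0.0], [-0.1, 0.0, 0.0], [0.1, 0.0, 0.0])
```

Auditory features that fall near the ITD predicted for a visually detected person can then be grouped with that person to form an audiovisual object.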
Integration of a voice recognition system in a social robot
Human-Robot Interaction (HRI) is one of the main fields in the study and research of robotics. Within this field, dialog systems and interaction by voice play a very important role. When speaking about natural human-robot dialog, we assume that the robot has the capability to accurately recognize the utterance that the human wants to transmit verbally, and even its semantic meaning, but this is not always achieved. In this paper we describe the steps and requirements that we went through in order to endow the personal social robot Maggie, developed at the University Carlos III of Madrid, with the capability of understanding natural language spoken by any human. We have analyzed the different possibilities offered by current software/hardware alternatives by testing them in real environments. We have obtained accurate data on speech recognition capabilities in different environments, using the most modern audio acquisition systems and analyzing less typical parameters such as user age, sex, intonation, volume and language. Finally, we propose a new model to classify recognition results as accepted or rejected, based on a second ASR opinion. This new approach takes into account the pre-calculated success rate in noise intervals for each recognition framework, decreasing the false positive and false negative rates. Funding was provided by the Spanish Government through the project "Peer to Peer Robot-Human Interaction" (R2H) of MEC (Ministry of Science and Education) and the project "A new approach to social robotics" (AROS) of MICINN (Ministry of Science and Innovation). The research leading to these results has received funding from the RoboCity2030-II-CM project (S2009/DPI-1559), funded by Programas de Actividades I+D en la Comunidad de Madrid and co-funded by Structural Funds of the EU.
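The accept/reject model described above can be sketched as a simple decision rule: require agreement between the primary and second-opinion recognizers, and weight the primary confidence by the success rate pre-measured for the current noise interval. All thresholds, noise intervals and rates below are hypothetical placeholders, not the values measured with Maggie.

```python
def accept_recognition(primary_text, secondary_text, primary_conf,
                       noise_db, success_rate_by_noise, threshold=0.5):
    """Accept an ASR result only if a second recognizer agrees and the
    confidence, scaled by the pre-measured success rate for the current
    noise interval, clears a threshold."""
    # Look up the success rate for the interval containing the noise level.
    rate = next((r for (lo, hi), r in success_rate_by_noise.items()
                 if lo <= noise_db < hi), 0.0)
    return primary_text == secondary_text and primary_conf * rate >= threshold

# Hypothetical per-interval success rates (dB ranges -> measured accuracy).
rates = {(0, 40): 0.95, (40, 60): 0.80, (60, 90): 0.50}
ok = accept_recognition("hello robot", "hello robot", 0.9, 35, rates)    # accepted
bad = accept_recognition("hello robot", "yellow robot", 0.9, 35, rates)  # rejected
```

Scaling confidence by the per-interval success rate means the same raw score is trusted less in noisier conditions, which is what reduces false positives.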
User localization during human-robot interaction
This paper presents a user localization system based on the fusion of visual information and sound source localization, implemented on a social robot called Maggie. One of the main requirements for natural interaction, both human-human and human-robot, is an adequate spatial arrangement between the interlocutors, that is, being oriented and situated at the right distance during the conversation in order to have a satisfactory communicative process. Our social robot uses a complete multimodal dialog system which manages the user-robot interaction during the communicative process. One of its main components is the user localization system presented here. To determine the most suitable position of the robot in relation to the user, a proxemic study of human-robot interaction is required, which is described in this paper. The study was carried out with two groups of users: children aged between 8 and 17, and adults. Finally, experimental results with the proposed multimodal dialog system are presented. The authors gratefully acknowledge the funds provided by the Spanish Government through the project "A new approach to social robotics" (AROS) of MICINN (Ministry of Science and Innovation).
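One minimal way to fuse an audio bearing with a visual one, in the spirit of the system described above, is a variance-weighted average of the two azimuth estimates. The angles and variances below are invented for illustration; the paper's actual fusion scheme is not specified here.

```python
def fuse_azimuths(az_audio, var_audio, az_vision, var_vision):
    """Variance-weighted fusion of two azimuth estimates (radians):
    the more reliable (lower-variance) modality dominates the result."""
    w_audio = 1.0 / var_audio
    w_vision = 1.0 / var_vision
    return (w_audio * az_audio + w_vision * az_vision) / (w_audio + w_vision)

# Vision is assumed four times more reliable here, so the fused bearing
# lands much closer to the visual estimate.
fused = fuse_azimuths(0.30, 0.04, 0.20, 0.01)  # 0.22 rad
```

The fused bearing can then drive the proxemic behavior: turn toward the user and approach to the distance the proxemic study found comfortable.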
Attentional mechanisms for socially interactive robots – a survey
This review intends to provide an overview of the state of the art in the modeling and implementation of automatic attentional mechanisms for socially interactive robots. Humans assess and exhibit intentionality by resorting to multisensory processes that are deeply rooted within low-level automatic attention-related mechanisms of the brain. For robots to engage with humans properly, they should also be equipped with similar capabilities. Joint attention, the precursor of many fundamental types of social interactions, has been an important focus of research in the past decade and a half, therefore providing the perfect backdrop for assessing the current status of state-of-the-art automatic attentional-based solutions. Consequently, we propose to review the influence of these mechanisms in the context of social interaction in cutting-edge research work on joint attention. This will be achieved by summarizing the contributions already made in these matters in robotic cognitive systems research, by identifying the main scientific issues to be addressed by these contributions and analyzing how successful they have been in this respect, and by consequently drawing conclusions that may suggest a roadmap for future successful research efforts
Optimized embedded artificial audition system for a mobile robot equipped with a microphone array
In an uncontrolled environment, a robot must be able to interact with people autonomously. This autonomy must also include interaction through the human voice. When the interaction takes place at a distance of a few metres, phenomena such as reverberation and ambient noise must be taken into account to perform tasks such as speech or speaker recognition effectively. To this end, the robot must be able to localize, track and separate the sound sources present in its environment.
The recent increase in processor computing power and the decrease in their energy consumption now make it possible to integrate these artificial audition systems into real-time embedded systems. Robot audition is a relatively young field with two main artificial audition libraries: ManyEars and HARK. Until now, the number of microphones has generally been limited to eight, because the computational load grows rapidly when additional microphones are added. Moreover, it is sometimes difficult to use these libraries with robots of varied geometries, since they must be calibrated manually.
This thesis presents the ODAS library, which addresses these difficulties. To make localization and separation more robust for closed microphone arrays, ODAS introduces a directivity model for each microphone. A hierarchical search over space also reduces the amount of computation required. In addition, a measure of the uncertainty of the sound's time of arrival is introduced to adjust several parameters automatically, avoiding manual calibration of the system.
ODAS also proposes a new sound source tracking module that uses Kalman filters rather than particle filters.
The results show that the proposed methods reduce the number of false detections during localization, improve tracking robustness for multiple sound sources, and increase separation quality by 2.7 dB in the case of a minimum-variance beamformer. The amount of computation required decreases by a factor of up to 4 for localization and up to 30 for tracking compared to the ManyEars library. The sound source separation module exploits the geometry of the microphone array more effectively, without the need to measure and manually calibrate the system.
With the observed performance, the ODAS library also opens the door to applications in the detection of drones by their noise, the localization of exterior sounds for more efficient navigation of autonomous vehicles, hands-free home assistants, and integration into hearing aids.
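The switch from particle filters to Kalman filters for source tracking can be illustrated with a minimal constant-velocity Kalman filter over a single source's azimuth. The state layout and the process/measurement noise values below are illustrative, not ODAS's actual parameters, and real trackers must also handle multiple sources and data association.

```python
import numpy as np

class AzimuthKalman:
    """Constant-velocity Kalman filter over one sound source's azimuth."""
    def __init__(self, q=1e-4, r=1e-2):
        self.x = np.zeros(2)                          # [azimuth, azimuth rate]
        self.P = np.eye(2)                            # state covariance
        self.F = np.array([[1.0, 1.0], [0.0, 1.0]])   # transition, dt = 1 frame
        self.Q = q * np.eye(2)                        # process noise
        self.H = np.array([[1.0, 0.0]])               # we observe azimuth only
        self.R = np.array([[r]])                      # measurement noise

    def step(self, z):
        # Predict the state forward one frame.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Update with a new localization measurement z (radians).
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]

kf = AzimuthKalman()
for z in [0.10, 0.11, 0.12, 0.13]:   # noisy azimuth measurements, radians
    est = kf.step(z)
```

A Kalman filter carries only a mean and covariance per source, which is far cheaper than propagating a cloud of particles and is one reason for the reported reduction in tracking computation.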
Control architecture for a telepresence and home-care assistance robot
The aging population is driving up the cost of hospital care. To keep these costs from becoming too high, telepresence robots that assist with care and daily activities are a way to preserve the autonomy of elderly people in their own homes. However, while current robots individually offer interesting capabilities, it would be beneficial to combine them. Such an integration is possible through a decision-making architecture that brings together navigation, voice tracking and information acquisition capabilities to assist the remote operator, or even to substitute for them.
For this project, the HBBA (Hybrid Behavior-Based Architecture) control architecture serves as the backbone for unifying the required libraries, RTAB-Map (Real-Time Appearance-Based Mapping) and ODAS (Open embeddeD Audition System), to achieve this integration. RTAB-Map is a library for simultaneous localization and mapping with various sensor configurations while meeting online processing constraints. ODAS is a library for localizing, tracking and separating sound sources in real environments. The objectives are to evaluate these capabilities in real conditions by deploying the robotic platform in different homes, and to assess the potential of such an integration by carrying out an autonomous assistance scenario for taking vital-sign measurements.
The Beam+ robotic platform is used for this integration. The platform is augmented with an RGB-D camera, an eight-microphone array, an onboard computer and additional batteries. The resulting implementation, named SAM, was evaluated in 10 homes to characterize navigation and conversation tracking. The navigation results suggest that the navigation capabilities work within constraints related to sensor placement and environmental conditions, requiring operator intervention to compensate. The voice-tracking modality works well in quiet environments, but improvements are needed in noisy ones. Consequently, achieving a fully autonomous assistance scenario depends on the combined performance of these capabilities, which makes it difficult to envisage removing the operator entirely from the decision loop. Integrating the modalities with HBBA proved feasible and conclusive, and opens the door to reusing the implementation on other robotic platforms that could compensate for the shortcomings observed with the Beam+ platform.
- âŠ