
    A Review of Verbal and Non-Verbal Human-Robot Interactive Communication

    In this paper, an overview of human-robot interactive communication is presented, covering verbal as well as non-verbal aspects of human-robot interaction. Following a historical introduction and a motivation for fluid human-robot communication, ten desiderata are proposed, which provide an organizational axis for both recent and future research on human-robot communication. The ten desiderata are then examined in detail, culminating in a unifying discussion and a forward-looking conclusion.

    Audio-Motor Integration for Robot Audition

    In the context of robotics, audio signal processing in the wild amounts to dealing with sounds recorded by a system that moves and whose actuators produce noise. This creates additional challenges in sound source localization, signal enhancement and recognition. But the specificity of such platforms also brings interesting opportunities: can information about the state of the robot's actuators be meaningfully integrated into the audio processing pipeline to improve performance and efficiency? While robot audition has grown into an established field, methods that explicitly use motor-state information as a complementary modality to audio remain scarce. This chapter proposes a unified view of this endeavour, referred to as audio-motor integration. A literature review and two learning-based methods for audio-motor integration in robot audition are presented, with application to single-microphone sound source localization and ego-noise reduction on real data.
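    As a rough illustration of the general idea (not the chapter's actual methods), the sketch below concatenates motor-state features with audio features before a learned localizer; the feature dimensions, model choice and data are assumptions made purely for illustration.

```python
# Minimal sketch of audio-motor integration for learning-based source
# localization: motor-state features (e.g. joint angles and velocities) are
# concatenated with audio features before regression, so the model can
# account for ego-noise and the changing array pose. Illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def make_features(audio_feat: np.ndarray, motor_state: np.ndarray) -> np.ndarray:
    """Concatenate per-frame audio features with the robot's motor state."""
    return np.concatenate([audio_feat, motor_state], axis=-1)

# Hypothetical training data: per-frame audio features (e.g. spectral bins),
# motor states, and ground-truth source azimuths in degrees.
rng = np.random.default_rng(0)
X_audio = rng.normal(size=(500, 64))
X_motor = rng.normal(size=(500, 4))
y_azimuth = rng.uniform(-90, 90, size=500)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(make_features(X_audio, X_motor), y_azimuth)

# At test time the motor-state channel is fed alongside the audio features.
pred = model.predict(make_features(X_audio[:5], X_motor[:5]))
```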

    Vision-Guided Robot Hearing

    Natural human-robot interaction (HRI) in complex and unpredictable environments is important and has many potential applications. While vision-based HRI has been thoroughly investigated, robot hearing and audio-based HRI are emerging research topics in robotics. In typical real-world scenarios, humans are at some distance from the robot, and hence the sensory (microphone) data are strongly impaired by background noise, reverberation and competing auditory sources. In this context, the detection and localization of speakers plays a key role that enables several tasks, such as improving the signal-to-noise ratio for speech recognition, speaker recognition, speaker tracking, etc. In this paper we address the problem of how to detect and localize people that are both seen and heard. We introduce a hybrid deterministic/probabilistic model. The deterministic component allows us to map 3D visual data onto a 1D auditory space. The probabilistic component of the model enables the visual features to guide the grouping of the auditory features in order to form audiovisual (AV) objects. The proposed model and the associated algorithms are implemented in real time (17 FPS) using a stereoscopic camera pair and two microphones embedded into the head of the humanoid robot NAO. We perform experiments with (i) synthetic data, (ii) publicly available data gathered with an audiovisual robotic head, and (iii) data acquired using the NAO robot. The results validate the approach and are an encouragement to investigate how vision and hearing could be further combined for robust HRI.
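    The deterministic 3D-to-1D mapping can be pictured with the standard far-field approximation below, which maps a visually detected 3D position to an expected interaural time difference (ITD). The paper calibrates its own mapping for NAO's head, so the constants here are assumptions for illustration.

```python
# Illustrative far-field mapping from a 3D visual position to a 1D auditory
# coordinate (interaural time difference). The paper's actual 3D-to-1D
# mapping is calibrated for the robot; this is the textbook approximation.
import math

SPEED_OF_SOUND = 343.0   # m/s
MIC_BASELINE = 0.12      # m, assumed spacing between the two head microphones

def visual_to_itd(x: float, y: float, z: float) -> float:
    """Map a 3D point (head frame, z forward, x right) to an expected ITD (s)."""
    azimuth = math.atan2(x, z)                 # horizontal angle to the point
    return (MIC_BASELINE / SPEED_OF_SOUND) * math.sin(azimuth)

# A face detected 1 m ahead and 0.3 m to the right maps to a positive ITD,
# so auditory features with a similar ITD can be grouped with that person.
print(visual_to_itd(0.3, 0.0, 1.0))
```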

    Integration of a voice recognition system in a social robot

    Human-Robot Interaction (HRI) is one of the main fields in the study and research of robotics. Within this field, dialog systems and interaction by voice play a very important role. When speaking about natural human-robot dialog, we assume that the robot can accurately recognize the utterance the human wants to transmit verbally, and even its semantic meaning, but this is not always achieved. In this paper we describe the steps and requirements that we went through in order to endow the personal social robot Maggie, developed at the University Carlos III of Madrid, with the capability of understanding natural language spoken by any human. We have analyzed the different possibilities offered by current software/hardware alternatives by testing them in real environments. We have obtained accurate data on speech recognition capabilities in different environments, using modern audio acquisition systems and analyzing less commonly studied parameters such as user age, sex, intonation, volume and language. Finally, we propose a new model to classify recognition results as accepted or rejected, based on a second ASR opinion. This new approach takes into account the pre-calculated success rate in noise intervals for each recognition framework, decreasing the false positive and false negative rates. The funds were provided by the Spanish Government through the project "Peer to Peer Robot-Human Interaction" (R2H) of MEC (Ministry of Science and Education), and the project "A new approach to social robotics" (AROS) of MICINN (Ministry of Science and Innovation). The research leading to these results has received funding from the RoboCity2030-II-CM project (S2009/DPI-1559), funded by Programas de Actividades I+D en la Comunidad de Madrid and cofunded by Structural Funds of the EU.
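    The abstract does not spell out the decision rule, but a plausible minimal sketch of a second-opinion accept/reject scheme, assuming per-noise-interval success rates measured offline for the primary recognizer, could look like this (all names, rates and thresholds are illustrative):

```python
# Hedged sketch of a "second ASR opinion" accept/reject rule: a hypothesis is
# accepted only if a second recognizer agrees AND the primary recognizer's
# pre-measured success rate in the current noise interval is high enough.
# The paper's actual decision model may differ.
SUCCESS_RATE = {"low": 0.95, "mid": 0.85, "high": 0.60}  # assumed, per noise band

def noise_interval(noise_db: float) -> str:
    """Bucket the measured ambient noise level into a calibration interval."""
    if noise_db < 45:
        return "low"
    if noise_db < 65:
        return "mid"
    return "high"

def accept(hyp_primary: str, hyp_secondary: str, noise_db: float,
           threshold: float = 0.75) -> bool:
    """Accept only on agreement between recognizers in a reliable noise band."""
    return (hyp_primary == hyp_secondary
            and SUCCESS_RATE[noise_interval(noise_db)] >= threshold)

print(accept("hello maggie", "hello maggie", noise_db=50))   # True
print(accept("hello maggie", "yellow magpie", noise_db=50))  # False
```

    Requiring agreement lowers false positives, while the noise-interval gate rejects outputs produced in conditions where the recognizer is known to be unreliable, lowering false negatives from over-eager rejection in quiet conditions.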

    User localization during human-robot interaction

    This paper presents a user localization system based on the fusion of visual information and sound source localization, implemented on a social robot called Maggie. One of the main requisites for natural interaction, whether human-human or human-robot, is an adequate spatial arrangement between the interlocutors; that is, being oriented toward each other and situated at the right distance during the conversation in order to have a satisfactory communicative process. Our social robot uses a complete multimodal dialog system which manages the user-robot interaction during the communicative process. One of its main components is the presented user localization system. To determine the most suitable placement of the robot in relation to the user, a proxemic study of human-robot interaction is required, which is described in this paper. The study was conducted with two groups of users: children aged between 8 and 17, and adults. Finally, at the end of the paper, experimental results with the proposed multimodal dialog system are presented. The authors gratefully acknowledge the funds provided by the Spanish Government through the project "A new approach to social robotics" (AROS) of MICINN (Ministry of Science and Innovation).
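    One simple way to picture the audio-visual fusion is an inverse-variance weighted combination of the azimuth estimates from each modality; this is an illustrative sketch under that assumption, not necessarily the fusion rule used by the system.

```python
# Minimal sketch of audio-visual user localization by fusing two azimuth
# estimates, each weighted by the inverse of its variance. Illustrative only.
def fuse_azimuths(az_vision: float, var_vision: float,
                  az_audio: float, var_audio: float) -> float:
    """Inverse-variance weighted fusion of vision- and audio-based azimuths."""
    w_v = 1.0 / var_vision
    w_a = 1.0 / var_audio
    return (w_v * az_vision + w_a * az_audio) / (w_v + w_a)

# Vision is usually more precise when the user is inside the camera's field
# of view; audio keeps providing an estimate when the user is off-screen.
print(fuse_azimuths(az_vision=12.0, var_vision=2.0,
                    az_audio=18.0, var_audio=10.0))  # 13.0, pulled toward vision
```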

    Embedded artificial audition system optimized for a mobile robot equipped with a microphone array

    In an uncontrolled environment, a robot must be able to interact with people autonomously. This autonomy must also include interaction through the human voice. When the interaction takes place at a distance of a few meters, phenomena such as reverberation and ambient noise must be taken into account to perform tasks such as speech or speaker recognition effectively. To this end, the robot must be able to localize, track and separate the sound sources present in its environment. The recent increase in processor computing power and the decrease in their energy consumption now make it possible to integrate these artificial audition systems into real-time embedded systems. Robot audition is a relatively young field with two main artificial audition libraries: ManyEars and HARK. Until now, the number of microphones has generally been limited to eight, because the computational load increases rapidly as microphones are added. Moreover, it is sometimes difficult to use these libraries with robots of varied geometries, since they must be calibrated manually. This thesis presents the ODAS library, which provides solutions to these difficulties. To perform localization and separation that are more robust for closed microphone arrays, ODAS introduces a directivity model for each microphone. A hierarchical search over space also reduces the amount of computation required. In addition, a measure of the uncertainty of the sound's time of arrival is introduced to automatically adjust several parameters and thus avoid manual calibration of the system. ODAS also proposes a new sound source tracking module that uses Kalman filters rather than particle filters. The results show that the proposed methods reduce the number of false detections during localization, improve tracking robustness for multiple sound sources, and increase separation quality by 2.7 dB in the case of a minimum variance beamformer. The amount of computation required decreases by a factor of up to 4 for localization and up to 30 for tracking compared to the ManyEars library. The sound source separation module exploits the geometry of the microphone array more effectively, without the need to measure and manually calibrate the system. With the observed performance, the ODAS library also opens the door to applications in noise-based drone detection, localization of outdoor sounds for more efficient navigation of autonomous vehicles, hands-free home assistants, and integration into hearing aids.
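    To illustrate why Kalman filtering is attractive for source tracking compared to particle filtering, the generic constant-velocity Kalman filter below tracks a source azimuth with a handful of small matrix operations per frame, rather than propagating and resampling hundreds of particles. ODAS's actual state and noise models are not reproduced here; all constants are assumed.

```python
# Generic constant-velocity Kalman filter over a source's azimuth.
# State x = [azimuth (deg), angular rate (deg/s)]; we observe azimuth only.
import numpy as np

dt = 0.1                                  # frame period (s), assumed
F = np.array([[1.0, dt], [0.0, 1.0]])     # state transition
H = np.array([[1.0, 0.0]])                # observation matrix
Q = np.diag([1e-4, 1e-3])                 # process noise (assumed)
R = np.array([[4.0]])                     # measurement noise, deg^2 (assumed)

x = np.array([[0.0], [0.0]])              # initial state
P = np.eye(2) * 10.0                      # initial covariance

def kalman_step(x, P, z):
    """One predict/update cycle with a scalar azimuth measurement z (deg)."""
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    S = H @ P @ H.T + R                   # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
    x = x + K @ (np.array([[z]]) - H @ x)
    P = (np.eye(2) - K @ H) @ P
    return x, P

for z in [10.2, 11.1, 9.8, 12.0]:         # noisy azimuth observations
    x, P = kalman_step(x, P, z)
print(x.ravel())                          # filtered azimuth and angular rate
```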

    Control architecture for a telepresence and home care assistance robot

    The aging population is driving up the cost of hospital care. To keep these costs from growing too large, telepresence robots that assist with care and daily activities are a conceivable way to help elderly people maintain their autonomy at home. However, while current robots individually offer interesting capabilities, it would be beneficial to combine them. Such an integration is possible through a decision-making architecture that couples navigation, voice tracking and information acquisition capabilities in order to assist the remote operator, or even substitute for them. In this project, the HBBA (Hybrid Behavior-Based Architecture) control architecture serves as the backbone that unifies the required libraries, RTAB-Map (Real-Time Appearance-Based Mapping) and ODAS (Open embeddeD Audition System), to achieve this integration. RTAB-Map is a library for simultaneous localization and mapping with various sensor configurations that respects online processing constraints. ODAS is a library for localizing, tracking and separating sound sources in real environments. The objectives are to evaluate these capabilities in real settings by deploying the robotic platform in different homes, and to assess the potential of such an integration by carrying out an autonomous scenario of assistance with taking vital signs. The Beam+ robotic platform is used for this integration, augmented with an RGB-D camera, an eight-microphone array, a computer and additional batteries. The resulting implementation, named SAM, was evaluated in 10 homes to characterize navigation and conversation tracking. The navigation results suggest that the navigation capabilities work within constraints specific to sensor placement and environmental conditions, requiring operator intervention to compensate. The voice tracking modality works well in quiet environments, but improvements are needed in noisy settings. Consequently, achieving a fully autonomous assistance scenario depends on the combined performance of these functionalities, which makes it difficult to envision completely removing the operator from the decision loop. Integrating the modalities with HBBA proved feasible and conclusive, and opens the door to reusing the implementation on other robotic platforms that could compensate for the shortcomings observed with the Beam+ platform.
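    As a rough picture of how a behavior-based decision layer can arbitrate between voice tracking and navigation, here is a generic priority-based arbitration sketch. It conveys the flavor of such architectures only; the behaviors, priorities and perception keys are hypothetical and this is not HBBA's actual mechanism.

```python
# Minimal sketch of priority-based behavior arbitration: independent
# behaviors declare when they apply, and the arbiter runs the
# highest-priority applicable one each cycle. Illustrative only.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Behavior:
    name: str
    priority: int                              # higher wins
    applicable: Callable[[dict], bool]         # perception -> activate?
    action: Callable[[dict], str]              # perception -> command

behaviors = [
    Behavior("follow_voice", 2,
             lambda p: p.get("voice_azimuth") is not None,
             lambda p: f"rotate_to({p['voice_azimuth']:.1f})"),
    Behavior("navigate_to_goal", 1,
             lambda p: p.get("goal") is not None,
             lambda p: f"plan_path_to({p['goal']})"),
    Behavior("idle", 0, lambda p: True, lambda p: "hold_position()"),
]

def arbitrate(perception: dict) -> str:
    """Pick and execute the highest-priority applicable behavior."""
    active = [b for b in behaviors if b.applicable(perception)]
    return max(active, key=lambda b: b.priority).action(perception)

print(arbitrate({"voice_azimuth": 23.5, "goal": "kitchen"}))  # voice wins
print(arbitrate({"goal": "kitchen"}))                         # falls back to nav
```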