2,260 research outputs found

A Review of Verbal and Non-Verbal Human-Robot Interactive Communication

Author: Mavridis Nikolaos
Publication date: 20/01/2014

In this paper, an overview of human-robot interactive communication is presented, covering verbal as well as non-verbal aspects of human-robot interaction. Following a historical introduction and a motivation for fluid human-robot communication, ten desiderata are proposed, which provide an organizational axis for both recent and future research on human-robot communication. The ten desiderata are then examined in detail, culminating in a unifying discussion and a forward-looking conclusion.

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Integration of a voice recognition system in a social robot

Author: Alonso Martín Fernando
Salichs Sánchez-Caballero Miguel
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2011

Human-Robot Interaction (HRI) is one of the main fields in the study and research of robotics. Within this field, dialog systems and voice interaction play a very important role. When speaking about natural human-robot dialog, we assume that the robot can accurately recognize the utterance that the human wants to transmit verbally, and even its semantic meaning, but this is not always achieved. In this paper we describe the steps and requirements that we went through in order to endow the personal social robot Maggie, developed at the University Carlos III of Madrid, with the capability of understanding natural language spoken by any human. We have analyzed the different possibilities offered by current software/hardware alternatives by testing them in real environments. We have obtained accurate data on speech recognition capabilities in different environments, using the most modern audio acquisition systems and analyzing less typical parameters such as user age, sex, intonation, volume and language. Finally, we propose a new model to classify recognition results as accepted or rejected, based on a second ASR opinion. This new approach takes into account the pre-calculated success rate in noise intervals for each recognition framework, decreasing the false positive and false negative rates. The funds were provided by the Spanish Government through the project "Peer to Peer Robot-Human Interaction" (R2H), of MEC (Ministry of Science and Education), and the project "A new approach to social robotics" (AROS), of MICINN (Ministry of Science and Innovation). The research leading to these results has received funding from the RoboCity2030-II-CM project (S2009/DPI-1559), funded by Programas de Actividades I+D en la Comunidad de Madrid and cofunded by Structural Funds of the EU.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Computational Audiovisual Scene Analysis

Author: Yan Rujiao
Publication venue: Universitätsbibliothek Bielefeld
Publication date: 01/01/2014

Yan R. Computational Audiovisual Scene Analysis. Bielefeld: Universitätsbibliothek Bielefeld; 2014. In most real-world situations, a robot interacts with multiple people. In this case, understanding the dialogs is essential. However, dialog scene analysis is missing from most existing human-robot interaction systems. In such systems, only one speaker can talk with the robot, or each speaker wears an attached microphone or a headset. The goal of Computational AudioVisual Scene Analysis (CAVSA) is therefore to make dialogs between humans and robots more natural and flexible. The CAVSA system is able to learn how many speakers are in the scenario, where the speakers are and who is currently speaking. CAVSA is a challenging task due to the complexity of dialog scenarios. First, speakers are unknown in advance, so a database for training high-level features beforehand to recognize faces or voices is not available. Second, people can dynamically enter and leave the scene, may move all the time and may even change their locations outside the camera field of view. Third, the robot cannot see all the people at the same time because of its limited camera field of view and head movements. Moreover, a sound could be related to a person who stands outside the camera field of view and has never been seen. I will show that the CAVSA system is able to assign words to the corresponding speakers. A speaker is recognized again after leaving and re-entering the scene, or after changing position, even when a new person appears.

Publications at Bielefeld University

Architecture de contrôle d'un robot de téléprésence et d'assistance aux soins à domicile

Author: Laniel Sébastien
Publication venue: 'Universite de Sherbrooke'
Publication date: 01/01/2019

The aging population is driving up the cost of hospital care. To keep these costs from becoming too high, telepresence robots that assist with care and daily activities could be used to help older adults remain autonomous at home. Current robots each offer interesting features individually, but it would be beneficial to bring their capabilities together. Such integration is possible with a decision-making architecture that combines navigation, voice tracking and information acquisition capabilities in order to assist the remote operator, or even stand in for the operator. In this project, the HBBA (Hybrid Behavior-Based Architecture) control architecture serves as the backbone for unifying the required libraries, RTAB-Map (Real-Time Appearance-Based Mapping) and ODAS (Open embeddeD Audition System). RTAB-Map is a library for simultaneous localization and mapping that supports various sensor configurations while meeting online processing constraints. ODAS is a library for localizing, tracking and separating sound sources in real environments. The objectives are to evaluate these capabilities in real environments by deploying the robotic platform in different homes, and to evaluate the potential of such an integration by carrying out an autonomous scenario that assists with taking vital-sign measurements. The Beam+ robotic platform is used for this integration. The platform is augmented with an RGB-D camera, an eight-microphone array, a computer and additional batteries. The resulting implementation, named SAM, was evaluated in 10 homes to characterize navigation and conversation tracking.
The navigation results suggest that the navigation capabilities work within certain constraints tied to sensor placement and environmental conditions, so operator intervention is needed to compensate. The voice-tracking modality works well in quiet environments, but improvements are required in noisy ones. Incidentally, achieving a fully autonomous assistance scenario depends on the combined performance of these features, which makes it difficult to envision removing the operator from the decision loop entirely. Integrating the modalities with HBBA proves feasible and conclusive, and opens the door to reusing the implementation on other robotic platforms that could compensate for the shortcomings observed in the Beam+ implementation.

Savoirs UdeS

A system for recognizing human emotions based on speech analysis and facial feature extraction: applications to Human-Robot Interaction

Author: Rabiei Mohammad
Publication venue: Udine
Publication date: 08/04/2015

With advances in Artificial Intelligence, humanoid robots are starting to interact with ordinary people, based on a growing understanding of psychological processes. Accumulating evidence in Human-Robot Interaction (HRI) suggests that research is focusing on establishing emotional communication between human and robot in order to create social perception, cognition, desired interaction and sensation. Furthermore, robots need to perceive human emotion and optimize their behavior in order to help and interact with human beings in various environments. The most natural way to recognize basic emotions is to extract sets of features from human speech, facial expression and body gesture. A system that recognizes emotions based on speech analysis and facial feature extraction can have interesting applications in Human-Robot Interaction. Thus, the Human-Robot Interaction ontology explains how the knowledge of these fundamental sciences is applied in physics (sound analysis), mathematics (face detection and perception), philosophy theory (behavior) and the robotic science context. In this project, we carry out a study to recognize basic emotions (sadness, surprise, happiness, anger, fear and disgust). We also propose a methodology and a software program for classifying emotions based on speech analysis and facial feature extraction. The speech analysis phase investigated the appropriateness of using acoustic (pitch value, pitch peak, pitch range, intensity and formant) and phonetic (speech rate) properties of emotive speech with the freeware program PRAAT, and consists of generating and analyzing a graph of speech signals. The proposed architecture investigated the appropriateness of analyzing emotive speech with minimal use of signal processing algorithms.
The 30 participants in the experiment repeated five sentences in English (with durations typically between 0.40 s and 2.5 s) in order to extract data on pitch (value, range and peak) and rising-falling intonation. Pitch alignments (peak, value and range) were evaluated and the results were compared with intensity and speech rate. The facial feature extraction phase uses a mathematical formulation (Bézier curves) and geometric analysis of the facial image, based on measurements of a set of Action Units (AUs), to classify the emotion. The proposed technique consists of three steps: (i) detecting the facial region within the image, (ii) extracting and classifying the facial features, (iii) recognizing the emotion. The new data were then merged with reference data in order to recognize the basic emotion. Finally, we combined the two proposed algorithms (speech analysis and facial expression) in order to design a hybrid technique for emotion recognition. This technique has been implemented in a software program that can be employed in Human-Robot Interaction. The efficiency of the methodology was evaluated by experimental tests on 30 individuals (15 female and 15 male, 20 to 48 years old) from different ethnic groups, namely: (i) ten European adults, (ii) ten Asian (Middle East) adults and (iii) ten American adults. Ultimately, the proposed technique made it possible to recognize the basic emotion in most cases.
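The abstract does not say how the Bézier curves are fitted to facial features, so the following is only a minimal sketch of how such a curve can be evaluated once control points are known. It uses de Casteljau's algorithm; the function name `bezier_point` and the four control points (imagined here as a lip contour) are hypothetical and do not come from the thesis.

```python
def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] (de Casteljau).

    control_points: sequence of (x, y) tuples; their number sets the degree.
    """
    points = [tuple(p) for p in control_points]
    # Repeatedly interpolate between neighbouring points until one remains.
    while len(points) > 1:
        points = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


# Hypothetical control points for a cubic curve (illustration only).
lip_contour = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
midpoint = bezier_point(lip_contour, 0.5)
```

Sampling `t` over a grid from 0 to 1 yields a smooth contour whose shape could then be compared against reference curves, which is one plausible reading of the geometric analysis the abstract describes.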

Archivio istituzionale della ricerca - Università degli Studi di Udine

FPGA-based architectures for acoustic beamforming with microphone arrays : trends, challenges and research opportunities

Author: Braeken An
da Silva Gomes Bruno
Touhafi Abdellah
Publication venue: 'MDPI AG'
Publication date: 01/01/2018

Over the past decades, many systems composed of arrays of microphones have been developed to satisfy the quality demanded by acoustic applications. Such microphone arrays are sound acquisition systems composed of multiple microphones used to sample the sound field with spatial diversity. The relatively recent adoption of Field-Programmable Gate Arrays (FPGAs) to manage the audio data samples and to perform signal processing operations such as filtering or beamforming has led to customizable architectures able to satisfy the most demanding computational, power or performance requirements of acoustic applications. The presented work provides an overview of current FPGA-based architectures and of how FPGAs are exploited for different acoustic applications. Current trends in the use of this technology, pending challenges and open research opportunities in the use of FPGAs for acoustic applications using microphone arrays are presented and discussed.
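The abstract names beamforming without saying which variant the surveyed architectures use. As an illustration only, here is a minimal software sketch of delay-and-sum beamforming, the simplest common form, for a linear array. The function names, the whole-sample rounding, the far-field assumption and the sign convention (microphones with larger x receive the wavefront later for positive angles) are all assumptions made for this sketch and are not taken from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_delays(mic_positions, angle_deg, sample_rate):
    """Per-microphone arrival delays, in whole samples, for a far-field source.

    mic_positions: x coordinates (metres) of a linear array.
    angle_deg: arrival angle from broadside (0 means straight ahead).
    Delays are returned relative to the earliest microphone.
    """
    sin_theta = math.sin(math.radians(angle_deg))
    delays = [x * sin_theta / SPEED_OF_SOUND * sample_rate for x in mic_positions]
    earliest = min(delays)
    return [round(d - earliest) for d in delays]


def delay_and_sum(channels, delays):
    """Advance each channel by its delay, then average across microphones."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


# Toy check: three microphones hear the same pulse 0, 1 and 2 samples late.
source = [0.0, 1.0, 0.0, -1.0, 0.5, 0.25, 0.0, 0.0]
delays = [0, 1, 2]
channels = [[0.0] * d + source for d in delays]
aligned = delay_and_sum(channels, delays)
```

Signals arriving from the steered direction add coherently after alignment, while sound from other directions is averaged down. An FPGA implementation would typically replace the Python lists with streaming delay lines and fixed-point adders.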

Multidisciplinary Digital Publishing Institute

Ghent University Academic Bibliography

Directory of Open Access Journals

MARIE, une architecture d'intégration de composants logiciels hétérogènes pour le développement de systèmes décisionnels en robotique mobile et autonome

Author: Côté Carle
Publication venue: 'Universite de Sherbrooke'
Publication date: 01/01/2011

Today, building decision-making systems for autonomous mobile robots requires integrating many motor, sensory and cognitive capabilities within each project. These capabilities generally come from different research fields, such as autonomous navigation, planning, human-machine interaction, localization, computer vision and actuator control, to name only a few. From a software standpoint, this need for integration raises two major challenges: 1) the growing complexity of the requirements analysis needed to choose, build and interconnect the software components that implement these capabilities, and 2) the limited interconnectivity of the software components available in the robotics community, which stems from the fact that they are typically heterogeneous, that is, not fully compatible or interoperable. This thesis proposes a solution aimed mainly at the limited-interconnectivity challenge, based on a software integration architecture called MARIE, which integrates heterogeneous software components using a rapid-prototyping approach to developing decision-making systems for autonomous mobile robots. With this approach, complete decision-making systems could be built earlier in the development cycle, which would in turn support the analysis of the requirements for integrating each of the system's software components. The results show that, through the development of the MARIE software integration architecture, more than 15 software components from independent sources were integrated into several robotic applications (real and simulated) to implement their respective decision-making systems.
Adapting components that already exist in the robotics community made it possible, in particular, to avoid the often arduous task of rewriting the code for each component in a single development environment. The results also show that, using a software evaluation methodology called ARID, we were able to gather useful and relevant information about the risks of using MARIE to build a chosen application, without having to build a test application and without relying on complete documentation of either the software architecture or the application to be created. This method thus joins the list of tools that facilitate the analysis of the integration requirements involved in building decision-making systems for autonomous mobile robots.

Savoirs UdeS