A Particle Swarm Optimization inspired tracker applied to visual tracking
Visual tracking is a dynamic optimization problem in which time and object state simultaneously influence the problem. In this paper, we build a tracker from an evolutionary optimization approach, the Particle Swarm Optimization (PSO) algorithm. We demonstrate that an extension of the original algorithm, in which system dynamics is explicitly taken into account, can perform efficient tracking. This tracker is shown to outperform the SIR (Sampling Importance Resampling) algorithm with random-walk and constant-velocity models, as well as a previous PSO-inspired tracker, SPSO (Sequential Particle Swarm Optimization). Experiments were performed both on simulated data and on real visual RGB-D data. Our PSO-inspired tracker is an effective and robust alternative for visual tracking.
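The idea behind a PSO-based tracker can be illustrated with a minimal sketch, assuming a 1-D state, a toy fitness function (a real tracker would score appearance similarity against the current frame), and illustrative parameter values. The per-frame velocity re-diversification stands in, loosely, for the kind of dynamics-aware extension the abstract describes; all names and constants here are assumptions, not the paper's method.

```python
import random

def pso_step(positions, velocities, pbest, gbest, fitness,
             w=0.7, c1=1.5, c2=1.5):
    """One canonical PSO velocity/position update (illustrative constants)."""
    for i in range(len(positions)):
        r1, r2 = random.random(), random.random()
        velocities[i] = (w * velocities[i]
                         + c1 * r1 * (pbest[i] - positions[i])
                         + c2 * r2 * (gbest - positions[i]))
        positions[i] += velocities[i]
        if fitness(positions[i]) > fitness(pbest[i]):
            pbest[i] = positions[i]
    return max(pbest, key=fitness)  # new global best

def track(targets, n=30, iters=20):
    random.seed(0)
    positions = [random.uniform(-10, 10) for _ in range(n)]
    pbest = positions[:]
    estimates = []
    for target in targets:
        fitness = lambda x, t=target: -abs(x - t)  # proxy appearance score
        # Re-diversify velocities at each new frame so a converged swarm
        # can still follow the moving target (a stand-in for the dynamics
        # extension; not the paper's exact mechanism).
        velocities = [random.uniform(-1.0, 1.0) for _ in range(n)]
        gbest = max(pbest, key=fitness)
        for _ in range(iters):
            gbest = pso_step(positions, velocities, pbest, gbest, fitness)
        estimates.append(gbest)
    return estimates

trajectory = [0.0, 1.0, 2.0, 3.0]
estimates = track(trajectory)
errors = [abs(e - t) for e, t in zip(estimates, trajectory)]
```

Without the re-diversification step, the swarm collapses onto the first frame's optimum with zero velocity and cannot follow the target — which is precisely why a plain PSO needs a dynamic extension for tracking.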
Automatic intelligibility measures applied to speech signals simulating age-related hearing loss
This research work forms the first part of a long-term project designed to provide a framework for facilitating hearing-aid tuning. The present study focuses on setting up automatic measures of speech intelligibility for the recognition of isolated words and sentences. Both materials were degraded in order to simulate the effects of presbycusis on speech perception. Automatic measures based on an Automatic Speech Recognition (ASR) system were applied to an audio corpus simulating the effects of presbycusis at nine severity stages. The results were compared to reference intelligibility scores collected from 60 French listeners. Since the aim of this system is to produce measures as close as possible to human behaviour, good performance was achieved: strong correlations between subjective and objective scores are observed.
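The evaluation described above boils down to correlating objective ASR scores with subjective listener scores across severity stages. A minimal sketch, using made-up scores for the nine hypothetical stages (the real corpus values are not given in the abstract):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical percent-correct scores per presbycusis severity stage.
human = [98, 95, 90, 82, 70, 55, 40, 25, 12]  # subjective listener scores
asr   = [96, 93, 87, 78, 66, 50, 34, 20, 10]  # objective ASR scores
r = pearson(human, asr)
```

A correlation near 1 is what "measures as close as possible to human behaviour" means operationally here.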
A Multi-modal Perception based Architecture for a Non-intrusive Domestic Assistant Robot
We present a multi-modal perception based architecture for a non-intrusive domestic assistant robot. The robot is non-intrusive in that it only starts interaction with a user when it automatically detects the user's intention to interact. All the robot's actions are based on multi-modal perception, including user detection from RGB-D data, detection of the user's intention-for-interaction from RGB-D and audio data, and communication via speech recognition. The use of multi-modal cues in different parts of the robotic activity paves the way to successful robotic runs.
A study of human-robot interaction with two groups of elderly people
We used a PR2 robot, operating autonomously in a living-lab setting, to provide an object-search service to elderly volunteers (either familiar with robots or naive users). Observation was complemented by semi-directed interviews. There was no significant difference between the groups in either the successful detection of the willingness to interact or the appreciation of voice interaction. This result supports the development of HCI dedicated to the elderly.
Perceiving user's intention-for-interaction: A probabilistic multimodal data fusion scheme
Understanding people's intentions, whether in action or thought, plays a fundamental role in establishing coherent communication among people, especially in non-proactive robotics, where the robot has to understand explicitly when to start an interaction in a natural way. In this work, a novel approach to detecting people's intention-for-interaction is presented. The proposed detector fuses multimodal cues, including estimated head pose, shoulder orientation, and vocal activity detection, using a probabilistic discrete-state Hidden Markov Model. The multimodal detector achieves up to 80% correct detection, improving on purely audio-based and RGB-D-based variants.
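The fusion scheme can be sketched as a two-state HMM forward filter over binary cues. All transition probabilities and per-cue likelihoods below are illustrative assumptions, not the paper's trained values:

```python
# Two hidden states: user does / does not intend to interact.
TRANS = {  # P(state_t | state_{t-1}) -- assumed values
    "no":  {"no": 0.9, "yes": 0.1},
    "yes": {"no": 0.2, "yes": 0.8},
}
LIK = {  # P(cue fires | state), one entry per modality -- assumed values
    "head_toward_robot": {"no": 0.2, "yes": 0.9},
    "shoulders_facing":  {"no": 0.3, "yes": 0.8},
    "voice_active":      {"no": 0.1, "yes": 0.7},
}

def forward_step(belief, cues):
    """One HMM forward-filter step given a dict of binary cue observations."""
    # Predict: propagate the belief through the transition model.
    pred = {s: sum(belief[p] * TRANS[p][s] for p in belief) for s in belief}
    # Update: weight by the likelihood of each observed cue.
    for s in pred:
        for cue, fired in cues.items():
            p = LIK[cue][s]
            pred[s] *= p if fired else (1.0 - p)
    z = sum(pred.values())
    return {s: v / z for s, v in pred.items()}

belief = {"no": 0.9, "yes": 0.1}
for _ in range(3):  # three consecutive frames with all cues firing
    belief = forward_step(belief, {"head_toward_robot": True,
                                   "shoulders_facing": True,
                                   "voice_active": True})
```

The filter's temporal smoothing is the point of using an HMM rather than thresholding each frame independently: a single spurious cue barely moves the belief, while consistent evidence drives it quickly toward the "yes" state.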
Reward-Based Environment States for Robot Manipulation Policy Learning
Training robot manipulation policies is a challenging and open problem in robotics and artificial intelligence. In this paper we propose a novel and compact state representation based on the rewards predicted by an image-based task-success classifier. Our experiments, using the Pepper robot in simulation with two deep reinforcement learning algorithms on a grab-and-lift task, show that our proposed state representation can achieve up to 97% task success using our best policies.
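The core idea — replacing the raw image state with the success classifier's predicted rewards — can be sketched as follows. The classifier here is a trivial stub standing in for a trained image model, and the choice of stacking the last k predictions is an assumption for illustration:

```python
def success_classifier(image):
    """Stub for a trained image-based task-success classifier.
    Here it just maps mean pixel intensity to [0, 1]."""
    return min(1.0, sum(image) / (255.0 * len(image)))

def reward_based_state(image_sequence, k=3):
    """Compact RL state: the last k predicted rewards, not raw pixels.
    Zero-padded on the left when fewer than k frames are available."""
    preds = [success_classifier(img) for img in image_sequence[-k:]]
    return [0.0] * (k - len(preds)) + preds

# Three toy "frames" (flat pixel lists) with rising intensity.
frames = [[10, 20, 30], [60, 80, 100], [200, 220, 240]]
state = reward_based_state(frames)
```

The appeal of such a representation is its dimensionality: a policy network sees k scalars instead of a full image, which can make deep RL far cheaper to train.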
Blip10000: a social video dataset containing SPUG content for tagging and retrieval
The increasing amount of digital multimedia content available is inspiring potential new types of user interaction with video data. Users want to easily find content by searching and browsing. For this reason, techniques are needed that allow automatic categorisation, searching of the content, and linking to related information.
In this work, we present a dataset that contains comprehensive semi-professional user-generated (SPUG) content, including audiovisual content, user-contributed metadata, automatic speech recognition transcripts, automatic shot boundary files, and social information for multiple 'social levels'. We describe the principal characteristics of this dataset and present results that have been achieved on different tasks.
Which objects do frail elderly people mislay at home? A pilot study of 60 people
Losing objects is a cause of conflict between frail elderly people and their caregivers. To our knowledge, the literature addressing delusions of theft does not provide information on the objects involved. In the RIDDLE project, we are using a companion robot to help the elderly find the objects they are looking for. We therefore initiated a study based on cross-interviews of 60 patient/caregiver dyads to identify which objects would be most relevant to them. Objects are looked for by the patient according to 72% of the patients and 82% of the caregivers. The most commonly looked-for objects, when in use by the patient, are: spectacles (45%), house keys (34%), mobile phone (31%), wallet (26%), remote control (19%), and cane (22%). After fitting the localization technology to the aforementioned objects, the related service will have to be customized to the habits of each user.
A multi-modal perception based assistive robotic system for the elderly
Edited by Giovanni Maria Farinella, Takeo Kanade, Marco Leo, Gerard G. Medioni, Mohan Trivedi. In this paper, we present a multi-modal perception based framework to realize a non-intrusive domestic assistive robotic system. It is non-intrusive in that it only starts interaction with a user when it detects the user's intention to interact. All the robot's actions are based on multi-modal perception, including user detection from RGB-D data, detection of the user's intention-for-interaction from RGB-D and audio data, and communication via user-distance-mediated speech recognition. The use of multi-modal cues in different parts of the robotic activity paves the way to successful robotic runs (94% success rate). Each presented perceptual component is systematically evaluated using appropriate datasets and evaluation metrics. Finally, the complete system is fully integrated on the PR2 robotic platform and validated through system sanity-check runs and user studies with 17 volunteer elderly participants.
Comparison of perceptual and automatic intelligibility measures: application to speech simulating presbycusis
This article presents a comparative study between perceptual and automatic measures of speech intelligibility on speech degraded by a simulation of presbycusis. The goal is to answer the question: can an automatic speech recognition system come close to a human perceptual measure? To this end, a corpus of degraded speech was specifically built, used in perceptual tests, and then submitted to automatic processing. Strong correlations between human performance and automatic recognition scores are observed.