359 research outputs found

    CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments

    In this paper we study a new reinforcement learning setting where the environment is non-rewarding, contains several possibly related objects of various controllability, and where an apt agent Bob acts independently, with non-observable intentions. We argue that this setting defines a realistic scenario and we present a generic discrete-state discrete-action model of such environments. To learn in this environment, we propose an unsupervised reinforcement learning agent called CLIC, for Curriculum Learning and Imitation for Control. CLIC learns to control individual objects in its environment, and imitates Bob's interactions with these objects. It selects objects to focus on when training and imitating by maximizing its learning progress. We show that CLIC is an effective baseline in our new setting. It can effectively observe Bob to gain control of objects faster, even if Bob is not explicitly teaching. It can also follow Bob when he acts as a mentor and provides ordered demonstrations. Finally, when Bob controls objects that the agent cannot, or in the presence of a hierarchy between objects in the environment, we show that CLIC ignores non-reproducible and already mastered interactions with objects, resulting in a greater benefit from imitation.

    CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning

    In open-ended environments, autonomous learning agents must set their own goals and build their own curriculum through intrinsically motivated exploration. They may consider a large diversity of goals, aiming to discover what is controllable in their environments, and what is not. Because some goals might prove easy and some impossible, agents must actively select which goal to practice at any moment, to maximize their overall mastery of the set of learnable goals. This paper proposes CURIOUS, an algorithm that leverages 1) a modular Universal Value Function Approximator with hindsight learning to achieve a diversity of goals of different kinds within a unique policy and 2) an automated curriculum learning mechanism that biases the attention of the agent towards goals maximizing the absolute learning progress. Agents focus sequentially on goals of increasing complexity, and focus back on goals that are being forgotten. Experiments conducted in a new modular-goal robotic environment show the resulting developmental self-organization of a learning curriculum, and demonstrate properties of robustness to distracting goals, forgetting and changes in body properties. Comment: Accepted at ICML 201
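
    The curriculum mechanism described in the CURIOUS abstract above (biasing attention towards goals with the highest absolute learning progress) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `LearningProgressSelector`, the sliding-window competence estimate, the `window` size and the `epsilon` uniform-exploration mix are all assumptions made for illustration.

```python
import random
from collections import deque


class LearningProgressSelector:
    """Hypothetical goal-module selector driven by absolute learning progress."""

    def __init__(self, n_modules, window=20, epsilon=0.2, seed=0):
        # One sliding window of recent outcomes (1.0 = success) per module.
        self.histories = [deque(maxlen=window) for _ in range(n_modules)]
        self.epsilon = epsilon  # assumed share of uniform exploration
        self.rng = random.Random(seed)

    def record(self, module, success):
        """Store the outcome of one attempt on the given module."""
        self.histories[module].append(1.0 if success else 0.0)

    def absolute_lp(self, module):
        """|recent success rate - older success rate| within the window."""
        h = list(self.histories[module])
        if len(h) < 2:
            return 0.0
        half = len(h) // 2
        older, recent = h[:half], h[half:]
        return abs(sum(recent) / len(recent) - sum(older) / len(older))

    def probabilities(self):
        """Mix uniform exploration with LP-proportional selection."""
        lps = [self.absolute_lp(m) for m in range(len(self.histories))]
        n = len(lps)
        total = sum(lps)
        if total == 0.0:
            # No measurable progress anywhere: fall back to uniform sampling.
            return [1.0 / n] * n
        return [self.epsilon / n + (1.0 - self.epsilon) * lp / total for lp in lps]

    def sample(self):
        """Draw the index of the module to practice next."""
        probs = self.probabilities()
        return self.rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

    Because progress is taken in absolute value, a module whose success rate is dropping also attracts attention, which is one way to read the abstract's "focus back on goals that are being forgotten". Mastered and impossible modules both show near-zero progress and are sampled only through the uniform term.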

    Robust continuous prediction of human emotions using multiscale dynamic cues

    Designing systems able to interact with humans in a natural manner is a complex and far from solved problem. A key aspect of natural interaction is the ability to understand and appropriately respond to human emotions. This paper details our response to the Audio/Visual Emotion Challenge (AVEC’12) whose goal is to continuously predict four affective signals describing human emotions (namely valence, arousal, expectancy and power). The proposed method uses log-magnitude Fourier spectra to extract multiscale dynamic descriptions of signals characterizing global and local face appearance as well as head movements and voice. We perform a kernel regression with very few representative samples selected via a supervised weighted-distance-based clustering, that leads to a high generalization power. For selecting features, we introduce a new correlation-based measure that takes into account a possible delay between the labels and the data and significantly increases robustness. We also propose a particularly fast regressor-level fusion framework to merge systems based on different modalities. Experiments have proven the efficiency of each key point of the proposed method and we obtain very promising results.
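
    The abstract above mentions a correlation-based feature-selection measure that tolerates a delay between labels and data, without giving its definition. The Python sketch below shows one simple way such a measure could work, assuming the label lags the feature by an unknown number of frames: it scores a feature by its best absolute Pearson correlation over a range of candidate delays. The function names `pearson` and `delayed_correlation` and the `max_delay` parameter are hypothetical, and this is not claimed to be the authors' measure.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def delayed_correlation(feature, label, max_delay):
    """Return (best |correlation|, delay), assuming label[t + d] tracks feature[t]."""
    if len(feature) != len(label):
        raise ValueError("feature and label must have the same length")
    best_score, best_delay = 0.0, 0
    for d in range(max_delay + 1):
        # Align feature frame t with label frame t + d.
        x = feature[: len(feature) - d]
        y = label[d:]
        if len(x) < 2:
            break
        score = abs(pearson(x, y))
        if score > best_score:
            best_score, best_delay = score, d
    return best_score, best_delay
```

    Features could then be ranked by the returned score. A plain zero-delay correlation would undervalue a feature whose effect on the annotation appears only a few frames later, which is the situation the abstract describes.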

    Impact of a "Butler" Robot on the psychoaffective and cognitive state of older adults with cognitive impairment

    Older adults with cognitive impairment need services, in particular assistance in the form of cognitive training and support for social contact, which information and communication technologies can provide. The teams (Broca, Valoria, ISIR, Robosoft) of the TECSAN Robadom project (funded by the ANR) are developing and testing, with older adults who have mild cognitive impairment, a robot endowed with emotions and language, adapted to these people's difficulties, controlled by them, and which could help support them at home by providing various services such as material assistance, information relay, and psychological and cognitive support. At the start of the project, the robot's validation scenarios and the technical specifications of the interaction were produced. The second phase comprised the design of the robot and the development of user-centered multimodal perception and of the emotional and cognitive interaction model. The third phase consists of the clinical evaluations. This task makes it possible to study the acceptability, usability and impact of the robot in daily life (affective, cognitive, quality of life...) and the way the robot is perceived (companion, machine, intruder) by users, as well as the ethical questions raised by the project, through a cross-cutting approach that takes into account both normative dimensions (laws, rights, etc.) and proactive ones (users' opinions).

    SLOT-V: Supervised Learning of Observer Models for Legible Robot Motion Planning in Manipulation

    We present SLOT-V, a novel supervised learning framework that learns observer models (human preferences) from robot motion trajectories in a legibility context. Legibility measures how easily a (human) observer can infer the robot's goal from a robot motion trajectory. When generating such trajectories, existing planners often rely on an observer model that estimates the quality of trajectory candidates. These observer models are frequently hand-crafted or, occasionally, learned from demonstrations. Here, we propose to learn them in a supervised manner using the same data format that is frequently used during the evaluation of the aforementioned approaches. We then demonstrate the generality of SLOT-V using a Franka Emika in a simulated manipulation environment. For this, we show that it can learn to closely predict various hand-crafted observer models, i.e., that SLOT-V's hypothesis space encompasses existing hand-crafted models. Next, we showcase SLOT-V's ability to generalize by showing that a trained model continues to perform well in environments with unseen goal configurations and/or goal counts. Finally, we benchmark SLOT-V's sample efficiency (and performance) against an existing IRL approach and show that SLOT-V learns better observer models with less data. Combined, these results suggest that SLOT-V can learn viable observer models. Better observer models imply more legible trajectories, which may, in turn, lead to better and more transparent human-robot interaction.

    Legibot: Generating Legible Motions for Service Robots Using Cost-Based Local Planners

    With the growing presence of social robots in various environments and applications, there is an increasing need for these robots to exhibit socially compliant behaviors. Legible motion, characterized by the ability of a robot to convey its intentions and goals clearly and quickly, through its motion, to the individuals in its vicinity, holds significant importance in this context. It will improve the overall user experience and acceptance of robots in human environments. In this paper, we introduce a novel approach to incorporate legibility into local motion planning for mobile robots. This can enable robots to generate legible motions in real-time and dynamic environments. To demonstrate the effectiveness of our proposed methodology, we also provide a robotic stack designed for deploying legibility-aware motion planning in a social robot, by integrating perception and localization components.

    Enhancing Agent Communication and Learning through Action and Language

    We introduce a novel category of GC-agents capable of functioning as both teachers and learners. Leveraging action-based demonstrations and language-based instructions, these agents enhance communication efficiency. We investigate the incorporation of pedagogy and pragmatism, essential elements in human communication and goal achievement, enhancing the agents' teaching and learning capabilities. Furthermore, we explore the impact of combining communication modes (action and language) on learning outcomes, highlighting the benefits of a multi-modal approach. Comment: IMOL workshop, Paris 202

    The Robadom project: designing an assistive robot for older adults

    National audience. Context: The ROBADOM project aims to design a "butler robot" capable of providing verbal and non-verbal interactions and feedback to help older adults with mild cognitive impairment in daily life. Objective: The ROBADOM project addresses the following themes: 1. The social context for robot design: 1) defining the robot's appearance and 2) studying older adults' perceptions of and attitudes toward an assistive robot; 2. Developing the robot's behaviors to create a "natural" interaction: 1) technical solutions for an expressive robot, 2) verbal and non-verbal communication between older adults and the robot; 3. Studying the acceptability of the robot among older adults; 4. Studying the robot's impact on older users. Method: The four studies combined a qualitative method and an experimental method, and were carried out in our "LUSAGE" laboratory. Results and conclusion: Small robots with stylized features were appreciated by participants. Regarding functions, cognitive stimulation, task reminders and object location were rated positively. Although participants considered the robot useful, they were not yet ready to adopt it. In addition, they perceived some of the robot's expressions differently from younger people. The robotic system will therefore need to be adapted to the specific characteristics of older adults. Finally, our participants raised the question of the added value of a robotic system compared with a computer. Many aspects (technological, human-robot interaction, sociological...) therefore remain to be explored before evaluating the impact of the assistive robot at home.