1,736 research outputs found

    Visual Imitation Learning with Recurrent Siamese Networks

    Full text link
    It would be desirable for a reinforcement learning (RL) based agent to learn behaviour by merely watching a demonstration. However, defining rewards that facilitate this goal within the RL paradigm remains a challenge. Here we address this problem with Siamese networks, trained to compute distances between observed behaviours and the agent's behaviours. Given a desired motion, such Siamese networks can be used to provide a reward signal to an RL agent via the distance between the desired motion and the agent's motion. We experiment with an RNN-based comparator model that can compute distances in space and time between motion clips while training an RL policy to minimize this distance. Through experimentation, we have also found that the inclusion of multi-task data and an additional image encoding loss helps enforce temporal consistency. These two components appear to balance reward for matching a specific instance of behaviour versus that behaviour in general. Furthermore, we focus here on a particularly challenging form of this problem where only a single demonstration is provided for a given task -- the one-shot learning setting. We demonstrate our approach on humanoid agents in both 2D with 10 degrees of freedom (DoF) and 3D with 38 DoF. Comment: Preprint
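    As a rough illustration of the comparator idea, the sketch below builds a recurrent Siamese encoder that embeds two motion clips with shared weights and returns the distance between their embeddings, which can be negated to serve as an RL reward. This is a minimal sketch assuming PyTorch; the layer sizes, module names, and plain L2 distance are illustrative choices, not details taken from the paper.

        # Minimal sketch: recurrent Siamese comparator as an RL reward signal.
        # Architecture details are assumptions, not the paper's exact model.
        import torch
        import torch.nn as nn

        class RecurrentSiamese(nn.Module):
            def __init__(self, obs_dim=64, embed_dim=128):
                super().__init__()
                # Shared per-frame encoder (an image CNN in the visual setting).
                self.encoder = nn.Sequential(
                    nn.Linear(obs_dim, 256), nn.ReLU(),
                    nn.Linear(256, embed_dim),
                )
                # The RNN summarizes a clip over time, so distances can
                # reflect differences in both space and time.
                self.rnn = nn.GRU(embed_dim, embed_dim, batch_first=True)

            def embed(self, clip):                   # clip: (batch, time, obs_dim)
                feats = self.encoder(clip)
                _, h = self.rnn(feats)               # final hidden state per clip
                return h.squeeze(0)

            def distance(self, clip_a, clip_b):
                za, zb = self.embed(clip_a), self.embed(clip_b)
                return torch.norm(za - zb, dim=-1)

        model = RecurrentSiamese()
        demo = torch.randn(1, 50, 64)                # the single demonstration clip
        rollout = torch.randn(1, 50, 64)             # the agent's observed motion
        reward = -model.distance(demo, rollout)      # reward: match the demonstration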

    09341 Abstracts Collection -- Cognition, Control and Learning for Robot Manipulation in Human Environments

    Get PDF
    From 16.08. to 21.08.2009, the Dagstuhl Seminar 09341 "Cognition, Control and Learning for Robot Manipulation in Human Environments" was held in Schloss Dagstuhl -- Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar, as well as abstracts of seminar results and ideas, are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    Learning Dynamic Robot-to-Human Object Handover from Human Feedback

    Full text link
    Object handover is a basic, but essential capability for robots interacting with humans in many applications, e.g., caring for the elderly and assisting workers in manufacturing workshops. It appears deceptively simple, as humans perform object handover almost flawlessly. The success of humans, however, belies the complexity of object handover as a collaborative physical interaction between two agents with limited communication. This paper presents a learning algorithm for dynamic object handover, for example, when a robot hands over water bottles to marathon runners passing by the water station. We formulate the problem as contextual policy search, in which the robot learns object handover by interacting with the human. A key challenge here is to learn the latent reward of the handover task under noisy human feedback. Preliminary experiments show that the robot learns to hand over a water bottle naturally and that it adapts to the dynamics of human motion. One challenge for the future is to combine the model-free learning algorithm with a model-based planning approach and enable the robot to adapt to human preferences and object characteristics, such as shape, weight, and surface texture. Comment: Appears in the Proceedings of the International Symposium on Robotics Research (ISRR) 201
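    To make the formulation concrete, here is a hedged sketch of contextual policy search: the robot samples controller parameters from a context-conditioned Gaussian, observes a noisy scalar from human feedback, and refits the policy mean by reward-weighted regression. The REPS-style exponential weighting and every name below (including human_feedback) are illustrative assumptions; the paper's exact algorithm and latent-reward model may differ.

        # Sketch of contextual policy search under noisy human feedback.
        # human_feedback() is a hypothetical stand-in for the latent reward.
        import numpy as np

        rng = np.random.default_rng(0)
        ctx_dim, theta_dim = 2, 4               # context: e.g. runner speed, height
        W = np.zeros((theta_dim, ctx_dim))      # context-to-parameter policy mean
        Sigma = 0.5 * np.eye(theta_dim)         # exploration covariance

        def human_feedback(context, theta):
            # Noisy rating of one handover attempt; the true reward is latent.
            target = np.array([1.0, -0.5, 0.3, 0.0]) + 0.2 * context.sum()
            return -np.sum((theta - target) ** 2) + rng.normal(0.0, 0.5)

        for it in range(100):
            contexts = rng.normal(size=(20, ctx_dim))
            noise = rng.multivariate_normal(np.zeros(theta_dim), Sigma, size=20)
            thetas = contexts @ W.T + noise     # sample handover controllers
            R = np.array([human_feedback(c, t) for c, t in zip(contexts, thetas)])
            w = np.exp((R - R.max()) / 1.0)     # soft, REPS-style reward weights
            A = contexts * w[:, None]           # weighted least-squares refit
            W = np.linalg.solve(contexts.T @ A + 1e-6 * np.eye(ctx_dim),
                                A.T @ thetas).T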

    Towards a framework to make robots learn to dance

    Get PDF
    A key motive of human-robot interaction is to make robots and humans interact through different aspects of the real world. As robots become more and more realistic in appearance, so has the desire grown for them to exhibit complex behaviours. A growing area of interest in terms of complex behaviour is robot dancing. Dance is an entertaining activity that is enjoyed either as the performer or as the spectator. Each dance contains fundamental features that make up a dance, and some researchers are curious to model such an activity for robots to perform in human social environments. From current research, most dancing robots are pre-programmed with dance motions, and few have the ability to generate their own dance or alter their movements according to human responses while dancing. This thesis explores the question "Can a robot learn to dance?". A dancing framework is proposed to address this question. The Sarsa algorithm and the Softmax algorithm from traditional reinforcement learning form part of the dancing framework to enable a virtual robot to learn and adapt appropriate dance behaviours. The robot follows a progressive approach, utilising the knowledge obtained at each stage of its development to improve the dances that it generates. The proposed framework addresses three stages in the development of a robot's dance: learning ability, creative ability for dance motions, and adaptive ability to human preferences. Learning ability is the ability to make a robot gradually perform the desired dance behaviours. Creative ability is the idea of the robot generating its own dance motions and structuring them into a dance. Adaptive ability is where the robot changes its dance in response to human feedback. A number of experiments have been conducted to explore these challenges, and they verified that the quality of the robot's dance can be improved through each stage of the robot's development.
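    Since the framework's learning stage rests on the Sarsa and Softmax algorithms named above, a compact tabular sketch of that combination follows. The state and action spaces, reward, and all names are hypothetical stand-ins for dance poses and motion choices, not the thesis's actual setup.

        # Minimal tabular Sarsa with softmax (Boltzmann) action selection.
        # States/actions are placeholders for dance poses and motion choices.
        import numpy as np

        rng = np.random.default_rng(1)
        n_states, n_actions = 10, 4
        Q = np.zeros((n_states, n_actions))
        alpha, gamma, tau = 0.1, 0.9, 0.5       # step size, discount, temperature

        def softmax_action(s):
            prefs = Q[s] / tau
            p = np.exp(prefs - prefs.max())     # numerically stable softmax
            return rng.choice(n_actions, p=p / p.sum())

        def step(s, a):
            # Hypothetical environment: in the framework, reward would come
            # from human feedback on the generated dance motion.
            s_next = (s + a + 1) % n_states
            return s_next, 1.0 if s_next == 0 else 0.0

        for episode in range(200):
            s = rng.integers(n_states)
            a = softmax_action(s)
            for t in range(50):
                s_next, r = step(s, a)
                a_next = softmax_action(s_next) # on-policy: Sarsa's next action
                Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])
                s, a = s_next, a_next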