Visual Imitation Learning with Recurrent Siamese Networks
It would be desirable for a reinforcement learning (RL) based agent to learn
behaviour by merely watching a demonstration. However, defining rewards that
facilitate this goal within the RL paradigm remains a challenge. Here we
address this problem with Siamese networks, trained to compute distances
between observed behaviours and the agent's behaviours. Given a desired motion
such Siamese networks can be used to provide a reward signal to an RL agent via
the distance between the desired motion and the agent's motion. We experiment
with an RNN-based comparator model that can compute distances in space and time
between motion clips while training an RL policy to minimize this distance.
Through experimentation, we have also found that the inclusion of
multi-task data and an additional image encoding loss helps enforce
temporal consistency. These two components appear to balance reward for
matching a specific instance of behaviour versus that behaviour in general.
Furthermore, we focus here on a particularly challenging form of this problem
where only a single demonstration is provided for a given task -- the one-shot
learning setting. We demonstrate our approach on humanoid agents in both 2D
with degrees of freedom (DoF) and 3D with DoF.
Comment: PrePrin
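The abstract above describes turning a learned distance between the demonstration and the agent's motion into a reward. The following is a minimal sketch of that idea only, not the authors' implementation: the embeddings are plain lists of floats standing in for the output of a trained Siamese encoder, the reward is assumed to be the negative Euclidean distance, and the names `euclidean_distance`, `distance_reward`, and `sequence_reward` are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_reward(demo_embedding, agent_embedding):
    """Assumed reward shape: negative distance, so closer matches score higher."""
    return -euclidean_distance(demo_embedding, agent_embedding)


def sequence_reward(demo_embeddings, agent_embeddings):
    """Average per-step reward over two time-aligned embedding sequences."""
    pairs = list(zip(demo_embeddings, agent_embeddings))
    if not pairs:
        raise ValueError("sequences must be non-empty")
    return sum(distance_reward(d, a) for d, a in pairs) / len(pairs)
```

In this sketch an RL policy that maximizes `sequence_reward` is, equivalently, minimizing the average embedding distance to the demonstration. The recurrent comparator described in the abstract would replace the fixed per-step alignment assumed here.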
09341 Abstracts Collection -- Cognition, Control and Learning for Robot Manipulation in Human Environments
From 16.08. to 21.08.2009, the Dagstuhl Seminar 09341 "Cognition, Control and Learning for Robot Manipulation in Human Environments" was held
in Schloss Dagstuhl - Leibniz Center for Informatics.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.
Learning Dynamic Robot-to-Human Object Handover from Human Feedback
Object handover is a basic, but essential capability for robots interacting
with humans in many applications, e.g., caring for the elderly and assisting
workers in manufacturing workshops. It appears deceptively simple, as humans
perform object handover almost flawlessly. The success of humans, however,
belies the complexity of object handover as collaborative physical interaction
between two agents with limited communication. This paper presents a learning
algorithm for dynamic object handover, for example, when a robot hands over
water bottles to marathon runners passing by the water station. We formulate
the problem as contextual policy search, in which the robot learns object
handover by interacting with the human. A key challenge here is to learn the
latent reward of the handover task under noisy human feedback. Preliminary
experiments show that the robot learns to hand over a water bottle naturally
and that it adapts to the dynamics of human motion. One challenge for the
future is to combine the model-free learning algorithm with a model-based
planning approach and enable the robot to adapt to human preferences and
object characteristics, such as shape, weight, and surface texture.
Comment: Appears in the Proceedings of the International Symposium on Robotics
Research (ISRR) 201
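The handover abstract above frames the problem as contextual policy search: a policy maps a context (for example, how fast the runner approaches) to controller parameters and is improved from feedback on each trial. The sketch below illustrates that loop with a simple reward-weighted linear fit in one dimension. It is a swapped-in textbook technique, not the paper's algorithm, and the names `reward_weighted_fit`, `run_search`, `feedback`, and all default values are hypothetical.

```python
import math
import random


def reward_weighted_fit(contexts, params, rewards, beta=5.0):
    """Weighted least-squares fit of param ~ slope * context + intercept.

    Samples with higher reward receive exponentially larger weight.
    """
    max_r = max(rewards)  # shift rewards for numerical stability
    weights = [math.exp(beta * (r - max_r)) for r in rewards]
    w_sum = sum(weights)
    mean_c = sum(w * c for w, c in zip(weights, contexts)) / w_sum
    mean_p = sum(w * p for w, p in zip(weights, params)) / w_sum
    var_c = sum(w * (c - mean_c) ** 2 for w, c in zip(weights, contexts))
    cov = sum(
        w * (c - mean_c) * (p - mean_p)
        for w, c, p in zip(weights, contexts, params)
    )
    slope = cov / var_c if var_c > 1e-12 else 0.0
    intercept = mean_p - slope * mean_c
    return slope, intercept


def run_search(feedback, iterations=20, samples=30, sigma=0.5, seed=0):
    """Repeatedly sample parameters around the current policy and refit.

    `feedback(context, param)` is an assumed callable returning a scalar
    reward, standing in for (possibly noisy) human feedback.
    """
    rng = random.Random(seed)
    slope, intercept = 0.0, 0.0
    for _ in range(iterations):
        contexts = [rng.uniform(0.0, 1.0) for _ in range(samples)]
        params = [slope * c + intercept + rng.gauss(0.0, sigma) for c in contexts]
        rewards = [feedback(c, p) for c, p in zip(contexts, params)]
        slope, intercept = reward_weighted_fit(contexts, params, rewards)
    return slope, intercept
```

A call such as `run_search(lambda c, p: -(p - 2.0 * c) ** 2)` exercises the loop with a synthetic reward. Learning the latent reward from noisy human ratings, which the abstract names as the key challenge, is not modelled here.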
Towards a framework to make robots learn to dance
A key motive of human-robot interaction is to make robots and humans interact through different aspects of the real world. As robots become more realistic in appearance, the desire for them to exhibit complex behaviours has grown. One growing area of interest is robot dancing. Dance is an entertaining activity that can be enjoyed as either performer or spectator. Each dance contains fundamental features that make it up. Some researchers are curious to model such an activity for robots to perform in human social environments. In current research, most dancing robots are pre-programmed with dance motions, and few can generate their own dance or alter their movements according to human responses while dancing.
This thesis explores the question "Can a robot learn to dance?". A dancing framework is proposed to address this question. The Sarsa algorithm and the Softmax algorithm from traditional reinforcement learning form part of the dancing framework, enabling a virtual robot to learn and adapt to appropriate dance behaviours. The robot follows a progressive approach, utilising the knowledge obtained at each stage of its development to improve the dances that it generates.
The proposed framework addresses three stages of development of a robot's dance: learning ability, creative ability of dance motions, and adaptive ability to human preferences. Learning ability is the ability to make a robot gradually perform the desired dance behaviours. Creative ability is the idea of the robot generating its own dance motions and structuring them into a dance. Adaptive ability is where the robot changes its dance in response to human feedback. A number of experiments were conducted to explore these challenges; they verified that the quality of the robot dance can be improved through each stage of the robot's development.
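The thesis abstract above names Sarsa and Softmax as parts of its framework. As a hedged illustration of those two standard components only (the thesis's own state, action, and reward design is not given here), the sketch below shows softmax (Boltzmann) action selection and the tabular Sarsa update. The function names, the list-of-lists Q-table, and the default `alpha`, `gamma`, and `temperature` values are assumptions.

```python
import math
import random


def softmax_probs(q_values, temperature=1.0):
    """Boltzmann (softmax) action probabilities from a list of Q-values."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_action(q_values, temperature=1.0, rng=random):
    """Sample an action index in proportion to its softmax probability."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """On-policy TD update: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    td_target = reward + gamma * q[next_state][next_action]
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]
```

In a dance setting, one might treat poses as states and moves as actions, with the reward coming from a scoring rule or from human feedback. That mapping is an illustrative assumption and is not taken from the thesis.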
- …