One-Shot Learning of Manipulation Skills with Online Dynamics Adaptation and Neural Network Priors
One of the key challenges in applying reinforcement learning to complex
robotic control tasks is the need to gather large amounts of experience in
order to find an effective policy for the task at hand. Model-based
reinforcement learning can achieve good sample efficiency, but requires the
ability to learn a model of the dynamics that is good enough to learn an
effective policy. In this work, we develop a model-based reinforcement learning
algorithm that combines prior knowledge from previous tasks with online
adaptation of the dynamics model. These two ingredients enable highly
sample-efficient learning even in regimes where estimating the true dynamics is
very difficult, since the online model adaptation allows the method to locally
compensate for unmodeled variation in the dynamics. We encode the prior
experience into a neural network dynamics model, adapt it online by
progressively refitting a local linear model of the dynamics, and use model
predictive control to plan under these dynamics. Our experimental results show
that this approach can be used to solve a variety of complex robotic
manipulation tasks in just a single attempt, using prior data from other
manipulation behaviors.
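The loop described in this abstract (a prior model, online refitting of a local linear model, and planning under the adapted model) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes scalar dynamics x' = a*x + b*u, uses a fixed linear guess in place of the neural network prior, and uses a one-step lookahead in place of full model predictive control. All names and constants (`fit_local_linear`, `plan_action`, `run_episode`, the regularization strength, the window size) are hypothetical.

```python
from collections import deque


def fit_local_linear(transitions, prior, strength=0.1):
    """Fit x_next ~ a*x + b*u by least squares, regularized toward the prior (a0, b0).

    Assumption: the prior is a fixed pair of coefficients, standing in for the
    prediction of a learned prior model.
    """
    a0, b0 = prior
    sxx = sxu = suu = sxy = suy = 0.0
    for x, u, x_next in transitions:
        sxx += x * x
        sxu += x * u
        suu += u * u
        sxy += x * x_next
        suy += u * x_next
    # Regularized normal equations: (S + strength*I) theta = r + strength*theta0
    m00, m01, m11 = sxx + strength, sxu, suu + strength
    r0, r1 = sxy + strength * a0, suy + strength * b0
    det = m00 * m11 - m01 * m01  # positive whenever strength > 0
    a = (r0 * m11 - m01 * r1) / det
    b = (m00 * r1 - m01 * r0) / det
    return a, b


def plan_action(x, model, target, u_max=1.0):
    """One-step lookahead: pick the clipped action whose predicted next state hits the target."""
    a, b = model
    if abs(b) < 1e-6:
        return 0.0
    u = (target - a * x) / b
    return max(-u_max, min(u_max, u))


def run_episode(steps=30, target=1.0):
    """Run a single attempt on toy dynamics that differ from the prior."""
    true_a, true_b = 0.9, 0.5   # hypothetical true dynamics, unknown to the controller
    prior = (1.0, 0.3)          # hypothetical prior from "previous tasks"
    recent = deque(maxlen=10)   # only recent transitions, so the fit stays local
    model, x = prior, 0.0
    for _ in range(steps):
        u = plan_action(x, model, target)
        x_next = true_a * x + true_b * u
        recent.append((x, u, x_next))
        model = fit_local_linear(recent, prior)  # refit after every step
        x = x_next
    return x, model


final_x, adapted_model = run_episode()
```

Regularizing toward the prior keeps the fit well-posed when the recent data excite only one direction of the state-action space, which is the usual situation near a steady state.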
Multiform Adaptive Robot Skill Learning from Humans
Object manipulation is a basic element in everyday human lives. Robotic
manipulation has progressed from maneuvering single-rigid-body objects with
firm grasping to maneuvering soft objects and handling contact-rich actions.
Meanwhile, technologies such as robot learning from demonstration have enabled
humans to intuitively train robots. This paper discusses a new level of robotic
learning-based manipulation. In contrast to the single form of learning from
demonstration, we propose a multiform learning approach that integrates
additional forms of skill acquisition, including adaptive learning from
definition and evaluation. Moreover, going beyond state-of-the-art technologies
of handling purely rigid or soft objects in a pseudo-static manner, our work
allows robots to learn to handle partly rigid, partly soft objects with
time-critical skills and sophisticated contact control. Such capability of
robotic manipulation offers a variety of new possibilities in human-robot
interaction.
Comment: Accepted to 2017 Dynamic Systems and Control Conference (DSCC),
Tysons Corner, VA, October 11-1
Robot Composite Learning and the Nunchaku Flipping Challenge
Advanced motor skills are essential for robots to physically coexist with
humans. Much research on robot dynamics and control has achieved success on
hyper robot motor capabilities, but mostly through heavily case-specific
engineering. Meanwhile, in terms of robots acquiring skills in a ubiquitous
manner, robot learning from human demonstration (LfD) has achieved great
progress, but still has limitations handling dynamic skills and compound
actions. In this paper, we present a composite learning scheme which goes
beyond LfD and integrates robot learning from human definition, demonstration,
and evaluation. The method tackles advanced motor skills that require dynamic
time-critical maneuvers, complex contact control, and handling partly soft,
partly rigid objects. We also introduce the "nunchaku flipping challenge", an
extreme test that places hard requirements on all three of these aspects. Continuing
from our previous presentations, this paper introduces the latest update of the
composite learning scheme and the physical success of the nunchaku flipping
challenge.
Deep Haptic Model Predictive Control for Robot-Assisted Dressing
Robot-assisted dressing offers an opportunity to benefit the lives of many
people with disabilities, such as some older adults. However, robots currently
lack common sense about the physical implications of their actions on people.
The physical implications of dressing are complicated by non-rigid garments,
which can result in a robot indirectly applying high forces to a person's body.
We present a deep recurrent model that, when given a proposed action by the
robot, predicts the forces a garment will apply to a person's body. We also
show that a robot can provide better dressing assistance by using this model
with model predictive control. The predictions made by our model only use
haptic and kinematic observations from the robot's end effector, which are
readily attainable. Collecting training data from real world physical
human-robot interaction can be time consuming, costly, and put people at risk.
Instead, we train our predictive model using data collected in an entirely
self-supervised fashion from a physics-based simulation. We evaluated our
approach with a PR2 robot that attempted to pull a hospital gown onto the arms
of 10 human participants. With a 0.2s prediction horizon, our controller
succeeded at high rates and lowered applied force while navigating the garment
around a person's fist and elbow without getting caught. Shorter prediction
horizons resulted in significantly reduced performance with the sleeve catching
on the participants' fists and elbows, demonstrating the value of our model's
predictions. These behaviors of mitigating catches emerged from our deep
predictive model and the controller objective function, which primarily
penalizes high forces.
Comment: 8 pages, 12 figures, 1 table, 2018 IEEE International Conference on
Robotics and Automation (ICRA)
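As a rough illustration of the control idea in this abstract, the sketch below scores candidate actions by the forces a predictive model expects over a short horizon and picks the cheapest one. It is a minimal sketch under stated assumptions, not the paper's controller. `toy_predict_forces` is a hand-written stand-in for the deep recurrent model, and the 2-D action type, `toy_progress` term, and `force_weight` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Action = Tuple[float, float]  # hypothetical 2-D end-effector velocity command


def choose_action(
    candidates: Sequence[Action],
    predict_forces: Callable[[Action], List[float]],
    progress: Callable[[Action], float],
    force_weight: float = 10.0,
) -> Action:
    """Return the candidate with the lowest cost: force penalty minus task progress."""
    def cost(action: Action) -> float:
        forces = predict_forces(action)  # predicted force magnitudes over the horizon
        return force_weight * sum(f * f for f in forces) - progress(action)
    return min(candidates, key=cost)


def toy_predict_forces(action: Action) -> List[float]:
    """Stand-in predictor (assumption): pulling straight along +x snags the garment,
    and the predicted force grows at each step of a 4-step horizon."""
    vx, vy = action
    snag = max(0.0, vx - abs(vy))
    return [snag * (k + 1) for k in range(4)]


def toy_progress(action: Action) -> float:
    """Stand-in progress measure (assumption): motion along +x moves the sleeve up the arm."""
    return action[0]


candidates = [(1.0, 0.0), (0.7, 0.7), (0.0, 1.0)]
best = choose_action(candidates, toy_predict_forces, toy_progress)
```

Because the force penalty dominates the cost, the straight pull is rejected even though it makes the most progress, which mirrors the abstract's point that an objective mainly penalizing high forces steers the controller away from catches.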
Probabilistically Safe Policy Transfer
Although learning-based methods have great potential for robotics, one
concern is that a robot that updates its parameters might cause large amounts
of damage before it learns the optimal policy. We formalize the idea of safe
learning in a probabilistic sense by defining an optimization problem: we
desire to maximize the expected return while keeping the expected damage below
a given safety limit. We study this optimization for the case of a robot
manipulator with safety-based torque limits. We would like to ensure that the
damage constraint is maintained at every step of the optimization and not just
at convergence. To achieve this aim, we introduce a novel method which predicts
how modifying the torque limit, as well as how updating the policy parameters,
might affect the robot's safety. We show through a number of experiments that
our approach allows the robot to improve its performance while ensuring that
the expected damage constraint is not violated during the learning process.
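The optimization problem stated in this abstract can be written out as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: $\theta$ denotes the policy parameters, $R$ the return, $D$ the damage, and $d_{\max}$ the safety limit.

```latex
\max_{\theta} \; \mathbb{E}\left[ R(\theta) \right]
\quad \text{subject to} \quad
\mathbb{E}\left[ D(\theta) \right] \le d_{\max}
```

The requirement that the constraint hold throughout learning, and not only at convergence, means it should be satisfied by every iterate $\theta_k$ that the optimizer produces:

```latex
\mathbb{E}\left[ D(\theta_k) \right] \le d_{\max}
\quad \text{for all } k = 0, 1, 2, \dots
```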