From virtual demonstration to real-world manipulation using LSTM and MDN
Robots assisting the disabled or elderly must perform complex manipulation
tasks and must adapt to the home environment and preferences of their user.
Learning from demonstration is a promising approach that would allow a
non-technical user to teach the robot different tasks. However, collecting
demonstrations in the home environment of a disabled user is time consuming,
disruptive to the comfort of the user, and presents safety challenges. It would
be desirable to perform the demonstrations in a virtual environment. In this
paper we describe a solution to the challenging problem of behavior transfer
from virtual demonstration to a physical robot. The virtual demonstrations are
used to train a deep neural network based controller that uses a Long
Short-Term Memory (LSTM) recurrent neural network to generate trajectories. The
training process uses a Mixture Density Network (MDN) to calculate an error
signal suitable for the multimodal nature of demonstrations. The controller
learned in the virtual environment is transferred to a physical robot (a
Rethink Robotics Baxter). An off-the-shelf vision component is used to
substitute for geometric knowledge available in the simulation and an inverse
kinematics module is used to allow the Baxter to enact the trajectory. Our
experimental studies validate the three contributions of the paper: (1) the
controller learned from virtual demonstrations can be used to successfully
perform the manipulation tasks on a physical robot, (2) the LSTM+MDN
architectural choice outperforms other choices, such as the use of feedforward
networks and mean-squared-error-based training signals, and (3) including
imperfect demonstrations in the training set allows the controller to learn to
correct its manipulation mistakes.
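The MDN error signal mentioned above can be sketched as a mixture negative log-likelihood; the following NumPy computation is illustrative only (the isotropic-Gaussian parameterization and function names are assumptions, not the paper's implementation):

```python
import numpy as np

def mdn_nll(pi, mu, sigma, y):
    """Negative log-likelihood of target y under a Gaussian mixture.

    pi:    (K,) mixture weights, summing to 1
    mu:    (K, D) component means
    sigma: (K,) isotropic standard deviations
    y:     (D,) observed demonstration point
    """
    K, D = mu.shape
    # log-density of each Gaussian component at y
    log_norm = -0.5 * D * np.log(2 * np.pi * sigma**2)
    sq = np.sum((y - mu) ** 2, axis=1) / (2 * sigma**2)
    log_comp = np.log(pi) + log_norm - sq
    # log-sum-exp over components for numerical stability
    m = log_comp.max()
    return -(m + np.log(np.exp(log_comp - m).sum()))
```

Unlike a mean-squared-error loss, which pulls predictions toward the average of all demonstrated trajectories, this loss stays low as long as the target lies near any one mixture component, which is what makes it suitable for multimodal demonstrations.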
DoorGym: A Scalable Door Opening Environment And Baseline Agent
To be practical, a door-opening policy must be robust to a wide distribution
of door types and environment settings.
Reinforcement Learning (RL) with Domain Randomization (DR) is a promising
technique to enforce policy generalization; however, there are only a few
accessible training environments that are inherently designed to train agents
in domain randomized environments. We introduce DoorGym, an open-source door
opening simulation framework designed to utilize domain randomization to train
a stable policy. We intend for our environment to lie at the intersection of
domain transfer, practical tasks, and realism. We also provide baseline
Proximal Policy Optimization and Soft Actor-Critic implementations, which
achieve success rates ranging from 0% to 95% for opening various types of doors
in this environment. Moreover, the real-world transfer experiment shows the
trained policy is able to work in the real world. Environment kit available
here: https://github.com/PSVL/DoorGym/
Comment: Full version (real-world transfer experiment results)
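The core idea of domain randomization is to resample the door's physical parameters on every episode reset. The toy wrapper below illustrates that pattern; the class name, parameter names, and ranges are hypothetical stand-ins, not DoorGym's actual API:

```python
import random

class RandomizedDoorEnv:
    """Toy stand-in for a domain-randomized door environment
    (hypothetical interface, not DoorGym's)."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.params = None

    def reset(self):
        # Sample a fresh door configuration each episode so the policy
        # cannot overfit to a single door instance.
        self.params = {
            "handle_type": self.rng.choice(["lever", "knob", "pull"]),
            "hinge_friction": self.rng.uniform(0.05, 1.0),
            "door_mass_kg": self.rng.uniform(2.0, 15.0),
        }
        return self.params
```

Training a policy across many such sampled configurations is what pushes it toward the robustness the abstract calls for.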
Deep Visual Foresight for Planning Robot Motion
A key challenge in scaling up robot learning to many skills and environments
is removing the need for human supervision, so that robots can collect their
own data and improve their own performance without being limited by the cost of
requesting human feedback. Model-based reinforcement learning holds the promise
of enabling an agent to learn to predict the effects of its actions, which
could provide flexible predictive models for a wide range of tasks and
environments, without detailed human supervision. We develop a method for
combining deep action-conditioned video prediction models with model-predictive
control that uses entirely unlabeled training data. Our approach does not
require a calibrated camera, an instrumented training set-up, nor precise
sensing and actuation. Our results show that our method enables a real robot to
perform nonprehensile manipulation -- pushing objects -- and can handle novel
objects not seen during training.
Comment: ICRA 2017. Supplementary video:
https://sites.google.com/site/robotforesight
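The planning loop described above can be sketched as random-shooting model-predictive control: sample candidate action sequences, roll each through the learned predictive model, and execute the best. The function and the toy dynamics below are illustrative assumptions, not the paper's video-prediction model:

```python
import numpy as np

def plan_actions(predict, cost, state, horizon=5, n_samples=64, rng=None):
    """Random-shooting MPC: sample action sequences, roll out the learned
    predictive model, and return the lowest-cost sequence."""
    rng = rng or np.random.default_rng(0)
    best_seq, best_cost = None, np.inf
    for _ in range(n_samples):
        seq = rng.uniform(-1, 1, size=(horizon, 2))  # 2-D pushing actions
        s, total = state, 0.0
        for a in seq:
            s = predict(s, a)  # action-conditioned prediction
            total += cost(s)
        if total < best_cost:
            best_seq, best_cost = seq, total
    return best_seq  # execute the first action, then replan

# Toy stand-in dynamics: the state moves by the action; goal at the origin.
plan = plan_actions(lambda s, a: s + a,
                    lambda s: float(np.linalg.norm(s)),
                    state=np.array([3.0, -2.0]))
```

In practice the first action of the chosen sequence is executed and the planner is re-run at the next step, which is what lets the method cope with imperfect predictions.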
Robust Door Operation with the Toyota Human Support Robot. Robotic perception, manipulation and learning
Robots are progressively spreading to urban, social and assistive domains. Service robots operating in domestic environments typically face a variety of objects they have to deal with to fulfill their tasks. Some of these objects are articulated, such as cabinet doors and drawers. The ability to deal with such objects is relevant, for example to navigate between rooms or to assist humans in their mobility. The exploration of this task raises interesting questions in some of the main robotic threads such as perception, manipulation and learning. In this work, a general framework to robustly operate different types of doors with a mobile manipulator robot is proposed. To push the state of the art, a novel algorithm is proposed that fuses a Convolutional Neural Network with point cloud processing to estimate the end-effector grasping pose in real time for multiple handles simultaneously from single RGB-D images. In addition, a Bayesian framework endows the robot with the ability to learn the kinematic model of the door from observations of its motion, as well as from previous experiences or human demonstrations. This probabilistic approach is combined with state-of-the-art motion planning.
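A greatly simplified version of grasp-pose estimation from a handle point cloud can be sketched with a centroid plus principal-axis fit; this is a geometric stand-in for the paper's CNN + point-cloud pipeline, and the function name is an assumption:

```python
import numpy as np

def handle_grasp_pose(points):
    """Estimate a grasp from a handle point cloud: position = centroid,
    handle axis = principal direction of the points (simplified stand-in
    for a learned pipeline). points: (N, 3) array."""
    centroid = points.mean(axis=0)
    # Principal axis: eigenvector of the covariance matrix with the
    # largest eigenvalue (np.linalg.eigh sorts eigenvalues ascending).
    cov = np.cov((points - centroid).T)
    _, eigvecs = np.linalg.eigh(cov)
    axis = eigvecs[:, -1]
    return centroid, axis
```

Orienting the gripper perpendicular to this axis at the centroid gives a plausible lever-handle grasp; the full system in the paper additionally detects multiple handles per image and runs in real time.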
Multi-modal Skill Memories for Online Learning of Interactive Robot Movement Generation
Queißer J. Multi-modal Skill Memories for Online Learning of Interactive Robot Movement Generation. Bielefeld: Universität Bielefeld; 2018.
Modern robotic applications pose complex requirements with respect to the adaptation of
actions regarding the variability in a given task. Reinforcement learning can optimize for
changing conditions, but relearning from scratch is hardly feasible due to the high number of
required rollouts. This work proposes a parameterized skill that generalizes to new actions
for changing task parameters. The actions are encoded by a meta-learner that provides
parameters for task-specific dynamic motion primitives. Experimental evaluation shows that
the utilization of parameterized skills for initialization of the optimization process leads to a
more effective incremental task learning. A proposed hybrid optimization method combines
a fast coarse optimization on a manifold of policy parameters with a fine-grained parameter
search in the unrestricted space of actions. It is shown that the developed algorithm reduces
the number of required rollouts for adaptation to new task conditions. Further, this work
presents a transfer learning approach for adaptation of learned skills to new situations.
Applications in illustrative toy scenarios, on a 10-DOF planar arm, in a humanoid
point-reaching task, and in parameterized drumming on a pneumatic robot validate the approach.
But parameterized skills that are applied on complex robotic systems pose further
challenges: the dynamics of the robot and the interaction with the environment introduce
model inaccuracies. In particular, high-level skill acquisition on highly compliant robotic
systems such as pneumatically driven or soft actuators is hardly feasible. Since learning of
the complete dynamics model is not feasible due to the high complexity, this thesis examines
two alternative approaches: First, an improvement of the low-level control based on an
equilibrium model of the robot. Utilization of an equilibrium model reduces the learning
complexity and this thesis evaluates its applicability for control of pneumatic and industrial
light-weight robots. Second, an extension of parameterized skills to generalize for forward
signals of action primitives that result in an enhanced control quality of complex robotic
systems. This thesis argues for a shift in the complexity of learning the full dynamics of the
robot to a lower dimensional task-related learning problem. Due to the generalization in
relation to the task variability, online learning for complex robots as well as complex scenarios
becomes feasible. An experimental evaluation investigates the generalization capabilities of
the proposed online learning system for robot motion generation. Evaluation is performed
through simulation of a compliant 2-DOF arm and scalability to a complex robotic system
is demonstrated for a pneumatically driven humanoid robot with 8 DOF.
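The parameterized-skill idea above, a meta-learner that maps a task parameter to the parameters of a motion primitive and warm-starts further optimization, can be sketched with a linear least-squares learner. The linear model and function names are illustrative assumptions, not the thesis' actual learner:

```python
import numpy as np

def fit_parameterized_skill(task_params, primitive_weights):
    """Least-squares map from task parameters (e.g. target positions)
    to motion-primitive weights."""
    X = np.hstack([task_params, np.ones((len(task_params), 1))])  # bias column
    W, *_ = np.linalg.lstsq(X, primitive_weights, rcond=None)
    return W

def generalize(W, task_param):
    """Predict primitive weights for an unseen task parameter; these
    serve as the warm start for subsequent policy optimization."""
    x = np.append(task_param, 1.0)
    return x @ W
```

Initializing reinforcement learning from `generalize(...)` rather than from scratch is what reduces the number of rollouts needed when the task parameter changes.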