36 research outputs found

    One-Shot Learning of Manipulation Skills with Online Dynamics Adaptation and Neural Network Priors

    One of the key challenges in applying reinforcement learning to complex robotic control tasks is the need to gather large amounts of experience in order to find an effective policy for the task at hand. Model-based reinforcement learning can achieve good sample efficiency, but requires the ability to learn a model of the dynamics that is good enough to learn an effective policy. In this work, we develop a model-based reinforcement learning algorithm that combines prior knowledge from previous tasks with online adaptation of the dynamics model. These two ingredients enable highly sample-efficient learning even in regimes where estimating the true dynamics is very difficult, since the online model adaptation allows the method to locally compensate for unmodeled variation in the dynamics. We encode the prior experience into a neural network dynamics model, adapt it online by progressively refitting a local linear model of the dynamics, and use model predictive control to plan under these dynamics. Our experimental results show that this approach can be used to solve a variety of complex robotic manipulation tasks in just a single attempt, using prior data from other manipulation behaviors.
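As a rough illustration of the "prior plus online refitting" idea in the abstract above, the sketch below fits a local linear dynamics model to recent transitions while shrinking it toward a prior model. This is a simplified ridge-regression stand-in, not the authors' algorithm: the function name `refit_local_linear`, the `prior_W` matrix (a hypothetical linearisation supplied by some prior model), and `prior_strength` are all assumptions made for the example.

```python
import numpy as np

def refit_local_linear(X, Y, prior_W, prior_strength=1.0):
    """Fit Y ~= X @ W, shrinking W toward prior_W (assumed prior model).

    X: (n, d_in) recent inputs, e.g. [state, action, 1] rows.
    Y: (n, d_out) observed next states.
    prior_W: (d_in, d_out) linear model suggested by the prior.
    prior_strength: how many "pseudo-observations" the prior is worth.
    """
    d_in = X.shape[1]
    # Regularised normal equations: data term plus pull toward the prior.
    A = X.T @ X + prior_strength * np.eye(d_in)
    B = X.T @ Y + prior_strength * prior_W
    return np.linalg.solve(A, B)
```

With no data the fit returns the prior unchanged; as transitions accumulate, the data term dominates and the model tracks the locally observed dynamics.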

    Multiform Adaptive Robot Skill Learning from Humans

    Object manipulation is a basic element in everyday human lives. Robotic manipulation has progressed from maneuvering single-rigid-body objects with firm grasping to maneuvering soft objects and handling contact-rich actions. Meanwhile, technologies such as robot learning from demonstration have enabled humans to intuitively train robots. This paper discusses a new level of robotic learning-based manipulation. In contrast to the single form of learning from demonstration, we propose a multiform learning approach that integrates additional forms of skill acquisition, including adaptive learning from definition and evaluation. Moreover, going beyond state-of-the-art technologies of handling purely rigid or soft objects in a pseudo-static manner, our work allows robots to learn to handle partly rigid partly soft objects with time-critical skills and sophisticated contact control. Such capability of robotic manipulation offers a variety of new possibilities in human-robot interaction.
    Comment: Accepted to 2017 Dynamic Systems and Control Conference (DSCC), Tysons Corner, VA, October 11-1

    Robot Composite Learning and the Nunchaku Flipping Challenge

    Advanced motor skills are essential for robots to physically coexist with humans. Much research on robot dynamics and control has achieved success on hyper robot motor capabilities, but mostly through heavily case-specific engineering. Meanwhile, in terms of robots acquiring skills in a ubiquitous manner, robot learning from human demonstration (LfD) has achieved great progress, but still has limitations in handling dynamic skills and compound actions. In this paper, we present a composite learning scheme which goes beyond LfD and integrates robot learning from human definition, demonstration, and evaluation. The method tackles advanced motor skills that require dynamic time-critical maneuvers, complex contact control, and handling of partly soft partly rigid objects. We also introduce the "nunchaku flipping challenge", an extreme test that places hard requirements on all three of these aspects. Continuing from our previous presentations, this paper introduces the latest update of the composite learning scheme and the physical success of the nunchaku flipping challenge.

    Deep Haptic Model Predictive Control for Robot-Assisted Dressing

    Robot-assisted dressing offers an opportunity to benefit the lives of many people with disabilities, such as some older adults. However, robots currently lack common sense about the physical implications of their actions on people. The physical implications of dressing are complicated by non-rigid garments, which can result in a robot indirectly applying high forces to a person's body. We present a deep recurrent model that, when given a proposed action by the robot, predicts the forces a garment will apply to a person's body. We also show that a robot can provide better dressing assistance by using this model with model predictive control. The predictions made by our model only use haptic and kinematic observations from the robot's end effector, which are readily attainable. Collecting training data from real-world physical human-robot interaction can be time-consuming, costly, and put people at risk. Instead, we train our predictive model using data collected in an entirely self-supervised fashion from a physics-based simulation. We evaluated our approach with a PR2 robot that attempted to pull a hospital gown onto the arms of 10 human participants. With a 0.2s prediction horizon, our controller succeeded at high rates and lowered applied force while navigating the garment around a person's fist and elbow without getting caught. Shorter prediction horizons resulted in significantly reduced performance with the sleeve catching on the participants' fists and elbows, demonstrating the value of our model's predictions. These behaviors of mitigating catches emerged from our deep predictive model and the controller objective function, which primarily penalizes high forces.
    Comment: 8 pages, 12 figures, 1 table, 2018 IEEE International Conference on Robotics and Automation (ICRA
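The control loop described in the abstract above (predict forces for each proposed action, then prefer actions with low predicted force) can be sketched as a one-step selection over candidate actions. This is only a minimal sketch under stated assumptions: `choose_action`, the `predict_forces` callable (standing in for a learned force predictor), and the squared-force cost are hypothetical and are not taken from the paper.

```python
import numpy as np

def choose_action(candidates, predict_forces, force_weight=1.0):
    """Return the candidate action with the lowest predicted force cost.

    candidates: list of candidate actions (any objects the predictor accepts).
    predict_forces: assumed callable mapping an action to predicted force
        magnitudes over the prediction horizon.
    force_weight: scale of the (assumed) squared-force penalty.
    """
    costs = []
    for action in candidates:
        forces = np.asarray(predict_forces(action), dtype=float)
        # Penalise high predicted forces over the whole horizon.
        costs.append(force_weight * float(np.sum(forces ** 2)))
    best = int(np.argmin(costs))
    return candidates[best], costs[best]
```

In a receding-horizon setting this selection would be repeated at every control step with fresh observations, executing only the chosen action each time.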

    Probabilistically Safe Policy Transfer

    Although learning-based methods have great potential for robotics, one concern is that a robot that updates its parameters might cause large amounts of damage before it learns the optimal policy. We formalize the idea of safe learning in a probabilistic sense by defining an optimization problem: we desire to maximize the expected return while keeping the expected damage below a given safety limit. We study this optimization for the case of a robot manipulator with safety-based torque limits. We would like to ensure that the damage constraint is maintained at every step of the optimization and not just at convergence. To achieve this aim, we introduce a novel method which predicts how modifying the torque limit, as well as how updating the policy parameters, might affect the robot's safety. We show through a number of experiments that our approach allows the robot to improve its performance while ensuring that the expected damage constraint is not violated during the learning process.
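The optimization problem stated in words in the abstract above can be written compactly as follows. The symbols are assumptions introduced here for illustration, not the paper's notation: \theta denotes the policy parameters, R the return, D the damage, d_{\max} the safety limit, and \theta_k the parameters at optimization step k.

```latex
\max_{\theta} \; \mathbb{E}\big[ R(\theta) \big]
\quad \text{subject to} \quad
\mathbb{E}\big[ D(\theta) \big] \le d_{\max}
```

The requirement that the constraint hold throughout learning, not just at convergence, corresponds to asking that \mathbb{E}[D(\theta_k)] \le d_{\max} for every iterate \theta_k.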