53 research outputs found

Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation

Author: Abbeel Pieter
Agrawal Pulkit
Chen Dian
Isola Phillip
Levine Sergey
Malik Jitendra
Nair Ashvin
Publication date: 06/03/2017

Manipulation of deformable objects, such as ropes and cloth, is an important but challenging problem in robotics. We present a learning-based system where a robot takes as input a sequence of images of a human manipulating a rope from an initial to goal configuration, and outputs a sequence of actions that can reproduce the human demonstration, using only monocular images as input. To perform this task, the robot learns a pixel-level inverse dynamics model of rope manipulation directly from images in a self-supervised manner, using about 60K interactions with the rope collected autonomously by the robot. The human demonstration provides a high-level plan of what to do and the low-level inverse model is used to execute the plan. We show that by combining the high and low-level plans, the robot can successfully manipulate a rope into a variety of target shapes using only a sequence of human-provided images for direction.

Comment: 8 pages, accepted to International Conference on Robotics and Automation (ICRA) 201

arXiv.org e-Print Archive

Crossref

Model Learning for Look-ahead Exploration in Continuous Control

Author: Agarwal Arpit
Fragkiadaki Katerina
Muelling Katharina
Publication date: 20/11/2018

We propose an exploration method that incorporates look-ahead search over basic learnt skills and their dynamics, and use it for reinforcement learning (RL) of manipulation policies. Our skills are multi-goal policies learned in isolation in simpler environments using existing multi-goal RL formulations, analogous to options or macroactions. Coarse skill dynamics, i.e., the state transition caused by a (complete) skill execution, are learnt and are unrolled forward during look-ahead search. Policy search benefits from temporal abstraction during exploration, though it itself operates over low-level primitive actions, and thus the resulting policies do not suffer from suboptimality and inflexibility caused by coarse skill chaining. We show that the proposed exploration strategy results in effective learning of complex manipulation policies faster than current state-of-the-art RL methods, and converges to better policies than methods that use options or parameterized skills as building blocks of the policy itself, as opposed to guiding exploration.

Comment: This is a pre-print of our paper which is accepted in AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Model-based Manipulation of Deformable Linear Objects by Multivariate Dynamic Splines

Author: Palli G.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

In this paper, the modelling and simulation of Deformable Linear Object (DLO) manipulation are reported. The main motivation of this study is to define a strategy that enables a robotic manipulator to predict in real time the shape a DLO will assume during the execution of a manipulation action. To accomplish this within a time compatible with adopting the solution in an industrial manufacturing system, an approximate but physically consistent model of the DLO is adopted, considering the predominant plasticity of the object to be manipulated, as in the case of electric cable manipulation. The DLO manipulation model is based on multivariate dynamic splines solved iteratively in real time to interpolate the DLO shape during the manipulation sequence. The system is assumed to be able to detect the initial configuration of the DLO at each iteration of the algorithm by means of a proper vision system. Preliminary simulation results are presented to show the effectiveness of the method.

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Interactive Imitation Learning in State-Space

Author: Celemin Carlos
Jauhri Snehal
Kober Jens
Publication date: 01/01/2020

Imitation Learning techniques enable programming the behavior of agents through demonstrations rather than manual engineering. However, they are limited by the quality of available demonstration data. Interactive Imitation Learning techniques can improve the efficacy of learning since they involve teachers providing feedback while the agent executes its task. In this work, we propose a novel Interactive Learning technique that uses human feedback in state-space to train and improve agent behavior (as opposed to alternative methods that use feedback in action-space). Our method, titled Teaching Imitative Policies in State-space (TIPS), enables providing guidance to the agent in terms of 'changing its state', which is often more intuitive for a human demonstrator. Through continuous improvement via corrective feedback, agents trained by non-expert demonstrators using TIPS outperformed the demonstrator and conventional Imitation Learning agents.

Comment: Presented at the 4th Conference on Robot Learning (CoRL) 2020, 11 pages, 4 figure

arXiv.org e-Print Archive

TU Delft Repository

Learning Generalized Reactive Policies using Deep Neural Networks

Author: Abbeel Pieter
Goldstein Maxwell
Groshev Edward
Srivastava Siddharth
Tamar Aviv
Publication date: 15/06/2018

We present a new approach to learning for planning, where knowledge acquired while solving a given set of planning problems is used to plan faster in related, but new problem instances. We show that a deep neural network can be used to learn and represent a generalized reactive policy (GRP) that maps a problem instance and a state to an action, and that the learned GRPs efficiently solve large classes of challenging problem instances. In contrast to prior efforts in this direction, our approach significantly reduces the dependence of learning on handcrafted domain knowledge or feature selection. Instead, the GRP is trained from scratch using a set of successful execution traces. We show that our approach can also be used to automatically learn a heuristic function that can be used in directed search algorithms. We evaluate our approach using an extensive suite of experiments on two challenging planning problem domains and show that our approach facilitates learning complex decision making policies and powerful heuristic functions with minimal human input. Videos of our results are available at goo.gl/Hpy4e3

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications