Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation
Manipulation of deformable objects, such as ropes and cloth, is an important
but challenging problem in robotics. We present a learning-based system where a
robot takes as input a sequence of images of a human manipulating a rope from
an initial to goal configuration, and outputs a sequence of actions that can
reproduce the human demonstration, using only monocular images as input. To
perform this task, the robot learns a pixel-level inverse dynamics model of
rope manipulation directly from images in a self-supervised manner, using about
60K interactions with the rope collected autonomously by the robot. The human
demonstration provides a high-level plan of what to do and the low-level
inverse model is used to execute the plan. We show that by combining the
high- and low-level plans, the robot can successfully manipulate a rope into a
variety of target shapes using only a sequence of human-provided images for
direction.
Comment: 8 pages, accepted to International Conference on Robotics and
Automation (ICRA) 201
Causal Confusion in Imitation Learning
Behavioral cloning reduces policy learning to supervised learning by training
a discriminative model to predict expert actions given observations. Such
discriminative models are non-causal: the training procedure is unaware of the
causal structure of the interaction between the expert and the environment. We
point out that ignoring causality is particularly damaging because of the
distributional shift in imitation learning. In particular, it leads to a
counter-intuitive "causal misidentification" phenomenon: access to more
information can yield worse performance. We investigate how this problem
arises, and propose a solution to combat it through targeted
interventions---either environment interaction or expert queries---to determine
the correct causal model. We show that causal misidentification occurs in
several benchmark control domains as well as realistic driving settings, and
validate our solution against DAgger and other baselines and ablations.
Comment: Published at NeurIPS 2019. 9 pages, plus references and appendice