Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning
Learning robotic manipulation tasks using reinforcement learning with sparse
rewards is currently impractical due to the prohibitive data requirements. Many
practical tasks require manipulation of multiple objects, and the complexity of
such tasks increases with the number of objects. Learning from a curriculum of
increasingly complex tasks appears to be a natural solution but, unfortunately,
does not work in many scenarios. We hypothesize that the inability of the
state-of-the-art algorithms to effectively utilize a task curriculum stems from
the absence of inductive biases for transferring knowledge from simpler to
complex tasks. We show that graph-based relational architectures overcome this
limitation and enable learning of complex tasks when provided with a simple
curriculum of tasks with increasing numbers of objects. We demonstrate the
utility of our framework on a simulated block stacking task. Starting from
scratch, our agent learns to stack six blocks into a tower. Despite using
step-wise sparse rewards, our method is orders of magnitude more data-efficient
and outperforms the existing state-of-the-art method that utilizes human
demonstrations. Furthermore, the learned policy exhibits zero-shot
generalization, successfully stacking blocks into taller towers and previously
unseen configurations such as pyramids, without any further training.
Comment: 10 pages, 4 figures and 1 table in main article, 3 figures and 3 tables in appendix. Supplementary website and videos at https://richardrl.github.io/relational-rl
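A minimal sketch of the relational idea described above: a policy that embeds each object's state and lets the objects attend to one another, so the same weights apply whether the scene contains three blocks or six. This is not the authors' implementation; the layer sizes, action dimension, and mean-pooling readout are assumptions chosen for illustration (PyTorch).

import torch
import torch.nn as nn

class RelationalPolicy(nn.Module):
    def __init__(self, obj_dim=15, embed_dim=64, n_heads=4, action_dim=4):
        super().__init__()
        self.embed = nn.Linear(obj_dim, embed_dim)                    # per-object encoder
        self.attn = nn.MultiheadAttention(embed_dim, n_heads, batch_first=True)
        self.head = nn.Sequential(nn.Linear(embed_dim, 64), nn.ReLU(),
                                  nn.Linear(64, action_dim))          # pooled features -> action

    def forward(self, obj_states):
        # obj_states: (batch, n_objects, obj_dim); n_objects may differ between curriculum stages
        h = torch.relu(self.embed(obj_states))
        h, _ = self.attn(h, h, h)                                     # objects attend to each other
        return self.head(h.mean(dim=1))                               # permutation-invariant pooling

policy = RelationalPolicy()
print(policy(torch.randn(2, 3, 15)).shape)  # works with 3 objects -> torch.Size([2, 4])
print(policy(torch.randn(2, 6, 15)).shape)  # and, unchanged, with 6 objects -> torch.Size([2, 4])

Because the architecture is indifferent to the number of objects, knowledge picked up on easier curriculum stages can be reused directly on harder ones, which is the transfer property the abstract attributes to graph-based relational models.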
Comparing Task Simplifications to Learn Closed-Loop Object Picking Using Deep Reinforcement Learning
Enabling autonomous robots to interact in unstructured environments with
dynamic objects requires manipulation capabilities that can deal with clutter,
changes, and object variability. This paper presents a comparison of
different reinforcement learning-based approaches for object picking with a
robotic manipulator. We learn closed-loop policies mapping depth camera inputs
to motion commands and compare different approaches to keep the problem
tractable, including reward shaping, curriculum learning and using a policy
pre-trained on a task with a reduced action set to warm-start the full problem.
For efficient and more flexible data collection, we train in simulation and
transfer the policies to a real robot. We show that using curriculum learning,
policies learned with a sparse reward formulation can be trained at similar
rates as with a shaped reward. These policies result in success rates
comparable to the policy initialized on the simplified task. We could
successfully transfer these policies to the real robot with only minor
modifications of the depth image filtering. We found that using a heuristic to
warm-start the training was useful for enforcing the desired behavior, while the
policies trained from scratch using a curriculum learned to cope better with
unseen scenarios where objects are removed.
Comment: 8 pages, video available at https://youtu.be/ii16Zejmf-
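As a rough illustration of the curriculum ingredient discussed above, the sketch below advances from a simplified picking variant to the full problem once the sparse-reward success rate over a recent window clears a threshold. The stage descriptions, window size, and promotion threshold are hypothetical and not taken from the paper.

import random

class Curriculum:
    def __init__(self, stages, promote_at=0.8, window=100):
        self.stages = stages              # task variants ordered from easy to hard
        self.idx = 0                      # current stage index
        self.promote_at = promote_at      # success rate required to advance
        self.window = window              # number of recent episodes to average over
        self.recent = []

    def record(self, success):
        self.recent.append(float(success))
        self.recent = self.recent[-self.window:]
        if (len(self.recent) == self.window
                and sum(self.recent) / self.window >= self.promote_at
                and self.idx < len(self.stages) - 1):
            self.idx += 1                 # promote to the next, harder stage
            self.recent = []

    def sample_task(self):
        return self.stages[self.idx]

stages = ["reduced action set, fixed object pose",
          "full action set, single object",
          "full action set, cluttered scene"]
curriculum = Curriculum(stages)
for episode in range(500):
    task = curriculum.sample_task()
    curriculum.record(success=random.random() < 0.9)  # placeholder for an actual rollout outcome
print("final stage:", curriculum.sample_task())

A success-gated schedule like this keeps the sparse-reward signal dense enough early on, which is one way to obtain the training rates the abstract reports as comparable to reward shaping.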