Cherry-Picking with Reinforcement Learning: Robust Dynamic Grasping in Unstable Conditions
Grasping small objects surrounded by unstable or non-rigid material plays a crucial role in applications such as surgery, harvesting, construction, disaster recovery, and assisted feeding. The task is especially difficult when fine manipulation is required in the presence of sensor noise and perception errors, which inevitably trigger dynamic motion that is challenging to model precisely. Rather than building accurate models of contacts and dynamics, data-driven methods such as reinforcement learning (RL) optimize task performance through trial and error, circumventing the need for such models. Applying RL to real robots, however, has been hindered by prohibitively high sample complexity and the infrastructure cost of providing resets on hardware. This work presents CherryBot, an RL system that performs fine manipulation with chopsticks and surpasses human reactiveness on some dynamic grasping tasks. By integrating imprecise simulators, suboptimal demonstrations, and external state estimation, we study how to make a real-world robot learning system sample-efficient and general while reducing the human effort required for supervision. Our system improves continually over 30 minutes of real-world interaction: through reactive retries, it achieves a nearly 100% success rate on the demanding task of using chopsticks to grasp small objects swinging in the air. We demonstrate the reactiveness, robustness, and generalizability of CherryBot to varying object shapes and dynamics (e.g., external disturbances such as wind and human perturbations). Videos are available at https://goodcherrybot.github.io/
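As a purely illustrative sketch (not the authors' code), the recipe the abstract describes, warm-starting a policy on suboptimal demonstrations before a short period of real-world RL fine-tuning, might look roughly as follows in PyTorch; the network sizes, placeholder dimensions, dataset format, and helper names are assumptions.

import torch
import torch.nn as nn

class ChopstickPolicy(nn.Module):
    """Maps an external state estimate to a continuous end-effector action."""
    def __init__(self, obs_dim: int = 16, act_dim: int = 7):  # placeholder dimensions
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, act_dim), nn.Tanh(),
        )

    def forward(self, obs):
        return self.net(obs)

def pretrain_on_demonstrations(policy, demo_obs, demo_act, steps=2_000, lr=1e-3):
    """Behaviour cloning on suboptimal demonstrations to warm-start the policy."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(steps):
        loss = nn.functional.mse_loss(policy(demo_obs), demo_act)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return policy

# Warm-start on (stand-in) recorded demonstrations, then hand the policy to an
# off-the-shelf RL algorithm for a short stretch of real-world fine-tuning,
# retrying the grasp whenever the state estimator reports a failure.
demo_obs = torch.randn(512, 16)
demo_act = torch.rand(512, 7) * 2 - 1
policy = pretrain_on_demonstrations(ChopstickPolicy(), demo_obs, demo_act)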
Tightly-coupled manipulation pipelines: Combining traditional pipelines and end-to-end learning
Traditionally, robot manipulation tasks are solved by engineering solutions in a modular fashion, typically consisting of object detection, pose estimation, grasp planning, motion planning, and finally a control algorithm that executes the planned motion. This traditional approach separates the hard problem of manipulation into several self-contained stages, which can be developed independently and give interpretable outputs at each step of the pipeline. However, it comes with a plethora of issues, most notably limited generalisability to a broad range of tasks; it is common that, as tasks get more difficult, the systems become increasingly complex.
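Purely as an illustration (not code from the thesis), such a staged pipeline can be sketched as a chain of independently developed modules, each consuming the previous stage's output; the function names and signatures below are hypothetical placeholders.

def detect_objects(image):
    """Perception stage: return candidate object detections."""
    ...

def estimate_pose(image, detection):
    """Perception stage: return a 6-DoF pose for the detected object."""
    ...

def plan_grasp(object_pose):
    """Grasp planning stage: choose a grasp pose on the object."""
    ...

def plan_motion(robot_state, grasp_pose):
    """Motion planning stage: compute a collision-free joint trajectory."""
    ...

def execute(trajectory, controller):
    """Control stage: track the planned trajectory on the robot."""
    ...

def run_pipeline(image, robot_state, controller):
    # Each hand-off is interpretable, but an error in any early stage
    # propagates unchecked through every later stage.
    detection = detect_objects(image)[0]
    grasp_pose = plan_grasp(estimate_pose(image, detection))
    execute(plan_motion(robot_state, grasp_pose), controller)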
To combat the flaws of these systems, recent trends have seen robots learn to predict actions and grasp locations directly from sensor input in an end-to-end manner using deep neural networks, without explicitly modelling the intermediate modules. This thesis investigates a sample of methods that fall along a spectrum from fully pipelined to fully end-to-end, an approach we believe to be more advantageous for developing a general manipulation system: one that could eventually be used in highly dynamic and unpredictable household environments.
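As an illustrative contrast (again, not the thesis's model), an end-to-end visuomotor policy maps raw pixels straight to an action vector with a single network; the architecture and action dimension below are arbitrary assumptions.

import torch
import torch.nn as nn

class EndToEndPolicy(nn.Module):
    def __init__(self, act_dim: int = 7):  # e.g. 6-DoF end-effector delta + gripper
        super().__init__()
        self.encoder = nn.Sequential(       # convolutional image encoder
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(          # regress actions from image features
            nn.LazyLinear(256), nn.ReLU(),
            nn.Linear(256, act_dim), nn.Tanh(),
        )

    def forward(self, rgb):                 # rgb: (batch, 3, H, W) in [0, 1]
        return self.head(self.encoder(rgb))

policy = EndToEndPolicy()
action = policy(torch.rand(1, 3, 128, 128))  # forward pass on a dummy image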
The investigation starts at the far end of the spectrum, where we explore learning an end-to-end controller in simulation and then transferring it to the real world by employing domain randomisation, and finishes at the other end with a new pipeline in which the individual modules bear little resemblance to the "traditional" ones. The thesis concludes by proposing a new paradigm: Tightly-coupled Manipulation Pipelines (TMP). Rather than learning all modules implicitly in one large end-to-end network or, conversely, having individual, pre-defined modules developed independently, TMPs take the best of both worlds by tightly coupling actions to observations, whilst still maintaining structure via an arbitrary number of learned modules that need not resemble the modules seen in "traditional" systems.
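A minimal sketch of the domain randomisation mentioned above, under assumed parameter names, ranges, and simulator interface: physical and visual parameters are re-sampled every episode so that the real world looks like just another sample from the training distribution.

import random

def sample_domain_parameters():
    """Draw a fresh set of simulator parameters for one training episode.
    The specific parameters and ranges are illustrative assumptions."""
    return {
        "friction": random.uniform(0.5, 1.5),         # contact dynamics
        "object_mass_kg": random.uniform(0.05, 0.5),
        "light_intensity": random.uniform(0.3, 1.0),  # visual appearance
        "camera_offset_m": random.gauss(0.0, 0.01),   # calibration error
    }

def train(policy, sim, episodes=10_000):
    for _ in range(episodes):
        sim.reset(**sample_domain_parameters())  # hypothetical simulator interface
        policy.update(sim.run_episode(policy))   # any RL or imitation update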
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware
Learning policies from previously recorded data is a promising direction for
real-world robotics tasks, as online learning is often infeasible. Dexterous
manipulation in particular remains an open problem in its general form. The
combination of offline reinforcement learning with large, diverse datasets, however, has the potential to produce a breakthrough in this challenging domain, analogous to the rapid progress supervised learning has made in recent years.
To coordinate the efforts of the research community toward tackling this
problem, we propose a benchmark including: i) a large collection of data for
offline learning from a dexterous manipulation platform on two tasks, obtained
with capable RL agents trained in simulation; ii) the option to execute learned
policies on a real-world robotic system and in a simulation for efficient
debugging. We evaluate prominent open-sourced offline reinforcement learning
algorithms on the datasets and provide a reproducible experimental setup for
offline reinforcement learning on real systems.
Comment: Published at ICLR 2023 (The Eleventh International Conference on Learning Representations). Datasets available at https://github.com/rr-learning/trifinger_rl_dataset
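The workflow the benchmark targets can be illustrated with a minimal, self-contained sketch (not the benchmark's actual API or dataset layout): train a policy purely from a pre-recorded dataset of transitions, with no online interaction, and only then roll it out in simulation or on the real platform. Behaviour cloning stands in below for a proper offline-RL algorithm such as CQL or IQL; the array shapes are placeholders.

import numpy as np
import torch
import torch.nn as nn

# Stand-in pre-recorded dataset: observation/action arrays (placeholder shapes).
dataset = {
    "observations": np.random.randn(10_000, 24).astype(np.float32),
    "actions": np.random.uniform(-1, 1, size=(10_000, 9)).astype(np.float32),
}
obs = torch.from_numpy(dataset["observations"])
act = torch.from_numpy(dataset["actions"])

policy = nn.Sequential(
    nn.Linear(24, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 9), nn.Tanh(),
)
optimiser = torch.optim.Adam(policy.parameters(), lr=3e-4)

# Offline training loop: minibatches are drawn from the fixed dataset only.
for step in range(1_000):
    idx = torch.randint(0, len(obs), (256,))
    loss = nn.functional.mse_loss(policy(obs[idx]), act[idx])
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

# The frozen policy would then be evaluated in simulation for debugging and,
# finally, on the real robot.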