Overcoming Exploration in Reinforcement Learning with Demonstrations
Exploration in environments with sparse rewards has been a persistent problem
in reinforcement learning (RL). Many tasks are natural to specify with a sparse
reward, and manually shaping a reward function can result in suboptimal
performance. However, finding a non-zero reward is exponentially more difficult
with increasing task horizon or action dimensionality. This puts many
real-world tasks out of practical reach of RL methods. In this work, we use
demonstrations to overcome the exploration problem and successfully learn to
perform long-horizon, multi-step robotics tasks with continuous control such as
stacking blocks with a robot arm. Our method, which builds on top of Deep
Deterministic Policy Gradients and Hindsight Experience Replay, provides an
order of magnitude of speedup over RL on simulated robotics tasks. It is simple
to implement and makes only the additional assumption that we can collect a
small set of demonstrations. Furthermore, our method is able to solve tasks not
solvable by either RL or behavior cloning alone, and often ends up
outperforming the demonstrator policy. Comment: 8 pages, ICRA 201
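The abstract names Hindsight Experience Replay as one building block. As a rough illustration of why hindsight relabeling helps with sparse rewards, the sketch below relabels a failed episode with the state it actually reached, so that at least one transition carries a non-failure reward. Everything here is an assumption made for illustration: the grid-like `State` type, the `Transition` record, the -1/0 reward convention, and the "use the final state as the goal" strategy. It is not the paper's implementation, and it leaves out the demonstrations and the DDPG learner entirely.

```python
from typing import List, NamedTuple, Tuple

# Hypothetical state type: a 2-D grid position.
State = Tuple[int, int]


class Transition(NamedTuple):
    """One goal-conditioned step (illustrative record, not from the paper)."""
    state: State
    action: int
    next_state: State
    goal: State
    reward: float


def sparse_reward(achieved: State, goal: State) -> float:
    """Assumed convention: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Return a copy of the episode whose goal is the last state reached.

    Rewards are recomputed against the substituted goal, so the final
    transition always receives the success reward.
    """
    if not episode:
        return []
    new_goal = episode[-1].next_state
    return [
        t._replace(goal=new_goal, reward=sparse_reward(t.next_state, new_goal))
        for t in episode
    ]


# A failed episode: the agent never reaches the original goal (5, 5).
original_goal: State = (5, 5)
path: List[State] = [(0, 0), (0, 1), (1, 1), (1, 2)]
failed_episode = [
    Transition(s, 0, s_next, original_goal, sparse_reward(s_next, original_goal))
    for s, s_next in zip(path, path[1:])
]
relabeled_episode = relabel_with_final_goal(failed_episode)
```

In a replay-based learner, both the original and the relabeled transitions would typically be stored, so the agent still sees the true task goal alongside the substituted ones.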
Learning cloth manipulation with demonstrations
Recent advances in deep reinforcement learning (RL) and in the computational capabilities of GPUs have led to a variety of research on the learning side of robotics. The main aim is to build autonomous robots that can learn to solve a task on their own with minimal engineering on the planning, vision, or control side. Efforts have been made to learn the manipulation of rigid objects with the help of human demonstrations, specifically in tasks such as stacking multiple blocks on top of each other or inserting a pin into a hole. These deep RL algorithms successfully learn to complete tasks involving the manipulation of rigid objects, but autonomous manipulation of textile objects such as clothes through deep RL has not yet been studied in the community.
The main objectives of this work are: 1) implementing state-of-the-art deep RL algorithms for rigid object manipulation and gaining a deep understanding of how these algorithms work; 2) creating an open-source simulation environment for textile objects such as clothes; and 3) designing deep RL algorithms for learning autonomous manipulation of textile objects through demonstrations. Peer Reviewed. Preprint
Generative Exploration and Exploitation
Sparse reward is one of the biggest challenges in reinforcement learning
(RL). In this paper, we propose a novel method called Generative Exploration
and Exploitation (GENE) to overcome sparse reward. GENE automatically generates
start states to encourage the agent to explore the environment and to exploit
received reward signals. GENE can adaptively trade off between exploration and
exploitation according to the varying distributions of states experienced by
the agent as the learning progresses. GENE relies on no prior knowledge about
the environment and can be combined with any RL algorithm, no matter on-policy
or off-policy, single-agent or multi-agent. Empirically, we demonstrate that
GENE significantly outperforms existing methods in three tasks with only binary
rewards, including Maze, Maze Ant, and Cooperative Navigation. Ablation studies
verify the emergence of progressive exploration and automatic reversing. Comment: AAAI'2
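The abstract does not say how GENE generates start states, so the following is only a simplified stand-in for the general idea of starting episodes from states the agent has seen rarely. It samples a start state from the visited set with probability inversely proportional to its visit count. The function name `sample_start_state` and the count-based weighting are assumptions made for illustration. They are not GENE's generative model.

```python
import random
from collections import Counter
from typing import Hashable, List, Sequence


def sample_start_state(visited: Sequence[Hashable], rng: random.Random) -> Hashable:
    """Pick a start state, favoring states that were visited less often.

    Each distinct state gets weight 1 / visit_count (an assumed heuristic).
    """
    if not visited:
        raise ValueError("visited must contain at least one state")
    counts = Counter(visited)
    states: List[Hashable] = list(counts)
    weights = [1.0 / counts[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


# State "a" was visited 99 times and state "b" once, so "b" should dominate.
rng = random.Random(0)
visit_log = ["a"] * 99 + ["b"]
draws = [sample_start_state(visit_log, rng) for _ in range(1000)]
```

A count table only works for small discrete state spaces. In continuous spaces some form of density estimate would be needed in place of `Counter`.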