Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets
Imitation learning has traditionally been applied to learn a single task from
demonstrations thereof. The requirement of structured and isolated
demonstrations limits the scalability of imitation learning approaches as they
are difficult to apply to real-world scenarios, where robots have to be able to
execute a multitude of tasks. In this paper, we propose a multi-modal imitation
learning framework that is able to segment and imitate skills from unlabelled
and unstructured demonstrations by learning skill segmentation and imitation
learning jointly. Extensive simulation results indicate that our method can
efficiently separate the demonstrations into individual skills and learn to
imitate them using a single multi-modal policy. The video of our experiments is
available at http://sites.google.com/view/nips17intentiongan
Comment: Paper accepted to NIPS 201
MaMiC: Macro and Micro Curriculum for Robotic Reinforcement Learning
Shaping in humans and animals has been shown to be a powerful tool for
learning complex tasks as compared to learning in a randomized fashion. This
makes the problem less complex and enables one to solve the easier sub-task at
hand first. Generating a curriculum for such guided learning involves
subjecting the agent to easier goals first, and then gradually increasing their
difficulty. This paper takes a similar direction and proposes a dual curriculum
scheme for solving robotic manipulation tasks with sparse rewards, called
MaMiC. It includes a macro curriculum scheme which divides the task into
multiple sub-tasks followed by a micro curriculum scheme which enables the
agent to learn between such discovered sub-tasks. We show how combining macro
and micro curriculum strategies helps in overcoming major exploratory
constraints considered in robot manipulation tasks without having to engineer
any complex rewards. We also illustrate the meaning of the individual curricula
and how they can be used independently based on the task. The performance of
such a dual curriculum scheme is analyzed on the Fetch environments.
Comment: To appear in the Proceedings of the 18th International Conference on
Autonomous Agents and Multiagent Systems (AAMAS 2019). (Extended Abstract)
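The general idea of presenting easier goals first and then gradually increasing their difficulty can be illustrated with a minimal sketch. This is not the MaMiC algorithm itself; the class `GoalCurriculum`, its parameters, and the promotion rule (raise the difficulty limit once the recent success rate passes a threshold) are all assumptions made for illustration.

```python
import random


class GoalCurriculum:
    """Hypothetical helper: widens the goal-sampling range as the agent succeeds."""

    def __init__(self, max_distance=1.0, start_fraction=0.1, step=0.1,
                 promote_at=0.8, window=20):
        self.max_distance = max_distance  # hardest goal distance in the task
        self.fraction = start_fraction    # share of max_distance currently allowed
        self.step = step                  # how much to widen on each promotion
        self.promote_at = promote_at      # success rate needed to widen
        self.window = window              # episodes per evaluation window
        self.recent = []

    def sample_goal_distance(self, rng=random):
        # Goals are drawn uniformly up to the current difficulty limit.
        return rng.uniform(0.0, self.fraction * self.max_distance)

    def report(self, success):
        # Record one episode outcome; evaluate once a full window is collected.
        self.recent.append(bool(success))
        if len(self.recent) < self.window:
            return
        rate = sum(self.recent) / len(self.recent)
        if rate >= self.promote_at:
            self.fraction = min(1.0, self.fraction + self.step)
        self.recent.clear()  # start a fresh window either way
```

In a training loop one would call `sample_goal_distance()` before each episode and `report(success)` after it, so that a sparse success signal alone drives the schedule.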
Mega-Reward: Achieving Human-Level Play without Extrinsic Rewards
Intrinsic rewards were introduced to simulate how human intelligence works;
they are usually evaluated by intrinsically-motivated play, i.e., playing games
without extrinsic rewards but evaluated with extrinsic rewards. However, none
of the existing intrinsic reward approaches can achieve human-level performance
under this very challenging setting of intrinsically-motivated play. In this
work, we propose a novel megalomania-driven intrinsic reward (called
mega-reward), which, to our knowledge, is the first approach that achieves
human-level performance in intrinsically-motivated play. Intuitively,
mega-reward comes from the observation that infants' intelligence develops when
they try to gain more control over entities in an environment; therefore,
mega-reward aims to maximize the control capabilities of agents over given
entities in a given environment. To formalize mega-reward, a relational
transition model is proposed to bridge the gaps between direct and latent
control. Experimental studies show that mega-reward (i) can greatly outperform
all state-of-the-art intrinsic reward approaches, (ii) generally achieves the
same level of performance as Ex-PPO and professional human-level scores, and
(iii) also achieves superior performance when it is combined with extrinsic
rewards.
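The evaluation setting described above, in which the agent plays without extrinsic rewards but is scored with them, can be sketched as a simple episode loop. Everything here is a toy assumption: `LineEnv`, `RightAgent`, and the `state_changed` signal are hypothetical stand-ins, and `state_changed` is a placeholder intrinsic signal, not mega-reward.

```python
class LineEnv:
    """Toy 1-D environment (assumption): extrinsic reward 1.0 on reaching position 3."""

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        self.pos = max(0, self.pos + action)
        done = self.pos >= 3
        return self.pos, (1.0 if done else 0.0), done


class RightAgent:
    """Toy agent (assumption): always moves right and logs the rewards it is shown."""

    def __init__(self):
        self.seen_rewards = []

    def act(self, obs):
        return 1

    def observe(self, obs, action, reward, next_obs):
        self.seen_rewards.append(reward)


def state_changed(obs, action, next_obs):
    # Placeholder intrinsic signal: 1.0 whenever the observation changes.
    return 1.0 if next_obs != obs else 0.0


def play_episode(env, agent, intrinsic_reward, horizon=100):
    """Run one episode and return (intrinsic_return, extrinsic_score)."""
    obs = env.reset()
    intrinsic_return, extrinsic_score = 0.0, 0.0
    for _ in range(horizon):
        action = agent.act(obs)
        next_obs, extrinsic, done = env.step(action)
        r_int = intrinsic_reward(obs, action, next_obs)
        # The learner is only ever shown the intrinsic reward.
        agent.observe(obs, action, r_int, next_obs)
        intrinsic_return += r_int
        # The extrinsic reward is accumulated for evaluation only.
        extrinsic_score += extrinsic
        obs = next_obs
        if done:
            break
    return intrinsic_return, extrinsic_score
```

Keeping the two returns separate makes it explicit that the extrinsic score is a measurement and never a training signal.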