    Exploring Multi-Agent Reinforcement Learning for Mobile Manipulation

    To make robots valuable in our everyday lives, they need to be able to make good decisions even in unexpected situations. Reinforcement learning is a paradigm that aims to learn decision-making models for robots without direct examples of the correct decisions. For this type of robot learning, it is common practice to learn a single central model that controls the entire robot. This work is motivated by advances in modular and swarm robotics, where multiple robots or decision-makers collaborate to complete a task. Instead of learning a single central model, we explore the idea of learning multiple decision-making models, each controlling a different part of the robot. In particular, we investigate whether providing the different models with different sensing capabilities helps the robot to learn or to be robust to perturbations. We formulate these problems as multi-agent problems and use a multi-agent reinforcement learning algorithm to solve them. To evaluate our approach, we design a mobile manipulation task and implement a simulation-based training pipeline to produce decision-making models that can complete the task. The trained models are then transferred directly to a real autonomous mobile manipulator system. Several experiments on the real system compare performance and robustness against the usual central-model baseline. Our experimental results show that our approach can learn faster and produce decision-making models that are more robust to perturbations.
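
    The decomposition this abstract describes can be made concrete with a small sketch. The Python snippet below is written for this listing, not taken from the authors: it splits a mobile manipulator into a base agent and an arm agent, each acting from its own observation subset. All names, sensor keys, and dimensions are illustrative assumptions, and the placeholder policies stand in for networks that a cooperative multi-agent RL algorithm would train against the shared task reward.

        import numpy as np

        def split_observation(obs):
            """Give each decision-maker only the sensors for its body part
            (an assumed partition: base sees pose + lidar, arm sees joints + target)."""
            return {
                "base": np.concatenate([obs["base_pose"], obs["lidar"]]),
                "arm": np.concatenate([obs["joint_pos"], obs["target_pose"]]),
            }

        class Agent:
            """Stand-in for a per-agent policy (e.g., a small MLP)."""
            def __init__(self, act_dim):
                self.act_dim = act_dim

            def act(self, local_obs):
                return np.random.uniform(-1.0, 1.0, self.act_dim)  # placeholder action

        agents = {"base": Agent(act_dim=2), "arm": Agent(act_dim=6)}

        def joint_action(obs):
            local = split_observation(obs)
            # Each agent acts from a partial view; a central baseline would
            # instead map the full observation to all 8 action dimensions.
            return np.concatenate([agents[k].act(local[k]) for k in ("base", "arm")])

        obs = {"base_pose": np.zeros(3), "lidar": np.zeros(16),
               "joint_pos": np.zeros(6), "target_pose": np.zeros(3)}
        print(joint_action(obs).shape)  # (8,)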

    CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning

    In open-ended environments, autonomous learning agents must set their own goals and build their own curriculum through intrinsically motivated exploration. They may consider a large diversity of goals, aiming to discover what is controllable in their environments and what is not. Because some goals might prove easy and some impossible, agents must actively select which goal to practice at any moment, to maximize their overall mastery of the set of learnable goals. This paper proposes CURIOUS, an algorithm that leverages 1) a modular Universal Value Function Approximator with hindsight learning to achieve a diversity of goals of different kinds within a unique policy, and 2) an automated curriculum learning mechanism that biases the attention of the agent towards goals maximizing the absolute learning progress. Agents focus sequentially on goals of increasing complexity, and focus back on goals that are being forgotten. Experiments conducted in a new modular-goal robotic environment show the resulting developmental self-organization of a learning curriculum, and demonstrate properties of robustness to distracting goals, forgetting, and changes in body properties. (Accepted at ICML 2019.)
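
    The curriculum mechanism can be sketched compactly: sample each goal module with probability proportional to its absolute learning progress, mixed with a uniform term, so the agent attends to modules where competence is changing, including ones being forgotten. The snippet below is a hedged sketch of that idea; the success-rate windows, the mixing weight, and the class name are assumptions, not the paper's implementation.

        import numpy as np

        class ALPCurriculum:
            """Sample goal modules in proportion to absolute learning progress (ALP)."""

            def __init__(self, n_modules, window=50, eps=0.2):
                self.history = [[] for _ in range(n_modules)]
                self.window = window  # episodes per competence estimate (assumed value)
                self.eps = eps        # uniform mixing so no module starves (assumed value)

            def update(self, module, success):
                self.history[module].append(float(success))

            def _alp(self, module):
                h = self.history[module]
                if len(h) < 2 * self.window:
                    return 1.0  # optimistic default: keep trying under-sampled modules
                recent = np.mean(h[-self.window:])
                older = np.mean(h[-2 * self.window:-self.window])
                # |change in competence|: improvement AND forgetting both attract attention
                return abs(recent - older)

            def sample(self):
                alp = np.array([self._alp(m) for m in range(len(self.history))])
                p = self.eps / len(alp) + (1 - self.eps) * alp / (alp.sum() + 1e-8)
                return int(np.random.choice(len(alp), p=p / p.sum()))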

    Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning

    Intrinsically motivated spontaneous exploration is a key enabler of autonomous lifelong learning in human children. It enables the discovery and acquisition of large repertoires of skills through self-generation, self-selection, self-ordering and self-experimentation of learning goals. We present an algorithmic approach called Intrinsically Motivated Goal Exploration Processes (IMGEP) to enable similar properties of autonomous or self-supervised learning in machines. The IMGEP algorithmic architecture relies on several principles: 1) self-generation of goals, generalized as fitness functions; 2) selection of goals based on intrinsic rewards; 3) exploration with incremental goal-parameterized policy search and exploitation of the gathered data with a batch learning algorithm; 4) systematic reuse of information acquired when targeting a goal for improving towards other goals. We present a particularly efficient form of IMGEP, called Modular Population-Based IMGEP, that uses a population-based policy and an object-centered modularity in goals and mutations. We provide several implementations of this architecture and demonstrate their ability to automatically generate a learning curriculum within several experimental setups, including a real humanoid robot that can explore multiple spaces of goals with several hundred continuous dimensions. While no particular target goal is provided to the system, this curriculum allows the discovery of skills that act as stepping stones for learning more complex skills, e.g. nested tool use. We show that learning diverse spaces of goals with intrinsic motivations is more efficient for learning complex skills than only trying to directly learn these complex skills.
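
    The core loop of a population-based goal exploration process can be illustrated with a toy: sample a goal, reuse the stored policy whose outcome came closest to it, mutate it, roll it out, and store the result so the data serves every future goal. Everything below (the linear toy environment, the mutation scale, the nearest-neighbor reuse rule) is an assumption made to keep the sketch self-contained and runnable, not the paper's implementation.

        import numpy as np

        rng = np.random.default_rng(0)
        W = rng.normal(size=(2, 8))  # toy environment: outcomes are a fixed linear map of policy params

        def rollout(params):
            return W @ params  # stand-in for executing a parameterized policy on the robot

        memory = []  # (params, outcome) pairs, reusable when targeting *any* future goal
        for step in range(500):
            goal = rng.uniform(-1.0, 1.0, size=2)  # self-generated goal in outcome space
            if memory:
                # Reuse: start from the past policy whose outcome was closest to the goal...
                params = min(memory, key=lambda m: np.linalg.norm(m[1] - goal))[0]
                params = params + rng.normal(0.0, 0.1, size=8)  # ...then mutate it (population-based search)
            else:
                params = rng.normal(size=8)  # bootstrap with random exploration
            memory.append((params, rollout(params)))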