5,354 research outputs found

    Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning

    Full text link
    Intrinsically motivated spontaneous exploration is a key enabler of autonomous lifelong learning in human children. It enables the discovery and acquisition of large repertoires of skills through self-generation, self-selection, self-ordering and self-experimentation of learning goals. We present an algorithmic approach called Intrinsically Motivated Goal Exploration Processes (IMGEP) to enable similar properties of autonomous or self-supervised learning in machines. The IMGEP algorithmic architecture relies on several principles: 1) self-generation of goals, generalized as fitness functions; 2) selection of goals based on intrinsic rewards; 3) exploration with incremental goal-parameterized policy search and exploitation of the gathered data with a batch learning algorithm; 4) systematic reuse of information acquired when targeting a goal for improving towards other goals. We present a particularly efficient form of IMGEP, called Modular Population-Based IMGEP, that uses a population-based policy and an object-centered modularity in goals and mutations. We provide several implementations of this architecture and demonstrate their ability to automatically generate a learning curriculum within several experimental setups including a real humanoid robot that can explore multiple spaces of goals with several hundred continuous dimensions. While no particular target goal is provided to the system, this curriculum allows the discovery of skills that act as stepping stone for learning more complex skills, e.g. nested tool use. We show that learning diverse spaces of goals with intrinsic motivations is more efficient for learning complex skills than only trying to directly learn these complex skills

    Towards the Safety of Human-in-the-Loop Robotics: Challenges and Opportunities for Safety Assurance of Robotic Co-Workers

    Get PDF
    The success of the human-robot co-worker team in a flexible manufacturing environment where robots learn from demonstration heavily relies on the correct and safe operation of the robot. How this can be achieved is a challenge that requires addressing both technical as well as human-centric research questions. In this paper we discuss the state of the art in safety assurance, existing as well as emerging standards in this area, and the need for new approaches to safety assurance in the context of learning machines. We then focus on robotic learning from demonstration, the challenges these techniques pose to safety assurance and indicate opportunities to integrate safety considerations into algorithms "by design". Finally, from a human-centric perspective, we stipulate that, to achieve high levels of safety and ultimately trust, the robotic co-worker must meet the innate expectations of the humans it works with. It is our aim to stimulate a discussion focused on the safety aspects of human-in-the-loop robotics, and to foster multidisciplinary collaboration to address the research challenges identified

    Online Learning of a Memory for Learning Rates

    Full text link
    The promise of learning to learn for robotics rests on the hope that by extracting some information about the learning process itself we can speed up subsequent similar learning tasks. Here, we introduce a computationally efficient online meta-learning algorithm that builds and optimizes a memory model of the optimal learning rate landscape from previously observed gradient behaviors. While performing task specific optimization, this memory of learning rates predicts how to scale currently observed gradients. After applying the gradient scaling our meta-learner updates its internal memory based on the observed effect its prediction had. Our meta-learner can be combined with any gradient-based optimizer, learns on the fly and can be transferred to new optimization tasks. In our evaluations we show that our meta-learning algorithm speeds up learning of MNIST classification and a variety of learning control tasks, either in batch or online learning settings.Comment: accepted to ICRA 2018, code available: https://github.com/fmeier/online-meta-learning ; video pitch available: https://youtu.be/9PzQ25FPPO
    corecore