Learning Parameterized Skills
We introduce a method for constructing skills capable of solving tasks drawn
from a distribution of parameterized reinforcement learning problems. The
method draws example tasks from a distribution of interest and uses the
corresponding learned policies to estimate the topology of the
lower-dimensional piecewise-smooth manifold on which the skill policies lie.
This manifold models how policy parameters change as task parameters vary. The
method identifies the number of charts that compose the manifold and then
applies non-linear regression in each chart to construct a parameterized skill
by predicting policy parameters from task parameters. We evaluate our method on
an underactuated simulated robotic arm tasked with learning to accurately throw
darts at a parameterized target location.
Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012).
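The core idea above, predicting policy parameters from task parameters with per-chart non-linear regression, can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the authors' implementation: chart assignments are taken as given, and a cubic polynomial stands in for the manifold-based regressor; names like `fit_parameterized_skill` are hypothetical.

```python
import numpy as np

# Hypothetical sketch: a parameterized skill as per-chart non-linear regression
# from task parameters to policy parameters. Chart discovery and manifold
# estimation from the paper are replaced by a given chart labeling and a
# simple polynomial fit.

def fit_parameterized_skill(task_params, policy_params, chart_ids, degree=3):
    """Fit one polynomial regressor per chart; returns {chart_id: coeffs}."""
    skills = {}
    for c in np.unique(chart_ids):
        mask = chart_ids == c
        # Vandermonde design matrix: [1, t, t^2, t^3] for each example task.
        X = np.vander(task_params[mask], degree + 1, increasing=True)
        # Least-squares fit mapping the task param to each policy dimension.
        coeffs, *_ = np.linalg.lstsq(X, policy_params[mask], rcond=None)
        skills[int(c)] = coeffs
    return skills

def predict_policy(skills, chart_id, task_param, degree=3):
    x = np.vander(np.atleast_1d(task_param), degree + 1, increasing=True)
    return x @ skills[chart_id]

# Toy data: policy params vary smoothly with the task param within one chart.
rng = np.random.default_rng(0)
t = rng.uniform(-1, 1, 50)
theta = np.stack([np.sin(t), t ** 2], axis=1)      # 2-D "policy parameters"
skills = fit_parameterized_skill(t, theta, np.zeros(50, dtype=int))
pred = predict_policy(skills, 0, 0.5)[0]
```

A new task parameter (here 0.5) then retrieves a full policy-parameter vector without further policy search inside a chart.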
Learning Parameterized Skills
One of the defining characteristics of human intelligence is the ability to acquire and refine skills. Skills are behaviors for solving problems that an agent encounters often—sometimes in different contexts and situations—throughout its lifetime. Identifying important problems that recur and retaining their solutions as skills allows agents to more rapidly solve novel problems by adjusting and combining their existing skills.
In this thesis we introduce a general framework for learning reusable parameterized skills. Reusable skills are parameterized procedures that—given a description of a problem to be solved—produce appropriate behaviors or policies. They can be sequentially and hierarchically combined with other skills to produce progressively more abstract and temporally extended behaviors.
We identify three major challenges involved in the construction of such skills. First, an agent should be capable of solving a small number of problems and generalizing these experiences to construct a single reusable skill. The skill should be capable of producing appropriate behaviors even when applied to yet unseen variations of a problem. We introduce a method for estimating properties of the lower-dimensional manifold on which problem solutions lie. This allows for the construction of unified models for predicting policies from task parameters.
Secondly, the agent should be able to identify when a skill can be hierarchically decomposed into specialized sub-skills. We observe that the policy manifold may be composed of disjoint, piecewise-smooth charts, each one encoding solutions for a subclass of problems. Identifying and modeling sub-skills allows for the aggregation of related behaviors into single, more abstract skills.
Finally, the agent should be able to actively select on which problems to practice in order to more rapidly become competent in a skill. Thoughtful and deliberate practice is one of the defining characteristics of human expert performance. By carefully choosing on which problems to practice the agent might more rapidly construct a skill that performs well over a wide range of problems.
We address these challenges via a general framework for skill acquisition. We evaluate it on simulated decision problems and on a physical humanoid robot, and demonstrate that it allows for the efficient and active construction of reusable skills.
Incremental learning of skills in a task-parameterized Gaussian Mixture Model
The final publication is available at link.springer.com.
Programming by demonstration techniques facilitate the programming of robots. Some of them allow tasks to be generalized through parameters, although they require new training when trajectories different from those used to estimate the model need to be added. One way to re-train a robot is incremental learning, which supplies additional information about the task and does not require teaching the whole task again. The present study proposes three techniques for adding trajectories to a previously estimated task-parameterized Gaussian mixture model. The first estimates a new model by accumulating the new trajectory and the set of trajectories generated using the previous model. The second adds to the parameters of the existing model those obtained for the new trajectories. The third updates the model parameters by running a modified version of the Expectation-Maximization algorithm with the information from the new trajectories. The techniques were evaluated on a simulated task and a real one, and they showed better performance than the existing model.
Peer Reviewed. Postprint (author's final draft).
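The flavor of the third technique, folding new data into an already-estimated model rather than retraining from scratch, can be sketched for a single Gaussian component. This is an illustrative sufficient-statistics update only; the actual TP-GMM update also handles mixture weights, responsibilities, and task frames, which are omitted here.

```python
import numpy as np

# Illustrative sketch: incrementally update one Gaussian component's mean and
# (biased) covariance given new data points, without revisiting the old data.
# This stands in for the paper's modified-EM technique, which operates on a
# full task-parameterized mixture.

def incremental_gaussian_update(n_old, mean_old, cov_old, new_points):
    """Fold new data points into a Gaussian estimated from n_old old points."""
    new_points = np.asarray(new_points, dtype=float)
    n_new = len(new_points)
    n = n_old + n_new
    mean = (n_old * mean_old + n_new * new_points.mean(axis=0)) / n
    # Combine scatter matrices about the updated mean.
    d_old = (mean_old - mean)[:, None]
    scatter_old = n_old * (cov_old + d_old @ d_old.T)
    diff = new_points - mean
    scatter_new = diff.T @ diff
    cov = (scatter_old + scatter_new) / n
    return n, mean, cov

# Sanity check: the incremental result matches a batch estimate over all data.
rng = np.random.default_rng(1)
a = rng.normal(size=(100, 2))                  # "old" trajectory data
b = rng.normal(loc=2.0, size=(50, 2))          # "new" trajectory data
n0, m0, c0 = len(a), a.mean(axis=0), np.cov(a, rowvar=False, bias=True)
n, m, c = incremental_gaussian_update(n0, m0, c0, b)
batch = np.concatenate([a, b])
```

The design choice this highlights: only the counts, means, and scatter matrices of the old model are needed, so the original demonstrations can be discarded.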
Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning
Intrinsically motivated spontaneous exploration is a key enabler of
autonomous lifelong learning in human children. It enables the discovery and
acquisition of large repertoires of skills through self-generation,
self-selection, self-ordering and self-experimentation of learning goals. We
present an algorithmic approach called Intrinsically Motivated Goal Exploration
Processes (IMGEP) to enable similar properties of autonomous or self-supervised
learning in machines. The IMGEP algorithmic architecture relies on several
principles: 1) self-generation of goals, generalized as fitness functions; 2)
selection of goals based on intrinsic rewards; 3) exploration with incremental
goal-parameterized policy search and exploitation of the gathered data with a
batch learning algorithm; 4) systematic reuse of information acquired when
targeting a goal for improving towards other goals. We present a particularly
efficient form of IMGEP, called Modular Population-Based IMGEP, that uses a
population-based policy and an object-centered modularity in goals and
mutations. We provide several implementations of this architecture and
demonstrate their ability to automatically generate a learning curriculum
within several experimental setups including a real humanoid robot that can
explore multiple spaces of goals with several hundred continuous dimensions.
While no particular target goal is provided to the system, this curriculum
allows the discovery of skills that act as stepping stone for learning more
complex skills, e.g. nested tool use. We show that learning diverse spaces of
goals with intrinsic motivations is more efficient for learning complex skills
than only trying to directly learn these complex skills.
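The four IMGEP principles listed above can be sketched as a tiny exploration loop. Everything here is a placeholder far simpler than the paper's modular, population-based architecture: the environment, the one-dimensional policy space, and uniform goal selection (standing in for intrinsic-reward-based selection) are all made up for illustration.

```python
import random

# Toy sketch of the IMGEP loop: self-generated goals, goal-conditioned policy
# perturbation, and systematic reuse of every rollout for all goals.

def environment(policy):
    # Placeholder dynamics: the outcome is a noisy function of the policy.
    return policy * 2.0 + random.uniform(-0.1, 0.1)

def explore(goals, iterations=2000, seed=0):
    random.seed(seed)
    # Memory: for each goal, the best (policy, error-to-goal) found so far.
    best = {g: (random.uniform(-1, 1), float("inf")) for g in goals}
    for _ in range(iterations):
        g = random.choice(goals)                    # goal selection (uniform here)
        policy = best[g][0] + random.gauss(0, 0.2)  # perturb best-known policy
        outcome = environment(policy)
        # Principle 4 -- systematic reuse: score this rollout against the
        # fitness function of *every* goal, not just the one targeted.
        for h in goals:
            err = abs(outcome - h)
            if err < best[h][1]:
                best[h] = (policy, err)
    return best

best = explore(goals=[-2.0, 0.0, 3.0])
```

Even in this toy, rollouts aimed at one goal frequently improve the memory for other goals, which is the mechanism behind the curriculum effect described in the abstract.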
Model Learning for Look-ahead Exploration in Continuous Control
We propose an exploration method that incorporates look-ahead search over basic learnt skills and their dynamics, and use it for reinforcement learning (RL) of manipulation policies. Our skills are multi-goal policies learned in isolation in simpler environments using existing multi-goal RL formulations, analogous to options or macro-actions. Coarse skill dynamics, i.e., the state transition caused by a (complete) skill execution, are learnt and are unrolled forward during look-ahead search. Policy search benefits from temporal abstraction during exploration, yet itself operates over low-level primitive actions, so the resulting policies do not suffer from the suboptimality and inflexibility caused by coarse skill chaining. We show that the proposed exploration strategy results in effective learning of complex manipulation policies faster than current state-of-the-art RL methods, and converges to better policies than methods that use options or parameterized skills as building blocks of the policy itself, as opposed to guiding exploration.
Comment: This is a pre-print of our paper which is accepted in AAAI 201
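The look-ahead idea, unrolling learned coarse skill-transition models to find a promising skill sequence, can be sketched as follows. The skill models here are hand-written stand-ins for learned dynamics, and the state is a single number; in the paper, such plans guide low-level exploration rather than serving as the final policy.

```python
from itertools import product

# Minimal sketch: each skill has a coarse one-step transition model f_k(s)
# (the predicted state after a complete skill execution). We exhaustively
# unroll all skill sequences up to a fixed horizon and keep the one whose
# predicted final state is closest to the goal.

skill_models = {                     # hypothetical "learned" models f_k(s)
    "push_left":  lambda s: s - 1.0,
    "push_right": lambda s: s + 1.0,
    "lift":       lambda s: s + 0.25,
}

def lookahead_plan(state, goal, horizon=3):
    """Search all skill sequences of length `horizon`; return best plan."""
    best_seq, best_dist = (), abs(state - goal)
    for seq in product(skill_models, repeat=horizon):
        s = state
        for name in seq:
            s = skill_models[name](s)   # unroll coarse dynamics forward
        if abs(s - goal) < best_dist:
            best_seq, best_dist = seq, abs(s - goal)
    return best_seq, best_dist

seq, dist = lookahead_plan(state=0.0, goal=2.25, horizon=3)
```

Exhaustive search is only feasible for tiny skill sets and horizons; the point is that planning happens over a handful of coarse transitions, not over low-level actions.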
Learning Task Priorities from Demonstrations
Bimanual operations in humanoids offer the possibility to carry out more than
one manipulation task at the same time, which in turn introduces the problem of
task prioritization. We address this problem from a learning from demonstration
perspective, by extending the Task-Parameterized Gaussian Mixture Model
(TP-GMM) to Jacobian and null space structures. The proposed approach is tested
on bimanual skills but can be applied in any scenario where the prioritization
between potentially conflicting tasks needs to be learned. We evaluate the
proposed framework in: two different tasks with humanoids requiring the
learning of priorities and a loco-manipulation scenario, showing that the
approach can be exploited to learn the prioritization of multiple tasks in
parallel.Comment: Accepted for publication at the IEEE Transactions on Robotic
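The Jacobian and null-space structure mentioned above refers to the standard strict-priority resolution, where a secondary task acts only in the null space of the primary one. Below is a sketch of that resolution for two tasks; the Jacobians and task velocities are made-up numbers, and the learned TP-GMM components of the paper are not modeled here.

```python
import numpy as np

# Sketch of strict two-task prioritization via null-space projection:
#   dq = J1+ dx1 + N1 (J2 N1)+ (dx2 - J2 J1+ dx1),  N1 = I - J1+ J1
# Task 1 is satisfied exactly; task 2 only insofar as task 1's null space allows.

def prioritized_velocities(J1, dx1, J2, dx2):
    """Joint velocities realizing task 1 exactly, task 2 in its null space."""
    J1_pinv = np.linalg.pinv(J1)
    N1 = np.eye(J1.shape[1]) - J1_pinv @ J1   # null-space projector of task 1
    dq = J1_pinv @ dx1 + N1 @ np.linalg.pinv(J2 @ N1) @ (dx2 - J2 @ J1_pinv @ dx1)
    return dq

# Toy 3-joint example: the two tasks use disjoint joints, so both succeed.
J1 = np.array([[1.0, 0.0, 0.0]])   # task 1 depends on joint 1 only
J2 = np.array([[0.0, 1.0, 0.0]])   # task 2 depends on joint 2 only
dq = prioritized_velocities(J1, np.array([0.5]), J2, np.array([-0.3]))
```

When the tasks conflict (overlapping Jacobians), the same formula sacrifices task 2, which is exactly the behavior whose ordering the paper proposes to learn from demonstrations.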
Incremental Bootstrapping of Parameterized Motor Skills
Queißer J, Reinhart F, Steil JJ. Incremental Bootstrapping of Parameterized Motor Skills. In: Proc. IEEE Humanoids. IEEE; 2016.
Many motor skills have an intrinsic, low-dimensional parameterization,
e.g. reaching through a grid to different targets. Repeated policy search
for new parameterizations of such a skill is inefficient, because the structure
of the skill variability is not exploited.
This issue has been previously addressed by learning mappings from task
parameters to policy parameters. In this work, we introduce a bootstrapping
technique that establishes such parameterized skills incrementally.
The approach combines iterative learning with state-of-the-art
black-box policy optimization. We investigate the benefits of
incrementally learning parameterized skills for efficient policy
retrieval and show that the number of required rollouts can be
significantly reduced when optimizing policies for novel tasks.
The approach is demonstrated for several parameterized motor
tasks including upper-body reaching motion generation for the
humanoid robot COMAN.
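The bootstrapping idea, using a memory of solved tasks to warm-start black-box policy optimization for a new task parameter, can be sketched as follows. The cost function, the one-dimensional policy space, and the hill climber (standing in for CMA-ES-style optimizers) are all toy stand-ins; counting rollouts illustrates the claimed reduction.

```python
import random

# Toy sketch of incremental bootstrapping: a parameterized-skill memory
# predicts an initial policy for a new task, and black-box search refines it.

def cost(policy, task):
    return (policy - 3.0 * task) ** 2   # hidden optimum: policy* = 3 * task

def hill_climb(init, task, tol=1e-3, sigma=0.3, seed=0):
    """Crude black-box optimizer; returns (best policy, rollouts used)."""
    random.seed(seed)
    best, rollouts = init, 0
    while cost(best, task) > tol:
        cand = best + random.gauss(0, sigma)
        rollouts += 1
        if cost(cand, task) < cost(best, task):
            best = cand
    return best, rollouts

memory = [(t, 3.0 * t) for t in (0.0, 0.5, 1.0)]   # previously solved tasks

def warm_start(task):
    # Predict the initial policy from the two nearest solved tasks.
    (t0, p0), (t1, p1) = sorted(memory, key=lambda tp: abs(tp[0] - task))[:2]
    return p0 + (p1 - p0) * (task - t0) / (t1 - t0)

task = 0.8
_, cold_rollouts = hill_climb(0.0, task)            # no skill memory
_, warm_rollouts = hill_climb(warm_start(task), task)  # bootstrapped init
```

Because the toy optimum happens to vary linearly with the task parameter, the warm start lands near the solution and the optimizer needs far fewer rollouts, which is the effect the abstract reports for novel tasks.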