Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics

Author: Chatzilygeroudis Konstantinos
Mouret Jean-Baptiste
Publication date: 13/03/2018

The most data-efficient algorithms for reinforcement learning in robotics are model-based policy search algorithms, which alternate between learning a dynamical model of the robot and optimizing a policy to maximize the expected return given the model and its uncertainties. Among the few proposed approaches, the recently introduced Black-DROPS algorithm exploits a black-box optimization algorithm to achieve both high data-efficiency and good computation times when several cores are used; nevertheless, like all model-based policy search approaches, Black-DROPS does not scale to high-dimensional state/action spaces. In this paper, we introduce a new model learning procedure in Black-DROPS that leverages parameterized black-box priors to (1) scale up to high-dimensional systems, and (2) be robust to large inaccuracies of the prior information. We demonstrate the effectiveness of our approach with the "pendubot" swing-up task in simulation and with a physical hexapod robot (48D state space, 18D action space) that has to walk forward as fast as possible. The results show that our new algorithm is more data-efficient than previous model-based policy search algorithms (with and without priors) and that it can allow a physical 6-legged robot to learn new gaits in only 16 to 30 seconds of interaction time. Comment: Accepted at ICRA 2018; 8 pages, 4 figures, 2 algorithms, 1 table; Video at https://youtu.be/HFkZkhGGzTo ; Spotlight ICRA presentation at https://youtu.be/_MZYDhfWeL

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server
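
The alternation described in the abstract above (learn a model, then optimize a policy on it) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the 1D system, the constants `K_TRUE`, `TARGET` and `HORIZON`, the least-squares fit standing in for the paper's model learning, and plain random search standing in for its black-box optimizer. It is not the Black-DROPS implementation.

```python
import random

K_TRUE = 0.5   # hidden gain of the toy "real" system (hypothetical value)
TARGET = 1.0   # state the policy should reach
HORIZON = 10   # steps per episode


def real_step(x, u):
    """Toy real system: the learner never reads K_TRUE directly."""
    return x + K_TRUE * u


def prior_step(x, u, k):
    """Parameterized prior: right structure, unknown gain k."""
    return x + k * u


def rollout(step, gain):
    """Run the policy u = gain * (TARGET - x); return cost and transitions."""
    x, cost, data = 0.0, 0.0, []
    for _ in range(HORIZON):
        u = gain * (TARGET - x)
        x_next = step(x, u)
        data.append((x, u, x_next))
        cost += (TARGET - x_next) ** 2
        x = x_next
    return cost, data


def fit_prior(data):
    """Least-squares estimate of k in (x_next - x) = k * u."""
    num = sum(u * (x_next - x) for x, u, x_next in data)
    den = sum(u * u for _, u, _ in data)
    return num / den if den > 0 else 1.0


def optimize_policy(k, rng, samples=200):
    """Random search over gains, evaluated only on the learned model."""
    def model(x, u):
        return prior_step(x, u, k)

    candidates = [rng.uniform(0.0, 4.0) for _ in range(samples)]
    return min(candidates, key=lambda g: rollout(model, g)[0])


def learn(episodes=3, seed=0):
    rng = random.Random(seed)
    k, gain = 1.0, 0.2          # k = 1.0 is a deliberately inaccurate prior
    data, costs = [], []
    for _ in range(episodes):
        cost, transitions = rollout(real_step, gain)  # act on the real system
        costs.append(cost)
        data += transitions
        k = fit_prior(data)                # model learning step
        gain = optimize_policy(k, rng)     # policy search on the model
    return k, gain, costs


if __name__ == "__main__":
    k, gain, costs = learn()
    print("estimated k:", k, "gain:", gain, "episode costs:", costs)
```

In this toy setting the prior parameter is recovered after the first episode, so the second episode already runs a much better policy; the real problem is far harder because the prior's structure is itself inaccurate.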

Quantifying the Evolutionary Self Structuring of Embodied Cognitive Networks

Author: Bonsignorio Fabio
Publication venue: 'MIT Press - Journals'
Publication date: 07/12/2012

We outline a possible theoretical framework for the quantitative modeling of networked embodied cognitive systems. We note that (1) information self-structuring through sensory-motor coordination does not deterministically occur in R^n, a generic multivariable vector space, but in SE(3), the group of possible motions of a body in space; and (2) it happens in a stochastic, open-ended environment. These observations may simplify, at the price of a certain abstraction, the modeling and design of self-organization processes based on the maximization of informational measures such as mutual information. Furthermore, by providing closed-form or computationally lighter algorithms, this may significantly reduce the computational burden of their implementation. We propose a modeling framework that aims to provide new tools for the design of networks of artificial self-organizing, embodied, and intelligent agents and for the reverse engineering of natural ones. At this point, it is very much a theoretical conjecture, and it has yet to be experimentally verified whether this model will be useful in practice.

arXiv.org e-Print Archive

CiteSeerX
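
The abstract above refers to maximizing informational measures such as mutual information. As a point of reference only, the following minimal sketch estimates mutual information between two discrete variables from samples with the standard plug-in estimator. The function name, the sample data, and the discrete setting are assumptions for illustration; the paper's framework works on SE(3) and is not reproduced here.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of I(X;Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical sensor/motor samples: fully coupled vs. independent
coupled = [(0, 0), (1, 1)] * 50
independent = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25

if __name__ == "__main__":
    print(mutual_information(coupled))
    print(mutual_information(independent))
```

A sensor that mirrors a binary motor command carries one bit about it, while statistically independent channels carry none; a self-organizing process of the kind described would seek configurations closer to the former.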

Interactive Co-Design of Form and Function for Legged Robots using the Adjoint Method

Author: Coros Stelian
Desai Ruta
Li Beichen
Yuan Ye
Publication date: 15/04/2018

Our goal is to make robotics more accessible to casual users by reducing the domain knowledge required to design and build robots. Towards this goal, we present an interactive computational design system that enables users to design legged robots with desired morphologies and behaviors by specifying higher-level descriptions. The core of our method is a design optimization technique that reasons about the structure and motion of a robot in a coupled manner in order to achieve user-specified robot behavior and performance. We are inspired by recent works that also aim to jointly optimize a robot's form and function. However, through efficient computation of the necessary design changes, our approach allows us to keep the user in the loop for interactive applications. We evaluate our system in simulation by automatically improving robot designs for multiple scenarios. Starting with initial user designs that are physically infeasible or inadequate to perform the user-desired task, we show optimized designs that achieve the user specifications, all while ensuring an interactive design flow. Comment: 8 pages; added link of the accompanying vide

arXiv.org e-Print Archive

Repositorio Institucional de la Universidad Tecnológica de Panamá

Portal de Revistas Académicas UTP (Universidad Tecnológica de Panamá)

Reset-free Trial-and-Error Learning for Robot Damage Recovery

Author: Baranes
Blanke
Bongard
Browne
Calandra
Carlson
Corbato
Cully
DeDonato
Deisenroth
Droniou
Durrant-Whyte
Guizzo
Hester
Isermann
Jean-Baptiste Mouret
Kavraki
Kober
Konstantinos Chatzilygeroudis
Koos
LaValle
Lengagne
Mnih
Mostafa
Mouret
Nguyen
Nguyen-Tuong
Nori
Peters
Pugh
Quiñonero-Candela
Rasmussen
Ren
Shahriari
Silver
Stulp
Sutton
Vassilis Vassiliades
Verma
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

The high probability of hardware failures prevents many advanced robots (e.g., legged robots) from being confidently deployed in real-world situations (e.g., post-disaster rescue). Instead of attempting to diagnose the failures, robots could adapt by trial-and-error in order to be able to complete their tasks. In this situation, damage recovery can be seen as a Reinforcement Learning (RL) problem. However, the best RL algorithms for robotics require the robot and the environment to be reset to an initial state after each episode; that is, the robot is not learning autonomously. In addition, most RL methods for robotics do not scale well to complex robots (e.g., walking robots) and either cannot be used at all or take too long to converge to a solution (e.g., hours of learning). In this paper, we introduce a novel learning algorithm called "Reset-free Trial-and-Error" (RTE) that (1) breaks the complexity by pre-generating hundreds of possible behaviors with a dynamics simulator of the intact robot, and (2) allows complex robots to quickly recover from damage while completing their tasks and taking the environment into account. We evaluate our algorithm on a simulated wheeled robot, a simulated six-legged robot, and a real six-legged walking robot that are damaged in several ways (e.g., a missing leg, a shortened leg, or a faulty motor) and whose objective is to reach a sequence of targets in an arena. Our experiments show that the robots can recover most of their locomotion abilities in an environment with obstacles, and without any human intervention. Comment: 18 pages, 16 figures, 3 tables, 6 pseudocodes/algorithms, video at https://youtu.be/IqtyHFrb3BU, code at https://github.com/resibots/chatzilygeroudis_2018_rt

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Evolution of central pattern generators for the control of a five-link bipedal walking mechanism

Author: A. J. Ijspeert
A. Prochazka
A. Sano
B. Adams
C. Chevallereau
E. Marder
F. Delcomyn
G. L. Liu
G. Taga
J. A. Ellis
J. Pratt
J. Ruiz-del-Solar
K. Matsuoka
M. A. Lewis
M. H. Raibert
M. L. Swinson
P. W. Latham
R. Heliot
R. J. Peterka
S. Aoi
S. H. Collins
S. Rossignol
T. Geng
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2012

Central pattern generators (CPGs), with a basis in neurophysiological studies, are a type of neural network for the generation of rhythmic motion. While CPGs are increasingly used in robot control, most applications are hand-tuned for a specific task, and it is acknowledged in the field that generic methods and design principles for creating individual networks for a given task are lacking. This study presents an approach in which the connectivity and oscillatory parameters of a CPG network are determined by an evolutionary algorithm with fitness evaluations in a realistic simulation with accurate physics. We apply this technique to a five-link planar walking mechanism to demonstrate its feasibility and performance. In addition, to see whether results from simulation can be acceptably transferred to real robot hardware, the best evolved CPG network is also tested on a real mechanism. Our results also confirm that the biologically inspired CPG model is well suited for legged locomotion, since a diverse range of networks has been observed to succeed in fitness simulations during evolution. Comment: 11 pages, 9 figures; substantial revision of content, organization, and quantitative result

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Chalmers Research
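
The idea in the abstract above, tuning oscillatory parameters with an evolutionary algorithm, can be sketched in a heavily simplified form. The assumptions are substantial: a single sine oscillator stands in for a CPG network, a hypothetical target rhythm replaces the physics-based walking simulation as the fitness function, and the loop is a basic elitist mutation scheme, not the algorithm used in the paper.

```python
import math
import random


def cpg_output(params, t):
    """Single sine oscillator: params = (amplitude, frequency in Hz)."""
    amplitude, frequency = params
    return amplitude * math.sin(2.0 * math.pi * frequency * t)


def fitness(params, target=(0.6, 1.5), steps=100, dt=0.01):
    """Negative mean squared error against a hypothetical target rhythm."""
    err = 0.0
    for i in range(steps):
        t = i * dt
        err += (cpg_output(params, t) - cpg_output(target, t)) ** 2
    return -err / steps


def evolve(generations=200, offspring=8, sigma=0.1, seed=1):
    """Elitist loop: a mutated child replaces the parent only if it is fitter."""
    rng = random.Random(seed)
    parent = (1.0, 1.0)
    best = fitness(parent)
    history = [best]
    for _ in range(generations):
        for _ in range(offspring):
            child = tuple(p + rng.gauss(0.0, sigma) for p in parent)
            f = fitness(child)
            if f > best:
                parent, best = child, f
        history.append(best)   # best fitness so far, never decreases
    return parent, history


if __name__ == "__main__":
    params, history = evolve()
    print("evolved (amplitude, frequency):", params)
    print("fitness: %.6f -> %.6f" % (history[0], history[-1]))
```

Because selection is elitist, the recorded best fitness cannot decrease from one generation to the next; in the setting of the paper the fitness call would instead run a full walking simulation, which dominates the computational cost.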