Synergy-based policy improvement with path integrals for anthropomorphic hands
In this work, a synergy-based reinforcement learning algorithm has been developed to confer autonomous grasping capabilities to anthropomorphic hands. In the presence of high degrees of freedom, classical machine learning techniques require a number of iterations that increases with the size of the problem, so convergence of the solution is not ensured. The use of postural synergies reduces the dimensionality of the search space and allows recent learning techniques, such as Policy Improvement with Path Integrals, to become easily applicable. A key point is the adoption of a suitable reward function that represents the goal of the task and ensures one-step performance evaluation. The force-closure quality of the grasp in the synergy subspace has been chosen as the cost function for performance evaluation. Experiments conducted on the SCHUNK 5-Finger Hand demonstrate the effectiveness of the algorithm, showing skills comparable to human capabilities in learning new grasps and in performing a wide variety of grasps, from power grasps to high-precision grasps of very small objects.
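As a rough illustration of the dimensionality reduction this abstract describes, the sketch below extracts a postural-synergy subspace from recorded hand postures via PCA and maps a low-dimensional activation vector back to full joint angles. The data, dimensions, and function names are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of postural-synergy extraction, assuming a dataset of
# recorded hand postures (joint-angle vectors). All sizes are illustrative.
import numpy as np

def extract_synergies(postures, n_synergies):
    """PCA on a (n_samples, n_joints) matrix of hand postures.

    Returns the mean posture and the first n_synergies principal
    directions, which span the synergy subspace."""
    mean = postures.mean(axis=0)
    centered = postures - mean
    # Right singular vectors of the centered data give the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_synergies]

def synthesize_posture(mean, synergies, activations):
    """Map a low-dimensional activation vector back to joint angles."""
    return mean + activations @ synergies

# Example: a 20-joint hand reduced to a 3-dimensional search space,
# so a learner like PI2 only has to explore 3 parameters per posture.
rng = np.random.default_rng(0)
postures = rng.normal(size=(500, 20))          # stand-in for grasp data
mean, synergies = extract_synergies(postures, n_synergies=3)
posture = synthesize_posture(mean, synergies, np.array([0.5, -0.2, 0.1]))
print(posture.shape)                           # (20,) full joint vector
```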
Proximodistal Exploration in Motor Learning as an Emergent Property of Optimization
To harness the complexity of their high-dimensional bodies during sensorimotor development, infants are guided by patterns of freezing and freeing of degrees of freedom. For instance, when learning to reach, infants free the degrees of freedom in their arm proximodistally, i.e. from joints that are closer to the body to those that are more distant. Here, we formulate and study computationally the hypothesis that such patterns can emerge spontaneously as the result of a family of stochastic optimization processes (evolution strategies with covariance-matrix adaptation), without an innate encoding of a maturational schedule. In particular, we present simulated experiments with an arm where a computational learner progressively acquires reaching skills through adaptive exploration, and we show that a proximodistal organization appears spontaneously, which we denote PDFF (ProximoDistal Freezing and Freeing of degrees of freedom). We also compare this emergent organization across different arm morphologies, from human-like to quite unnatural ones, to study the effect of different kinematic structures on the emergence of PDFF. Research highlights: • We propose a general, domain-independent hypothesis for the developmental organization of freezing and freeing of degrees of freedom observed both in infant development and adult skill acquisition, such as proximodistal exploration in learning to reach.
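A minimal sketch of the class of optimizers studied here: an evolution strategy with a simple covariance-matrix adaptation rule, applied to a toy planar reaching cost. The arm model, population sizes, and update constants are illustrative assumptions, not the paper's actual setup.

```python
# Toy evolution strategy with covariance adaptation on a planar reaching task.
import numpy as np

def reaching_cost(joint_angles, target=np.array([0.8, 0.4])):
    """Distance from a planar arm's end effector to a target point.
    Equal link lengths summing to 1 are an assumption for this example."""
    lengths = np.ones_like(joint_angles) / len(joint_angles)
    angles = np.cumsum(joint_angles)
    tip = np.array([np.sum(lengths * np.cos(angles)),
                    np.sum(lengths * np.sin(angles))])
    return np.linalg.norm(tip - target)

def simple_cma_es(cost, dim, iters=200, pop=20, elite=5, seed=0):
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    cov = np.eye(dim) * 0.1
    for _ in range(iters):
        samples = rng.multivariate_normal(mean, cov, size=pop)
        costs = np.array([cost(s) for s in samples])
        best = samples[np.argsort(costs)[:elite]]
        mean = best.mean(axis=0)
        # Rank-mu-style covariance update from the elite samples.
        centered = best - mean
        cov = 0.8 * cov + 0.2 * (centered.T @ centered) / elite
    return mean, cov

mean, cov = simple_cma_es(reaching_cost, dim=4)
print("final cost:", reaching_cost(mean))
print("per-joint exploration:", np.sqrt(np.diag(cov)))
```

The diagonal of the adapted covariance indicates how strongly each joint is being explored over time, which is the kind of signal one would inspect for a proximodistal freezing-and-freeing pattern.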
Bio-inspired control of a robotic hand for cyclic manipulation tasks (Controllo bioispirato di una mano robotica per compiti di manipolazione ciclica)
No abstract available.
Reinforcement learning to improve 4-finger-gripper manipulation
In the framework of robotics, Reinforcement Learning (RL) deals with the learning of a task by the robot itself. This work focuses on a recently developed method, Policy Improvement with Path Integrals (PI2), applied to a 4-finger-gripper manipulator performing the task of rotating a ball around a desired axis. The scope of the thesis is to design an experiment in which the algorithm receives feedback on the robot's performance. The algorithm has also been adapted to cope with periodic movements parametrized as motor primitives. Furthermore, due to the high dimensionality of the problem, certain assumptions have been made in order to limit the state space to a reliable subset of it. The obtained results illustrate the good performance of the algorithm, as the robot is able to perform the task while focusing on the important aspects previously set by the user, both in simulation and on the real robot. The main bottleneck of the thesis has been the speed of both software and hardware, as much time was required to perform long-run experiments, particularly in the implementation on the robot, where manual supervision was needed.
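To make "periodic movements parametrized as motor primitives" concrete, here is a minimal sketch of a rhythmic primitive: one joint's command expressed as a normalized, weighted sum of periodic basis functions over a cyclic phase variable. The basis shape, count, and weights are assumptions for illustration, not the thesis's actual parametrization.

```python
# Minimal sketch of a rhythmic motor primitive over a cyclic phase.
import numpy as np

def periodic_basis(phase, n_basis=10, width=10.0):
    """Evaluate n_basis von-Mises-like bumps spaced uniformly on the cycle."""
    centers = np.linspace(0, 2 * np.pi, n_basis, endpoint=False)
    return np.exp(width * (np.cos(phase - centers) - 1.0))

def primitive(phase, weights):
    """Policy output for one joint at a given phase; a learner such as
    PI2 would adjust `weights` from rollout costs."""
    psi = periodic_basis(phase, n_basis=len(weights))
    return psi @ weights / psi.sum()

weights = np.random.default_rng(1).normal(size=10)   # parameters to learn
trajectory = [primitive(p, weights) for p in np.linspace(0, 2 * np.pi, 100)]
```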
Adaptive exploration for continual reinforcement learning
Most experiments on policy search for robotics focus on isolated tasks, where the experiment is split into two distinct phases: 1) the learning phase, where the robot learns the task through exploration; 2) the exploitation phase, where exploration is turned off and the robot demonstrates its performance on the task it has learned. In this paper, we present an algorithm that enables robots to continually and autonomously alternate between these phases. We do so by combining the 'Policy Improvement with Path Integrals' direct reinforcement learning algorithm with the covariance matrix adaptation rule from the 'Cross-Entropy Method' optimization algorithm. This integration is possible because both algorithms iteratively update parameters with probability-weighted averaging. A practical advantage of the novel algorithm, called PI2-CMA, is that it alleviates the user from having to manually tune the degree of exploration. We evaluate PI2-CMA's ability to continually and autonomously tune exploration on two tasks.
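The shared "probability-weighted averaging" structure that makes this combination possible can be sketched as follows: sampled rollouts are weighted by (exponentiated, normalized) cost, and both the parameter mean and the exploration covariance are re-estimated from the same weighted samples. The temperature, rollout count, and update details below are illustrative assumptions, not the published PI2-CMA update.

```python
# Sketch of a cost-weighted-averaging update in the spirit of PI2-CMA.
import numpy as np

def pi2_cma_update(theta, cov, cost, n_rollouts=15, temperature=10.0, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    samples = rng.multivariate_normal(theta, cov, size=n_rollouts)
    costs = np.array([cost(s) for s in samples])
    # Softmax-style weights over normalized costs: low-cost rollouts dominate.
    z = (costs - costs.min()) / max(np.ptp(costs), 1e-12)
    weights = np.exp(-temperature * z)
    weights /= weights.sum()
    new_theta = weights @ samples              # PI2-style parameter update
    centered = samples - theta
    # CMA/CEM-style covariance adaptation: exploration magnitude is
    # re-estimated from the same weighted rollouts, so it needs no manual
    # tuning. A small ridge keeps the matrix positive definite.
    new_cov = (weights[:, None] * centered).T @ centered \
        + 1e-9 * np.eye(len(theta))
    return new_theta, new_cov

# Toy usage: minimize a quadratic in 3 parameters.
rng = np.random.default_rng(0)
theta, cov = np.zeros(3), np.eye(3)
for _ in range(50):
    theta, cov = pi2_cma_update(theta, cov, lambda t: float(t @ t), rng=rng)
print(theta)  # approaches the optimum at the origin
```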