Evolving a Behavioral Repertoire for a Walking Robot
Numerous algorithms have been proposed to allow legged robots to learn to
walk. However, the vast majority of these algorithms are devised to learn to
walk in a straight line, which is not sufficient to accomplish any real-world
mission. Here we introduce the Transferability-based Behavioral Repertoire
Evolution algorithm (TBR-Evolution), a novel evolutionary algorithm that
simultaneously discovers several hundreds of simple walking controllers, one
for each possible direction. By taking advantage of solutions that are usually
discarded by evolutionary processes, TBR-Evolution is substantially faster than
independently evolving each controller. Our technique relies on two methods:
(1) novelty search with local competition, which searches for both
high-performing and diverse solutions, and (2) the transferability approach,
which combines simulations and real tests to evolve controllers for a physical
robot. We evaluate this new technique on a hexapod robot. Results show that
with only a few dozen short experiments performed on the robot, the algorithm
learns a repertoire of controllers that allows the robot to reach every point
in its reachable space. Overall, TBR-Evolution opens a new kind of learning
algorithm that simultaneously optimizes all the achievable behaviors of a
robot.
Comment: 33 pages; Evolutionary Computation Journal 201
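The first of the two methods above, novelty search with local competition, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `behaviors` stands for the behaviour descriptors of a population (for instance, the endpoint each candidate gait reaches), and each individual is scored on how far it is from its nearest neighbours in behaviour space and on how many of those neighbours it outperforms.

```python
import math

def novelty_and_local_competition(behaviors, fitnesses, k=3):
    """For each individual, compute (novelty, local_quality):
    novelty = mean distance to its k nearest neighbours in behaviour space,
    local_quality = fraction of those neighbours it outperforms in fitness."""
    scores = []
    for i, b in enumerate(behaviors):
        # Distances to every other individual, in behaviour space
        dists = sorted(
            (math.dist(b, other), j)
            for j, other in enumerate(behaviors) if j != i
        )
        nearest = dists[:k]
        novelty = sum(d for d, _ in nearest) / len(nearest)
        beaten = sum(1 for _, j in nearest if fitnesses[i] > fitnesses[j])
        scores.append((novelty, beaten / len(nearest)))
    return scores
```

Both scores are then used as separate objectives, so selection favours solutions that are simultaneously different from their neighbours and locally high-performing.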
Behavioral repertoire learning in robotics
Antoine Cully, ISIR, Université Pierre et Marie Curie-Paris 6, CNRS UMR 7222, 4 place Jussieu, F-75252, Paris Cedex 05, France, [email protected]
Jean-Baptiste Mouret, ISIR, Université Pierre et Marie Curie-Paris 6, CNRS UMR 7222, 4 place Jussieu, F-75252, Paris Cedex 05, France, [email protected]
Learning in robotics typically involves choosing a simple goal (e.g. walking) and assessing the performance of each controller with regard to this task (e.g. walking speed). However, learning advanced, input-driven controllers (e.g. walking in each direction) requires testing each controller on a large sample of the possible input signals. This costly process makes it difficult to learn useful low-level controllers in robotics. Here we introduce BR-Evolution, a new evolutionary learning technique that generates a behavioral repertoire by taking advantage of the candidate solutions that are usually discarded. Instead of evolving a single, general controller, BR-Evolution thus evolves a collection of simple controllers, one for each variant of the target behavior; to distinguish similar controllers, it uses a performance objective that allows it to produce a collection of diverse but high-performing behaviors. We evaluated this new technique by evolving gait controllers for a simulated hexapod robot. Results show that a single run of the EA quickly finds a collection of controllers that allows the robot to reach each point of the reachable space. Overall, BR-Evolution opens a new kind of learning algorithm that simultaneously optimizes all the achievable behaviors of a robot.
Using Centroidal Voronoi Tessellations to Scale Up the Multi-dimensional Archive of Phenotypic Elites Algorithm
The recently introduced Multi-dimensional Archive of Phenotypic Elites
(MAP-Elites) is an evolutionary algorithm capable of producing a large archive
of diverse, high-performing solutions in a single run. It works by discretizing
a continuous feature space into unique regions according to the desired
discretization per dimension. While simple, this algorithm has a main drawback:
it cannot scale to high-dimensional feature spaces since the number of regions
increases exponentially with the number of dimensions. In this paper, we address
this limitation by introducing a simple extension of MAP-Elites that has a
constant, pre-defined number of regions irrespective of the dimensionality of
the feature space. Our main insight is that methods from computational geometry
could partition a high-dimensional space into well-spread geometric regions. In
particular, our algorithm uses a centroidal Voronoi tessellation (CVT) to
divide the feature space into a desired number of regions; it then places every
generated individual in its closest region, replacing a less fit one if the
region is already occupied. We demonstrate the effectiveness of the new
"CVT-MAP-Elites" algorithm in high-dimensional feature spaces through
comparisons against MAP-Elites in maze navigation and hexapod locomotion tasks.
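The core loop described above can be sketched as follows. This is a minimal illustration under stated assumptions (a unit-cube feature space and a stdlib-only k-means), not the paper's implementation: the CVT is approximated with Lloyd's algorithm on uniform random samples, and each generated individual is stored in the Voronoi cell of its closest centroid, replacing a less fit occupant.

```python
import math
import random

def cvt_centroids(n_regions, dim, n_samples=10000, iters=20, seed=0):
    """Approximate a centroidal Voronoi tessellation of [0,1]^dim with
    k-means (Lloyd's algorithm) over uniform random samples."""
    rng = random.Random(seed)
    samples = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    centroids = rng.sample(samples, n_regions)
    for _ in range(iters):
        bins = [[] for _ in range(n_regions)]
        for s in samples:
            idx = min(range(n_regions),
                      key=lambda c: math.dist(s, centroids[c]))
            bins[idx].append(s)
        for c, members in enumerate(bins):
            if members:  # move each centroid to the mean of its samples
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return centroids

def try_add(archive, centroids, genome, descriptor, fitness):
    """Place an individual in its closest region; keep the fitter occupant."""
    cell = min(range(len(centroids)),
               key=lambda c: math.dist(descriptor, centroids[c]))
    if cell not in archive or fitness > archive[cell][1]:
        archive[cell] = (genome, fitness)
```

Because the number of centroids is fixed in advance, the archive size stays constant no matter how many feature dimensions the descriptor has, which is the key difference from the grid used by MAP-Elites.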
Reset-free Trial-and-Error Learning for Robot Damage Recovery
The high probability of hardware failures prevents many advanced robots
(e.g., legged robots) from being confidently deployed in real-world situations
(e.g., post-disaster rescue). Instead of attempting to diagnose the failures,
robots could adapt by trial-and-error in order to be able to complete their
tasks. In this situation, damage recovery can be seen as a Reinforcement
Learning (RL) problem. However, the best RL algorithms for robotics require the
robot and the environment to be reset to an initial state after each episode,
that is, the robot is not learning autonomously. In addition, most of the RL
methods for robotics do not scale well with complex robots (e.g., walking
robots) and either cannot be used at all or take too long to converge to a
solution (e.g., hours of learning). In this paper, we introduce a novel
learning algorithm called "Reset-free Trial-and-Error" (RTE) that (1) breaks
the complexity by pre-generating hundreds of possible behaviors with a dynamics
simulator of the intact robot, and (2) allows complex robots to quickly recover
from damage while completing their tasks and taking the environment into
account. We evaluate our algorithm on a simulated wheeled robot, a simulated
six-legged robot, and a real six-legged walking robot that are damaged in
several ways (e.g., a missing leg, a shortened leg, faulty motor, etc.) and
whose objective is to reach a sequence of targets in an arena. Our experiments
show that the robots can recover most of their locomotion abilities in an
environment with obstacles, and without any human intervention.
Comment: 18 pages, 16 figures, 3 tables, 6 pseudocodes/algorithms, video at
https://youtu.be/IqtyHFrb3BU, code at
https://github.com/resibots/chatzilygeroudis_2018_rt
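A toy sketch of the trial-and-error idea above: at each step, choose the pre-generated behaviour whose predicted outcome (the simulator's prediction plus a learned correction) brings the robot closest to its current target, then refine that correction after observing the real outcome. The function names and the simple moving-average correction are illustrative stand-ins for the model that RTE actually learns on the physical robot.

```python
import math

def pick_behavior(repertoire, position, target, corrections):
    """Choose the pre-generated behaviour whose corrected predicted
    displacement brings the robot closest to the target."""
    def predicted_end(i, delta):
        dx, dy = corrections.get(i, (0.0, 0.0))
        return (position[0] + delta[0] + dx, position[1] + delta[1] + dy)
    return min(repertoire,
               key=lambda i: math.dist(predicted_end(i, repertoire[i]), target))

def update_correction(corrections, i, predicted_delta, observed_delta, lr=0.5):
    """After executing behaviour i on the (possibly damaged) robot, move its
    correction toward the observed prediction error."""
    old = corrections.get(i, (0.0, 0.0))
    err = (observed_delta[0] - predicted_delta[0],
           observed_delta[1] - predicted_delta[1])
    corrections[i] = (old[0] + lr * (err[0] - old[0]),
                      old[1] + lr * (err[1] - old[1]))
```

The crucial property is that no reset is needed: every executed behaviour both makes progress toward the target and provides one more data point about how the damaged robot differs from the intact simulation.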
Evolvability signatures of generative encodings: beyond standard performance benchmarks
Evolutionary robotics is a promising approach to autonomously synthesize
machines with abilities that resemble those of animals, but the field suffers
from a lack of strong foundations. In particular, evolutionary systems are
currently assessed solely by the fitness score their evolved artifacts can
achieve for a specific task, whereas such fitness-based comparisons provide
limited insights about how the same system would perform on different tasks,
and its adaptive capabilities to respond to changes in fitness (e.g., from
damages to the machine, or in new situations). To counter these limitations, we
introduce the concept of "evolvability signatures", which picture the
post-mutation statistical distribution of both behavior diversity (how
different are the robot behaviors after a mutation?) and fitness values (how
different is the fitness after a mutation?). We tested the relevance of this
concept by evolving controllers for hexapod robot locomotion using five
different genotype-to-phenotype mappings (direct encoding, generative encoding
of open-loop and closed-loop central pattern generators, generative encoding of
neural networks, and single-unit pattern generators (SUPG)). We observed a
predictive relationship between the evolvability signature of each encoding and
the number of generations required by hexapods to adapt from incurred damages.
Our study also reveals that, across the five investigated encodings, the SUPG
scheme achieved the best evolvability signature, and was always foremost in
recovering an effective gait following robot damages. Overall, our evolvability
signatures neatly complement existing task-performance benchmarks, and pave the
way for stronger foundations for research in evolutionary robotics.
Comment: 24 pages with 12 figures in the main text, and 4 supplementary
figures. Accepted at Information Sciences journal (in press). Supplemental
videos are available online; see http://goo.gl/uyY1R
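An evolvability signature as described above can be estimated by sampling: mutate one genome many times and record, for each mutant, its behaviour distance and fitness change relative to the parent. The sketch below is illustrative and assumes a scalar behaviour descriptor; `mutate`, `behavior`, and `fitness` are user-supplied callables, not functions from the paper.

```python
import random

def evolvability_signature(genome, mutate, behavior, fitness,
                           n_mutants=200, seed=0):
    """Estimate an evolvability signature: the post-mutation distribution of
    (behaviour distance, fitness change) over many mutants of one genome."""
    rng = random.Random(seed)
    base_b, base_f = behavior(genome), fitness(genome)
    points = []
    for _ in range(n_mutants):
        m = mutate(genome, rng)
        db = abs(behavior(m) - base_b)  # behaviour diversity of this mutant
        df = fitness(m) - base_f        # fitness change of this mutant
        points.append((db, df))
    return points
```

Plotting these points as a 2D distribution gives the "signature": encodings whose mutants combine large behaviour distances with small fitness losses would be expected to adapt faster after damage.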
Evolved embodied phase coordination enables robust quadruped robot locomotion
Overcoming robotics challenges in the real world requires resilient control
systems capable of handling a multitude of environments and unforeseen events.
Evolutionary optimization using simulations is a promising way to automatically
design such control systems, however, if the disparity between simulation and
the real world becomes too large, the optimization process may result in
dysfunctional real-world behaviors. In this paper, we address this challenge by
considering embodied phase coordination in the evolutionary optimization of a
quadruped robot controller based on central pattern generators. With this
method, leg phases, and indirectly also inter-leg coordination, are influenced
by sensor feedback. By comparing two very similar control systems we gain
insight into how the sensory feedback approach affects the evolved parameters
of the control system, and how the performance differs in simulation, in
transferal to the real world, and to different real-world environments. We show
that evolution enables the design of a control system with embodied phase
coordination which is more complex than previously seen approaches, and that
this system is capable of controlling a real-world multi-jointed quadruped
robot. The approach reduces the performance discrepancy between simulation and
the real world, and displays robustness towards new environments.
Comment: 9 page
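One way to picture embodied phase coordination: each leg runs its own phase oscillator, and load sensing (ground contact) modulates that leg's phase advance, so inter-leg coordination can arise through the body rather than through explicit coupling terms. The sketch below is a minimal illustration with an assumed sinusoidal feedback term; it is not the controller evolved in the paper.

```python
import math

def step_cpg(phases, omega, contacts, dt=0.01, gain=2.0):
    """One integration step of per-leg phase oscillators with embodied
    feedback: a loaded leg (ground contact) has its phase advance slowed,
    delaying lift-off while the leg bears weight."""
    new = []
    for theta, in_contact in zip(phases, contacts):
        # Assumed feedback law: contact pulls the phase back via sin(theta)
        feedback = -gain * math.sin(theta) if in_contact else 0.0
        new.append((theta + (omega + feedback) * dt) % (2 * math.pi))
    return new
```

Running this step in a loop with real contact sensors makes the timing of each leg depend on the robot's interaction with the terrain, which is the property the paper exploits to narrow the simulation-to-reality gap.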
An approach to evolve and exploit repertoires of general robot behaviours
Recent works in evolutionary robotics have shown the viability of evolution driven by behavioural novelty and diversity. These evolutionary approaches have been successfully used to generate repertoires of diverse and high-quality behaviours, instead of driving evolution towards a single, task-specific solution. Having repertoires of behaviours can enable new forms of robotic control, in which high-level controllers continually decide which behaviour to execute. To date, however, only the use of repertoires of open-loop locomotion primitives has been studied. We propose EvoRBC-II, an approach that enables the evolution of repertoires composed of general closed-loop behaviours, that can respond to the robot's sensory inputs. The evolved repertoire is then used as a basis to evolve a transparent higher-level controller that decides when and which behaviours of the repertoire to execute. Relying on experiments in a simulated domain, we show that the evolved repertoires are composed of highly diverse and useful behaviours. The same repertoire contains sufficiently diverse behaviours to solve a wide range of tasks, and the EvoRBC-II approach can yield a performance that is comparable to the standard tabula-rasa evolution. EvoRBC-II enables automatic generation of hierarchical control through a two-step evolutionary process, thus opening doors for the further exploration of the advantages that can be brought by hierarchical control.
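The two-level scheme above can be pictured as follows; the function names are hypothetical and this is only a sketch of the hierarchy, not the EvoRBC-II implementation: an evolved high-level controller selects a repertoire index from the sensor readings, and the selected closed-loop behaviour then maps those same readings to actuator commands.

```python
def hierarchical_step(high_level, repertoire, sensors, state):
    """One control step of a two-level controller: the high-level policy
    picks which repertoire behaviour to run, and that closed-loop behaviour
    produces the actuator commands from the current sensor readings."""
    index = high_level(sensors)               # which primitive to run now
    behavior = repertoire[index % len(repertoire)]
    return behavior(sensors, state)           # low-level commands
```

Because both levels see the sensors, the repertoire behaviours can be genuinely closed-loop, which is what distinguishes this setup from switching between open-loop locomotion primitives.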