Fusing novelty and surprise for evolving robot morphologies
Traditional evolutionary algorithms tend to converge to a single
good solution, which can limit their chance of discovering more
diverse and creative outcomes. Divergent search, on the other hand,
aims to counter convergence to local optima by avoiding selection
pressure towards the objective. Forms of divergent search such as
novelty or surprise search have proven to be beneficial for both
the efficiency and the variety of the solutions obtained in deceptive
tasks. Importantly for this paper, early results in maze navigation
have shown that combining novelty and surprise search yields an
even more effective search strategy due to their orthogonal nature.
Motivated by the largely unexplored potential of coupling novelty
and surprise as a search strategy, in this paper we investigate how
fusing the two can affect the evolution of soft robot morphologies.
We test the capacity of the combined search strategy against objective,
novelty, and surprise search, by comparing their efficiency and
robustness, and the variety of robots they evolve. Our key results
demonstrate that novelty-surprise search is generally more efficient
and robust across eight different resolutions. Further, surprise
search explores the space of robot morphologies more broadly than
any other algorithm examined.
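The fusion of novelty and surprise described above can be illustrated with a small sketch. This is not the paper's implementation; the linear weighting `lam`, the k-nearest-neighbour novelty measure, and the use of a model prediction as the surprise baseline are common formulations assumed here for illustration:

```python
import numpy as np

def novelty(behavior, archive, k=3):
    """Novelty: mean distance from a behavior to its k nearest
    neighbors among previously seen behaviors (the archive)."""
    dists = np.sort(np.linalg.norm(archive - behavior, axis=1))
    return dists[:k].mean()

def surprise(behavior, predicted):
    """Surprise: deviation of the observed behavior from what a
    predictive model expected the population to do."""
    return np.linalg.norm(behavior - predicted)

def fused_score(behavior, archive, predicted, lam=0.5):
    """Linear fusion of the two divergent-search signals; lam
    weights novelty against surprise."""
    return lam * novelty(behavior, archive) + (1 - lam) * surprise(behavior, predicted)
```

Because novelty rewards distance from *past* behaviors while surprise rewards distance from *expected* behaviors, the two signals are largely orthogonal, which is what makes their combination an effective selection pressure.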
DisCoRL: Continual reinforcement learning via policy distillation (preprint)
In multi-task reinforcement learning there are two main challenges: at training time, the ability to learn different policies with a single model; at test time, inferring which of those policies to apply without an external signal. In continual reinforcement learning a third challenge arises: learning tasks sequentially without forgetting the previous ones. In this paper, we tackle these challenges by proposing DisCoRL, an approach combining state representation learning and policy distillation. We experiment on a sequence of three simulated 2D navigation tasks with a 3-wheel omni-directional robot. Moreover, we test our approach's robustness by transferring the final policy into a real-life setting. The policy can solve all tasks and automatically infer which one to run.
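Policy distillation, as used above to compress several teacher policies into one student, is typically framed as minimizing a divergence between teacher and student action distributions. The sketch below is illustrative only (it is not the authors' code; the function names and the temperature parameter are assumptions):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert action logits to a probability distribution; lower
    temperatures sharpen the distribution toward its argmax."""
    z = logits / temperature
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) over action distributions: the student
    is trained to reproduce the teacher's (softened) action choices."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))
```

Training the student on this loss for each task's teacher in turn, using states encoded by the learned state representation, is one way the sequential-learning and single-model challenges can both be addressed.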