Evolutionary online behaviour learning and adaptation in real robots
Online evolution of behavioural control on real robots is an open-ended approach to autonomous learning and adaptation: robots have the potential to learn new tasks automatically and to adapt to changes in environmental conditions, or to failures in sensors and/or actuators. However, studies have so far almost exclusively been carried out in simulation, because evolution in real hardware has required several days or weeks to produce capable robots. In this article, we successfully evolve neural network-based controllers in real robotic hardware to solve two single-robot tasks and one collective robotics task. Controllers are evolved either from random solutions or from solutions pre-evolved in simulation. In all cases, capable solutions are found in a timely manner (1 h or less). Results show that more accurate simulations may lead to higher-performing controllers, and that completing the optimization process in real robots is meaningful, even if solutions found in simulation differ from solutions in reality. We furthermore demonstrate for the first time the adaptive capabilities of online evolution in real robotic hardware, including robots able to overcome faults injected in the motors of multiple units simultaneously, and to modify their behaviour in response to changes in the task requirements. We conclude by assessing the contribution of each algorithmic component to the performance of the underlying evolutionary algorithm.
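At its core, online evolution of this kind reduces to a mutate-evaluate-select loop running on the robot itself. The sketch below uses a generic (1+1) scheme with a stand-in fitness function; the article's actual algorithm, controller encoding, and on-robot evaluation procedure are richer than this.

```python
import random

# Minimal (1+1) online evolution loop: a champion controller is mutated,
# the challenger is evaluated, and it replaces the champion only if it
# performs at least as well. The fitness function here is a stand-in for
# a real on-robot evaluation.

def mutate(genome, sigma=0.1, rng=random):
    """Gaussian mutation of a real-valued genome (e.g. neural network weights)."""
    return [w + rng.gauss(0.0, sigma) for w in genome]

def online_evolution(fitness, genome_size=8, evaluations=200, seed=42):
    rng = random.Random(seed)
    champion = [rng.uniform(-1.0, 1.0) for _ in range(genome_size)]
    champion_fit = fitness(champion)
    for _ in range(evaluations):
        challenger = mutate(champion, rng=rng)
        challenger_fit = fitness(challenger)
        if challenger_fit >= champion_fit:  # keep improvements (and neutral moves)
            champion, champion_fit = challenger, challenger_fit
    return champion, champion_fit

# Stand-in deterministic fitness: maximize closeness of all weights to 0.5.
toy_fitness = lambda g: -sum((w - 0.5) ** 2 for w in g)
best, best_fit = online_evolution(toy_fitness)
```

In a real deployment the fitness call is the expensive part: it means letting the robot run the candidate controller for a trial period, which is why the article's one-hour wall-clock result is notable.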
Evolutionary strategies in swarm robotics controllers
Nowadays, Unmanned Vehicles (UVs) are widespread around the world. Most of these vehicles require a high level of human control, and mission success depends on it. It is therefore important to use machine learning techniques to train the robotic controllers and automate control, making the process more efficient.
Evolutionary strategies may be the key to robust and adaptive learning in robotic systems. Many studies involving UV systems and evolutionary strategies have been conducted in recent years; however, research gaps remain, such as the reality gap, which occurs when controllers trained in simulated environments fail to transfer to real robots.
This work proposes an approach for solving robotic tasks using realistic simulation and evolutionary strategies to train controllers. The chosen setup is easily scalable to multi-robot systems or robot swarms.
In this thesis, the simulation architecture and setup are presented, including the drone simulation model and software. The drone model chosen for the simulations is available in the real world and widely used, as are its software and flight control unit; this makes the transition to reality smoother and easier. Controllers based on behavior trees were evolved using a purpose-built evolutionary algorithm, and several experiments were conducted.
Results demonstrate that it is possible to evolve a robotic controller in realistic simulation environments, using a simulated drone model that exists in the real world, together with the same flight control unit and operating system generally used in real-world
experiments.
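The behavior-tree controllers evolved in the work above can be illustrated with a minimal tree implementation. Node semantics follow the usual convention (a Sequence fails on the first failing child; a Fallback succeeds on the first succeeding child); the toy drone task and all names below are illustrative, not taken from the thesis.

```python
# Minimal behavior-tree sketch: Sequence succeeds only if all children
# succeed; Fallback (also called Selector) succeeds as soon as one does.

SUCCESS, FAILURE = "success", "failure"

class Condition:
    def __init__(self, predicate):
        self.predicate = predicate
    def tick(self, blackboard):
        return SUCCESS if self.predicate(blackboard) else FAILURE

class Action:
    def __init__(self, effect):
        self.effect = effect
    def tick(self, blackboard):
        self.effect(blackboard)
        return SUCCESS

class Sequence:
    def __init__(self, children):
        self.children = children
    def tick(self, blackboard):
        for child in self.children:
            if child.tick(blackboard) == FAILURE:
                return FAILURE
        return SUCCESS

class Fallback:
    def __init__(self, children):
        self.children = children
    def tick(self, blackboard):
        for child in self.children:
            if child.tick(blackboard) == SUCCESS:
                return SUCCESS
        return FAILURE

# Toy drone controller: turn away if an obstacle is close, otherwise cruise.
def make_controller():
    return Fallback([
        Sequence([
            Condition(lambda bb: bb["obstacle_distance"] < 1.0),
            Action(lambda bb: bb.__setitem__("command", "turn")),
        ]),
        Action(lambda bb: bb.__setitem__("command", "forward")),
    ])
```

An evolutionary algorithm over such trees typically mutates the tree structure (swapping node types, adding or pruning subtrees) and the leaf parameters, evaluating each candidate tree in simulation.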
The Role of Environmental and Controller Complexity in the Distributed Optimization of Multi-Robot Obstacle Avoidance
The ability to move in complex environments is a fundamental requirement for robots to be a part of our daily lives. Increasing the controller complexity may be a desirable choice in order to obtain improved performance. However, these two aspects may pose a considerable challenge to the optimization of robotic controllers. In this paper, we study the trade-offs between the complexity of reactive controllers and the complexity of the environment in the optimization of multi-robot obstacle avoidance for resource-constrained platforms. The optimization is carried out in simulation using a distributed, noise-resistant implementation of Particle Swarm Optimization, and the resulting controllers are evaluated both in simulation and with real robots. We show that in a simple environment, linear controllers with only two parameters perform similarly to more complex non-linear controllers with up to twenty parameters, even though the latter require more evaluation time to be learned. In a more complicated environment, we show that performance increases when the controllers can differentiate between front and back sensors, but further increasing the number of sensors and adding non-linear activation functions provide no additional benefit. In both environments, augmenting reactive control laws with simple memory capabilities yields the highest increase in performance. We also show that in the complex environment the performance measurements are noisier, the optimal parameter region is smaller, and more iterations are required for the optimization process to converge.
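The linear reactive controllers discussed above can be pictured as Braitenberg-style weighted sums mapping proximity sensors to wheel speeds. The sketch below is illustrative: the weight values, sensor layout, and clamping range are assumptions, not parameters from the paper.

```python
# Braitenberg-style linear reactive controller: each wheel speed is a
# weighted sum of proximity-sensor readings plus a bias. With shared or
# mirrored weights this collapses to very few free parameters, matching
# the observation that tiny linear controllers suffice in simple arenas.
# All numeric values below are illustrative.

def linear_controller(sensors, weights, bias):
    """sensors: proximity readings in [0, 1]; returns (left, right) wheel speeds."""
    left = bias + sum(w * s for w, s in zip(weights["left"], sensors))
    right = bias + sum(w * s for w, s in zip(weights["right"], sensors))
    clamp = lambda v: max(-1.0, min(1.0, v))  # motors' admissible range
    return clamp(left), clamp(right)

# Sensor order: [front-left, left, right, front-right]. An obstacle on the
# right slows the left wheel, steering the robot left, away from it.
weights = {
    "left":  [0.0, 0.0, -0.3, -0.9],
    "right": [-0.9, -0.3, 0.0, 0.0],
}
```

The "simple memory" augmentation the paper finds most beneficial can be obtained by feeding the previous wheel commands back in as two extra pseudo-sensor inputs, at the cost of two extra weights per wheel.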
Noise-Resistant Particle Swarm Optimization for the Learning of Robust Obstacle Avoidance Controllers using a Depth Camera
The Ranger robot was designed to interact with children in order to motivate them to tidy up their room. Its mechanical configuration, together with the limited field of view of its depth camera, makes the learning of obstacle avoidance behaviors a hard problem. In this article, we introduce two new Particle Swarm Optimization (PSO) algorithms designed to address this noisy, high-dimensional optimization problem. Their aim is to increase the robustness of the generated robotic controllers compared to previous PSO algorithms. We show that we can successfully apply this set of PSO algorithms to learn the 166 parameters of a robotic controller for the obstacle avoidance task. We also study the impact that an increased evaluation budget has on the robustness and average performance of the optimized controllers. Finally, we validate the control solutions learned in simulation by testing the most robust controller in three different real arenas.
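A common way to make PSO noise-resistant, and the general idea behind re-evaluation-based variants, is to re-sample the fitness of personal bests so that lucky noisy evaluations decay over time. The sketch below is a generic illustration with a stand-in noisy objective; it is not the article's specific algorithms, and all hyperparameters are assumptions.

```python
import random

# Noise-resistant PSO sketch: besides evaluating each particle's new
# position, the personal best is re-evaluated every iteration and its
# stored fitness is averaged with the new sample, so lucky noisy
# evaluations do not survive forever. A noisy sphere function stands in
# for the robot evaluation.

def noisy_pso(fitness, dim=4, particles=10, iterations=50, seed=1):
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, rng) for p in pos]
    for _ in range(iterations):
        for i in range(particles):
            # re-evaluate the personal best and average against its history
            pbest_fit[i] = 0.5 * (pbest_fit[i] + fitness(pbest[i], rng))
            g = max(range(particles), key=lambda j: pbest_fit[j])
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (pbest[g][d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i], rng)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
    g = max(range(particles), key=lambda j: pbest_fit[j])
    return pbest[g], pbest_fit[g]

# Noisy stand-in objective: maximize -||x||^2 with additive sensor-like noise.
noisy_sphere = lambda x, rng: -sum(v * v for v in x) + rng.gauss(0.0, 0.05)
best, best_fit = noisy_pso(noisy_sphere)
```

The trade-off the article studies follows directly from this structure: every re-evaluation spends part of the evaluation budget on robustness rather than on exploring new candidate controllers.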
A Comparison of PSO and Reinforcement Learning for Multi-Robot Obstacle Avoidance
The design of high-performing robotic controllers constitutes an example of expensive optimization in uncertain environments, due to the often large parameter space and noisy performance metrics. Several evaluative techniques can be employed for online controller design, and adequate benchmarks help in choosing the right algorithm in terms of final performance and evaluation time. In this paper, we use multi-robot obstacle avoidance as a benchmark to compare two evaluative learning techniques: Particle Swarm Optimization and Q-learning. For Q-learning, we implement two approaches: one with discrete states and discrete actions, and another with discrete actions but a continuous state space. We show that continuous PSO has the highest fitness overall, and that Q-learning with continuous states performs significantly better than Q-learning with discrete states. We also show that in the single-robot case, PSO and Q-learning with discrete states require a similar amount of total learning time to converge, while Q-learning with continuous states requires significantly more. In the multi-robot case, both Q-learning approaches require a similar amount of time as in the single-robot case, but the time required by PSO can be significantly reduced due to the distributed nature of the algorithm.
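The discrete-state, discrete-action Q-learning baseline compared above rests on the standard temporal-difference update Q(s,a) ← Q(s,a) + α(r + γ max_a' Q(s',a') − Q(s,a)). A minimal sketch on a two-state stand-in task follows; the MDP and hyperparameters are illustrative, not from the paper.

```python
import random

# Tabular Q-learning for the discrete-state, discrete-action case. The
# toy MDP stands in for the obstacle avoidance task: from either state,
# action 1 leads to the rewarding state 1; action 0 leads back to state 0.

def q_learning(transitions, rewards, states, actions,
               episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * actions for _ in range(states)]
    for _ in range(episodes):
        s = 0
        for _ in range(10):  # bounded episode length
            a = (rng.randrange(actions) if rng.random() < epsilon
                 else max(range(actions), key=lambda x: Q[s][x]))
            s2, r = transitions[s][a], rewards[s][a]
            # core update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

transitions = [[0, 1], [0, 1]]
rewards = [[0.0, 1.0], [0.0, 1.0]]
Q = q_learning(transitions, rewards, states=2, actions=2)
```

The continuous-state variant in the paper replaces the lookup table with a function approximator over sensor readings, which removes the discretization error at the price of the longer learning times reported.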
Engineering evolutionary control for real-world robotic systems
Evolutionary Robotics (ER) is the field of study concerned with the application
of evolutionary computation to the design of robotic systems. Two main
issues have prevented ER from being applied to real-world tasks, namely scaling to
complex tasks and the transfer of control to real-robot systems. Finding solutions
to complex tasks is challenging for evolutionary approaches due to the bootstrap
problem and deception. When the task goal is too difficult, the evolutionary process
will drift in regions of the search space with equally low levels of performance
and therefore fail to bootstrap. Furthermore, the search space tends to get rugged
(deceptive) as task complexity increases, which can lead to premature convergence.
Another prominent issue in ER is the reality gap. Behavioral control is typically
evolved in simulation and then only transferred to the real robotic hardware when
a good solution has been found. Since simulation is an abstraction of the real
world, the accuracy of the robot model and its interactions with the environment
is limited. As a result, control evolved in a simulator tends to display a lower
performance in reality than in simulation.
In this thesis, we present a hierarchical control synthesis approach that enables
the use of ER techniques for complex tasks in real robotic hardware by mitigating
the bootstrap problem, deception, and the reality gap. We recursively decompose
a task into sub-tasks, and synthesize control for each sub-task. The individual
behaviors are then composed hierarchically. The possibility of incrementally
transferring control as the controller is composed allows transferability issues to
be addressed locally in the controller hierarchy. Our approach features hybridity,
allowing different control synthesis techniques to be combined. We demonstrate
our approach in a series of tasks that go beyond the complexity of tasks where ER
has been successfully applied. We further show that hierarchical control can be applied
in single-robot systems and in multi-robot systems. Given our long-term goal
of enabling the application of ER techniques to real-world tasks, we systematically
validate our approach in real robotic hardware. For one of the demonstrations in
this thesis, we have designed and built a swarm robotic platform, and we show the
first successful transfer of evolved and hierarchical control to a swarm of robots
outside of controlled laboratory conditions.
This work has been supported by the Portuguese Foundation for Science
and Technology (Fundação para a Ciência e Tecnologia) under the grants
SFRH/BD/76438/2011, EXPL/EEI-AUT/0329/2013, and by Instituto de Telecomunicações
under the grant UID/EEA/50008/2013.
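The hierarchical composition described in the thesis abstract above, where sub-behaviors are synthesized per sub-task and composed under higher-level nodes, can be illustrated with a minimal arbitrator. The behavior names, selection rule, and foraging scenario below are illustrative, not taken from the thesis.

```python
# Hierarchical control sketch: a task is decomposed into sub-behaviors,
# each of which can be synthesized (and transferred to hardware)
# independently, while an arbitrator node picks which child runs at each
# control step. All names and the toy policies are illustrative.

class Behavior:
    """Leaf: maps sensor readings to an actuation command."""
    def __init__(self, name, policy):
        self.name, self.policy = name, policy
    def act(self, sensors):
        return self.policy(sensors)

class Arbitrator:
    """Internal node: delegates to the child selected for the current state."""
    def __init__(self, selector, children):
        self.selector, self.children = selector, children
    def act(self, sensors):
        return self.children[self.selector(sensors)].act(sensors)

# Sub-behaviors for a toy foraging task, composed under one arbitrator.
avoid = Behavior("avoid", lambda s: "turn")
explore = Behavior("explore", lambda s: "forward")
home = Behavior("home", lambda s: "go_to_nest")

controller = Arbitrator(
    selector=lambda s: ("avoid" if s["obstacle"] else
                        "home" if s["carrying_item"] else "explore"),
    children={"avoid": avoid, "explore": explore, "home": home},
)
```

Because each leaf and each arbitrator is a self-contained unit, a sub-controller that misbehaves after transfer to hardware can be re-synthesized or replaced locally, which is the transferability benefit the thesis attributes to the hierarchical decomposition.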