5 research outputs found

    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly for swarm robots with limited communication and perception capabilities. Furthermore, computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation control can be achieved by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking and foraging) with learning-based control strategies (using artificial neural networks) for the adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed, decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We then present a novel flocking control scheme for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, in which consensus among all UAVs is used to train a shared control policy and each UAV acts on the local information it collects. In addition, to avoid collisions among UAVs and to guarantee flocking and navigation, we design a reward function that combines a global flocking-maintenance term, a mutual reward, and a collision penalty. We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy, using actor-critic networks and a global state-space matrix. In the context of swarm robotics in the arts, we investigate how the formation paradigm can serve as an interaction modality that lets artists use swarms aesthetically. In particular, we explore particle swarm optimization (PSO) and random walks to control the communication within a team of robots with swarming behavior for musical creation.
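    The abstract describes a composite reward combining a global flocking-maintenance term, a mutual reward, and a collision penalty, but does not give its exact form. The Python below is a minimal sketch of one plausible shape for such a reward; the weights, distance thresholds, and the flocking_reward name are illustrative assumptions, not the thesis's actual values.

        import numpy as np

        # Illustrative weights and thresholds -- assumed, not taken from the thesis.
        W_FLOCK, W_MUTUAL, W_COLLISION = 1.0, 0.5, 10.0
        D_COLLISION = 0.3  # assumed minimum safe separation between UAVs
        D_COMM = 2.0       # assumed desired neighbourhood (communication) radius

        def flocking_reward(positions, leader_pos, i):
            """Reward for UAV i: flocking maintenance + mutual reward - collision penalty."""
            p_i = positions[i]
            others = np.delete(positions, i, axis=0)
            dists = np.linalg.norm(others - p_i, axis=1)
            # Global flocking maintenance: penalise drifting away from the leader.
            r_flock = -W_FLOCK * np.linalg.norm(p_i - leader_pos)
            # Mutual reward: fraction of neighbours kept within communication range.
            r_mutual = W_MUTUAL * np.mean(dists < D_COMM)
            # Collision penalty: large negative term if any neighbour is too close.
            r_collision = -W_COLLISION * float(np.any(dists < D_COLLISION))
            return r_flock + r_mutual + r_collision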

    Manifolds & Memory: Improving the Search Speed of Evolutionary Algorithms

    Evolutionary Algorithms (EAs) are a family of algorithms inspired by Darwin's theory of natural selection that are well equipped to perform a wide variety of optimisation tasks. Because they can serve as derivative-free optimisers of continuous values, EAs are often compared to gradient-based optimisation techniques such as stochastic gradient descent (SGD). However, EAs are generally deemed inferior to gradient-based techniques, as evidenced by the fact that none of the most commonly used deep learning frameworks implement EAs as a neural network optimisation algorithm, and that the majority of neural networks are optimised using gradient-based techniques. Nevertheless, despite often being cited as too slow to optimise large parameter spaces, such as those of large neural networks, numerous recent works have shown that EAs can outperform gradient-based techniques at reinforcement learning (RL) control tasks. The aim of this work is to add credence to the claim that EAs are a competitive technique for real-valued optimisation by demonstrating how their search speed can be increased. We achieve this using two distinct techniques. First, knowledge from the optimisation of a set of source problems is reused to improve search performance on a set of unseen target problems. This reuse of knowledge is achieved by embedding information about the location of high-fitness solutions in an indirect encoding (IE). In this thesis, we learn an IE by training generative models to model the distribution of previously located solutions to a set of source problems. We subsequently perform evolutionary search within the latent space of the generative part of the model on various target problems from the same ‘family’ as the source problems. We perform the first comparative analysis of IEs derived from autoencoders, variational autoencoders (VAEs), and generative adversarial networks (GANs) for the optimisation of continuous functions. We also demonstrate for the first time how these techniques can be used to perform transfer learning on RL control tasks. We show that all three types of IE outperform direct-encoding (DE) baselines on one or more of the problems considered. We also perform an in-depth analysis of the behaviour of each IE type, which allows us to suggest remediations for some of the pathologies discovered. The second technique explored is a modification to NEAT, an existing neuroevolutionary algorithm (one that evolves neural networks). NEAT is a topology- and weight-evolving artificial neural network algorithm, meaning that both the weights and the architecture of the neural network are optimised simultaneously. Although the original NEAT algorithm includes recurrent connections, such connections typically have trouble memorising information over long time horizons. We therefore introduce a novel algorithm, NEAT-GRU, which can mutate gated recurrent units (GRUs) into the network. We show that NEAT-GRU outperforms NEAT and hand-coded baselines at generalised maze-solving tasks. We also show that NEAT-GRU is the only algorithm tested that can locate solutions for a much harder navigational task in which the bearing (relative angle) towards the target is not provided to the agent. Overall, we introduce two novel techniques that successfully increase EA search speed, further attesting to the competitiveness of EAs compared to gradient-based techniques.
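    To make the latent-space search idea concrete, the sketch below evolves genomes directly in the latent space of a pre-trained generative model and decodes them into solution parameters for evaluation. It is a minimal illustration under stated assumptions: decoder, fitness_fn, the truncation-selection scheme, and all hyperparameters are hypothetical placeholders, not the exact algorithm used in the thesis.

        import numpy as np

        def evolve_in_latent_space(decoder, fitness_fn, latent_dim,
                                   pop_size=50, generations=100,
                                   sigma=0.1, seed=0):
            """Evolutionary search in the latent space of a generative model
            (the indirect encoding). `decoder` maps a latent vector to the
            parameters of a candidate solution; `fitness_fn` scores them."""
            rng = np.random.default_rng(seed)
            pop = rng.normal(size=(pop_size, latent_dim))  # latent genomes
            for _ in range(generations):
                # Genotype -> phenotype: decode each latent vector, then score it.
                fitness = np.array([fitness_fn(decoder(z)) for z in pop])
                # Truncation selection: keep the fitter half as parents.
                parents = pop[np.argsort(fitness)[-(pop_size // 2):]]
                # Refill the population with Gaussian-mutated copies of parents.
                children = parents[rng.integers(len(parents), size=pop_size)]
                pop = children + sigma * rng.normal(size=children.shape)
            fitness = np.array([fitness_fn(decoder(z)) for z in pop])
            return decoder(pop[np.argmax(fitness)])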