
    Evolution of Control Programs for a Swarm of Autonomous Unmanned Aerial Vehicles

    Unmanned aerial vehicles (UAVs) are rapidly becoming a critical military asset. In the future, advances in miniaturization will drive the development of insect-sized UAVs, and new approaches to controlling these swarms will be required. The goal of this research is to develop a controller that directs a swarm of UAVs in accomplishing a given mission. While previous efforts have largely been limited to two-dimensional models, a three-dimensional model has been developed for this project. Models of UAV capabilities, including sensors, actuators, and communications, are presented. Genetic programming uses the principles of Darwinian evolution to generate computer programs that solve problems; here, a genetic programming approach is used to evolve control programs for UAV swarms. Evolved controllers are compared with a hand-crafted solution using quantitative and qualitative methods, and visualization and statistical methods are used to analyze solutions. Results indicate that genetic programming is capable of producing effective solutions to multi-objective control problems.
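    The genetic programming loop the abstract describes (random program generation, fitness evaluation, selection, and variation) can be sketched in miniature. The following toy evolves arithmetic expression trees rather than the thesis's UAV controllers; all names, operators, and parameters here are illustrative assumptions, not the author's implementation.

```python
import random

# Mini genetic programming sketch: programs are nested tuples
# ("op", left, right) over a small operator/terminal set.
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b,
       "mul": lambda a, b: a * b}
TERMINALS = ["x", 1.0, 2.0]

def random_tree(depth=3):
    # Grow a random expression tree, biased toward terminals.
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def fitness(tree):
    # Lower is better: error against a toy target, x^2 + x.
    return sum(abs(evaluate(tree, x) - (x * x + x)) for x in range(-5, 6))

def mutate(tree):
    # Replace a random subtree with a fresh random one.
    if not isinstance(tree, tuple) or random.random() < 0.3:
        return random_tree(2)
    op, left, right = tree
    if random.random() < 0.5:
        return (op, mutate(left), right)
    return (op, left, mutate(right))

def evolve(pop_size=60, generations=40):
    pop = [random_tree() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]          # truncation selection
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=fitness)

random.seed(0)
best = evolve()
print(fitness(best))
```

    The same skeleton applies to the thesis's setting by swapping the expression-tree language for swarm-control primitives and the symbolic-regression error for a mission-based fitness measure.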

    Many-agent Reinforcement Learning

    Multi-agent reinforcement learning (RL) addresses how each agent should behave optimally in a stochastic environment in which multiple agents are learning simultaneously. It is an interdisciplinary domain with a long history, lying at the intersection of psychology, control theory, game theory, reinforcement learning, and deep learning. Following the remarkable success of the AlphaGO series in single-agent RL, 2019 was a booming year that witnessed significant advances in multi-agent RL techniques; impressive breakthroughs have been made in developing AIs that outperform humans on many challenging tasks, especially multi-player video games. Nonetheless, one of the key challenges of multi-agent RL is scalability; it is still non-trivial to design efficient learning algorithms for tasks involving far more than two agents (N ≫ 2), which I term many-agent reinforcement learning (MARL; throughout, "MARL" denotes multi-agent reinforcement learning with a particular focus on the many-agent case, while "Multi-Agent RL" denotes the general setting). In this thesis, I contribute to tackling MARL problems from four aspects. Firstly, I offer a self-contained overview of multi-agent RL techniques from a game-theoretic perspective. This overview fills a research gap: most existing work either fails to cover the advances since 2010 or does not pay adequate attention to game theory, which I believe is the cornerstone of many-agent learning. Secondly, I develop a tractable policy evaluation algorithm, α^α-Rank, for many-agent systems. Its critical advantage is that it can compute the α-Rank solution concept tractably in multi-player general-sum games without storing the entire pay-off matrix. This contrasts with classic solution concepts such as Nash equilibrium, which is PPAD-hard to compute even in two-player cases. α^α-Rank allows us, for the first time, to practically conduct large-scale multi-agent evaluations. Thirdly, I introduce a scalable policy learning algorithm, mean-field MARL, for many-agent systems. The mean-field MARL method borrows the mean-field approximation from physics, and it is the first provably convergent algorithm that attempts to break the curse of dimensionality for MARL tasks. With the proposed algorithm, I report the first results of solving the Ising model and multi-agent battle games through a MARL approach. Fourthly, I investigate the many-agent learning problem in open-ended meta-games (i.e., the game of a game in policy space). Specifically, I focus on modelling behavioural diversity in meta-games and on developing algorithms that provably enlarge diversity during training. The proposed metric, based on determinantal point processes, serves as the first mathematically rigorous definition of diversity. Importantly, the diversity-aware learning algorithms beat the existing state-of-the-art game solvers in terms of exploitability by a large margin. On top of the algorithmic developments, I also contribute two real-world applications of MARL techniques: I demonstrate the potential of MARL for studying emergent population dynamics in nature and for modelling diverse, realistic interactions in autonomous driving. Both applications embody the prospect that MARL techniques could achieve substantial impact in the real physical world, beyond video games.
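    The core mean-field idea can be sketched in a toy form. This is a simplified, assumed version of the update (the published method uses a softer Boltzmann-style backup; here a greedy backup stands in for it): each agent's Q-function conditions on its own action and a discretised mean of its neighbours' actions, so the representation grows linearly rather than exponentially in the number of agents. All sizes and names below are illustrative.

```python
import numpy as np

# Toy mean-field Q-table: Q(state, own action, binned mean neighbour action).
n_states, n_actions, n_mean_bins = 4, 3, 5
Q = np.zeros((n_states, n_actions, n_mean_bins))
alpha, gamma = 0.1, 0.9

def mean_action_bin(neighbour_actions):
    # Discretise the neighbours' mean action into one of n_mean_bins bins.
    m = np.mean(neighbour_actions) / (n_actions - 1)   # normalised to [0, 1]
    return min(int(m * n_mean_bins), n_mean_bins - 1)

def mf_q_update(s, a, neighbour_actions, r, s_next, next_neighbour_actions):
    ab = mean_action_bin(neighbour_actions)
    ab_next = mean_action_bin(next_neighbour_actions)
    # Greedy backup over the agent's own action, holding the neighbours'
    # mean action fixed (a simplification of the mean-field backup).
    target = r + gamma * Q[s_next, :, ab_next].max()
    Q[s, a, ab] += alpha * (target - Q[s, a, ab])

# One illustrative update: a reward of 1.0 moves Q toward the target.
mf_q_update(s=0, a=1, neighbour_actions=[0, 2, 1], r=1.0,
            s_next=2, next_neighbour_actions=[1, 1, 2])
print(Q[0, 1, mean_action_bin([0, 2, 1])])  # 0.1 after a single update
```

    The point of the sketch is the shape of the table: a naive joint-action Q-function over N agents needs n_actions**N entries per state, while conditioning on the mean action keeps it at n_actions * n_mean_bins.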

    Evolution of fuzzy animats in a competitive environment

    Collective behaviour is a fascinating field that studies the coordinated motion of large groups of similar entities. Probably the most common hypothesis about the origins of collective animal behaviour is that it functions as a defensive mechanism against predation. In this thesis we used various computational techniques to study this hypothesis. We started by expanding an existing fuzzy model for the computer simulation of bird flocking with predators and visual perception. We implemented three target selection tactics that take into account the visual perspective of the predator (attack the nearest visible individual, attack the most visually isolated individual, and attack the centre of the visible group). Our results suggest that for prey individuals social behaviour (governed by the separation, alignment and cohesion drives), as opposed to individualistic behaviour (governed exclusively by the separation drive), is the most beneficial: predators take longer to capture their target. Predators, on the other hand, capture social prey quicker when they attack the most visually isolated individual, but capture individualistic prey faster when they focus on the nearest prey individual. In the next stage we developed an evolutionary model for tuning hand-crafted composite predator attack/target selection tactics. For computational simplicity we here built on a known mathematical model of prey collective behaviour, which allowed us to concentrate on predator target selection tactics. We investigated the evolution of the optimal tactic against prey behaving collectively and against prey performing a delayed response. With the latter, prey individuals, instead of reacting immediately at first sight of the predator, delay their response to a later point in time and then try to outsmart the predator with rapid twists and turns. This can be an advantageous defensive manoeuvre because prey can remain in a compact group for as long as possible, and because prey individuals are usually smaller than predators and thus have a higher turn rate. Our results suggest that a composite tactic termed the dispersing tactic, where the predator first dives deep into the group of prey and then targets the most peripheral individual, is the best tactic. Experiments with the prey's delayed response suggest that prey individuals can indeed increase their survivability with this defensive manoeuvre, and the dispersing tactic seems to be the only one capable of at least partially diminishing the effectiveness of the prey's delayed response. This was a clear indication of a potential interplay between target selection tactics and prey behaviour. Armed with this knowledge, we developed an artificial-life-like open-ended evolutionary model in which the behaviour of prey and predator individuals is governed by fuzzy logic. In this model we focused on the evolution of prey behaviour when prey individuals face different predation tactics. We demonstrated that in this model prey individuals evolve different types of collective behaviour (swarm, milling, polarized, dynamic). Interestingly, analysis of the evolved rule bases showed a statistically significant difference between the behaviour types in the proportion of rules that take predator-related information into account, suggesting that the predation pressures prey are subject to during evolution influence the behaviour that evolves. Our last step was thus a controlled experiment in which prey evolve under four predation tactics, two of which, according to previous research, pressure prey to evolve dispersing and two of which pressure prey to evolve grouping. Our results suggest that antagonism in predation pressures, where prey are exposed to pressures for which the best response is both grouping and dispersing simultaneously, might be necessary for prey to evolve polarized movement.
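    The separation, alignment and cohesion drives mentioned in the abstract can be sketched with a minimal flocking update. This is an illustrative weighted-sum sketch, not the thesis's fuzzy model; the radius, weights, and time step are assumptions, and individualistic prey would correspond to keeping only the separation term.

```python
import numpy as np

rng = np.random.default_rng(0)
pos = rng.uniform(0, 10, size=(20, 2))   # 20 agents in a 2-D plane
vel = rng.uniform(-1, 1, size=(20, 2))

def step(pos, vel, radius=3.0, w_sep=1.5, w_ali=1.0, w_coh=1.0, dt=0.1):
    new_vel = vel.copy()
    for i in range(len(pos)):
        d = np.linalg.norm(pos - pos[i], axis=1)
        nbrs = (d < radius) & (d > 0)     # neighbours within perception radius
        if not nbrs.any():
            continue
        sep = np.sum(pos[i] - pos[nbrs], axis=0)       # push away from neighbours
        ali = np.mean(vel[nbrs], axis=0) - vel[i]      # match neighbours' heading
        coh = np.mean(pos[nbrs], axis=0) - pos[i]      # pull toward local centre
        new_vel[i] += dt * (w_sep * sep + w_ali * ali + w_coh * coh)
    return pos + dt * new_vel, new_vel

for _ in range(50):
    pos, vel = step(pos, vel)
```

    In the thesis's fuzzy and evolutionary models, the relative influence of these drives is not fixed by hand as here but encoded in rule bases that evolution can tune, which is what allows different collective behaviours (swarm, milling, polarized, dynamic) to emerge.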