9 research outputs found

    A comparison of illumination algorithms in unbounded spaces

    Illumination algorithms are a new class of evolutionary algorithms capable of producing large archives of diverse and high-performing solutions. Examples of such algorithms include Novelty Search with Local Competition (NSLC), the Multi-dimensional Archive of Phenotypic Elites (MAP-Elites) and the newly introduced Centroidal Voronoi Tessellation (CVT) MAP-Elites. While NSLC can be used in unbounded behavioral spaces, MAP-Elites and CVT-MAP-Elites require the user to manually specify the bounds. In this study, we introduce variants of these algorithms that expand their bounds based on the discovered solutions. In addition, we introduce a novel algorithm called "Cluster-Elites" that can adapt its bounds to non-convex spaces. We compare all algorithms in a maze navigation problem and illustrate that Cluster-Elites and the expansive variants of MAP-Elites and CVT-MAP-Elites have comparable or better performance than NSLC, MAP-Elites and CVT-MAP-Elites.
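
The idea of a grid archive that expands its bounds as solutions are discovered can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ExpandingGridArchive`, the re-binning rule (re-insert every stored elite after the bounds grow) and the fixed number of cells per dimension are all assumptions made for illustration.

```python
class ExpandingGridArchive:
    """Hypothetical MAP-Elites-style grid whose bounds grow on demand."""

    def __init__(self, cells_per_dim, lo, hi):
        self.n = cells_per_dim
        self.lo = list(lo)
        self.hi = list(hi)
        self.cells = {}  # cell index -> (fitness, descriptor, solution)

    def _index(self, desc):
        # Map a descriptor to a cell; the upper bound falls in the last cell.
        idx = []
        for x, lo, hi in zip(desc, self.lo, self.hi):
            frac = (x - lo) / (hi - lo)
            idx.append(min(self.n - 1, int(frac * self.n)))
        return tuple(idx)

    def _place(self, fitness, desc, sol):
        # Keep only the fittest solution seen in each cell.
        key = self._index(desc)
        if key not in self.cells or fitness > self.cells[key][0]:
            self.cells[key] = (fitness, desc, sol)
            return True
        return False

    def add(self, fitness, desc, sol):
        outside = any(x < lo or x > hi
                      for x, lo, hi in zip(desc, self.lo, self.hi))
        if outside:
            # Assumed expansion rule: stretch the bounds to cover the new
            # descriptor, then re-bin the stored elites. Elites that now
            # share a cell compete, so some may be dropped.
            self.lo = [min(lo, x) for lo, x in zip(self.lo, desc)]
            self.hi = [max(hi, x) for hi, x in zip(self.hi, desc)]
            old = list(self.cells.values())
            self.cells = {}
            for f, d, s in old:
                self._place(f, d, s)
        return self._place(fitness, desc, sol)
```

With two cells per dimension and initial bounds of (0, 0) to (1, 1), adding a solution with descriptor (3.0, 0.9) stretches the upper bound of the first dimension to 3.0 and re-bins the elites already stored.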

    Unified Behavior Framework for Reactive Robot Control

    Behavior-based systems form the basis of autonomous control for many robots. In this article, we demonstrate that a single software framework can be used to represent many existing behavior-based approaches. The unified behavior framework presented incorporates the critical ideas and concepts of the existing reactive controllers. Additionally, the modular design of the behavior framework: (1) simplifies development and testing; (2) promotes the reuse of code; (3) supports designs that scale easily into large hierarchies while restricting code complexity; and (4) allows the behavior-based system developer the freedom to use the behavior system they feel will function the best. When a hybrid or three-layer control architecture includes the unified behavior framework, a common interface is shared by all behaviors, leaving the higher-order planning and sequencing elements free to interchange behaviors during execution to achieve high-level goals and plans. The framework's ability to compose structures from independent elements encourages experimentation and reuse while isolating the scope of troubleshooting to the behavior composition. The ability to use elemental components to build and evaluate behavior structures is demonstrated using the Robocode simulation environment. Additionally, the ability of a reactive controller to change its active behavior during execution is shown in a goal-seeking robot implementation.

    Diversity-driven selection of exploration strategies in multi-armed bandits

    We consider a scenario where an agent has multiple available strategies to explore an unknown environment. For each new interaction with the environment, the agent must select which exploration strategy to use. We provide a new strategy-agnostic method that treats the situation as a Multi-Armed Bandits problem where the reward signal is the diversity of effects that each strategy produces. We test the method empirically on a simulated planar robotic arm, and establish both that the method is able to discriminate between strategies of dissimilar quality, even when the differences are tenuous, and that the resulting performance is competitive with the best fixed mixture of strategies.
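
Treating strategy selection as a bandit problem with a diversity-based reward can be sketched as follows. The abstract does not name the bandit algorithm or the diversity measure, so both are assumptions here: the sketch uses UCB1 and counts an effect as rewarding when it lands in a grid cell no earlier effect has reached. The strategies `narrow` and `wide` are hypothetical stand-ins for real exploration strategies.

```python
import math
import random


def ucb1_select(counts, sums, t):
    """Pick the arm with the highest UCB1 score, trying each arm once first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [sums[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a])
              for a in range(len(counts))]
    return max(range(len(counts)), key=scores.__getitem__)


def diversity_reward(effect, seen_cells, cell_size=0.1):
    """Return 1.0 if the effect falls in a previously unseen cell, else 0.0."""
    cell = tuple(math.floor(x / cell_size) for x in effect)
    if cell in seen_cells:
        return 0.0
    seen_cells.add(cell)
    return 1.0


def narrow(rng):
    # Hypothetical strategy that always produces the same effect.
    return (0.05, 0.05)


def wide(rng):
    # Hypothetical strategy whose effects cover the unit square.
    return (rng.random(), rng.random())


def run(strategies, steps, seed=0):
    rng = random.Random(seed)
    counts = [0] * len(strategies)   # pulls per strategy
    sums = [0.0] * len(strategies)   # accumulated diversity reward
    seen = set()                     # cells reached so far, shared by all arms
    for t in range(1, steps + 1):
        arm = ucb1_select(counts, sums, t)
        reward = diversity_reward(strategies[arm](rng), seen)
        counts[arm] += 1
        sums[arm] += reward
    return counts, len(seen)


counts, cells_seen = run([narrow, wide], steps=500)
```

Because the reward depends on what has already been discovered, it shrinks over time for every strategy. Plain UCB1 assumes stationary rewards, so a practical version would likely need a bandit variant designed for changing reward distributions.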

    Multiagent Learning Through Indirect Encoding

    Designing a system of multiple, heterogeneous agents that cooperate to achieve a common goal is a difficult task, but it is also a common real-world problem. Multiagent learning addresses this problem by training the team to cooperate through a learning algorithm. However, most traditional approaches treat multiagent learning as a combination of multiple single-agent learning problems. This perspective leads to many inefficiencies in learning such as the problem of reinvention, whereby fundamental skills and policies that all agents should possess must be rediscovered independently for each team member. For example, in soccer, all the players know how to pass and kick the ball, but a traditional algorithm has no way to share such vital information because it has no way to relate the policies of agents to each other. In this dissertation a new approach to multiagent learning that seeks to address these issues is presented. This approach, called multiagent HyperNEAT, represents teams as a pattern of policies rather than individual agents. The main idea is that an agent’s location within a canonical team layout (such as a soccer team at the start of a game) tends to dictate its role within that team, called the policy geometry. For example, as soccer positions move from goal to center they become more offensive and less defensive, a concept that is compactly represented as a pattern. The first major contribution of this dissertation is a new method for evolving neural network controllers called HyperNEAT, which forms the foundation of the second contribution and primary focus of this work, multiagent HyperNEAT. Multiagent learning in this dissertation is investigated in predator-prey, room-clearing, and patrol domains, providing a real-world context for the approach.
    Interestingly, because the teams in multiagent HyperNEAT are represented as patterns they can scale up to an infinite number of multiagent policies that can be sampled from the policy geometry as needed. Thus the third contribution is a method for teams trained with multiagent HyperNEAT to dynamically scale their size without further learning. Fourth, the capabilities to both learn and scale in multiagent HyperNEAT are compared to the traditional multiagent SARSA(λ) approach in a comprehensive study. The fifth contribution is a method for efficiently learning and encoding multiple policies for each agent on a team to facilitate learning in multi-task domains. Finally, because there is significant interest in practical applications of multiagent learning, multiagent HyperNEAT is tested in a real-world military patrolling application with actual Khepera III robots. The ultimate goal is to provide a new perspective on multiagent learning and to demonstrate the practical benefits of training heterogeneous, scalable multiagent teams through generative encoding.

    Quality Diversity: Harnessing Evolution to Generate a Diversity of High-Performing Solutions

    Evolution in nature has designed countless solutions to innumerable interconnected problems, giving birth to the impressive array of complex modern life observed today. Inspired by this success, the practice of evolutionary computation (EC) abstracts evolution artificially as a search operator to find solutions to problems of interest primarily through the adaptive mechanism of survival of the fittest, where stronger candidates are pursued at the expense of weaker ones until a solution of satisfying quality emerges. At the same time, research in open-ended evolution (OEE) draws different lessons from nature, seeking to identify and recreate processes that lead to the type of perpetual innovation and indefinitely increasing complexity observed in natural evolution. New algorithms in EC such as MAP-Elites and Novelty Search with Local Competition harness the toolkit of evolution for a related purpose: finding as many types of good solutions as possible (rather than merely the single best solution). With the field in its infancy, no empirical studies previously existed comparing these so-called quality diversity (QD) algorithms. This dissertation (1) contains the first extensive and methodical effort to compare different approaches to QD (including both existing published approaches as well as some new methods presented for the first time here) and to understand how they operate to help inform better approaches in the future. It also (2) introduces a new technique for encoding neural networks for evolution with indirect encoding that contain multiple sensory or output modalities. Further, it (3) explores the idea that QD can act as an engine of open-ended discovery by introducing an expressive platform called Voxelbuild where QD algorithms continually evolve robots that stack blocks in new ways. A culminating experiment (4) is presented that investigates evolution in Voxelbuild over a very long timescale.
    This research thus stands to advance the OEE community's desire to create and understand open-ended systems while also laying the groundwork for QD to realize its potential within EC as a means to automatically generate an endless progression of new content in real-world applications.

    Rapid and Thorough Exploration of Low Dimensional Phenotypic Landscapes

    This thesis presents two novel algorithms for the evolutionary optimisation of agent populations through divergent search of low dimensional phenotypic landscapes. As the field of Evolutionary Robotics (ER) develops towards more complex domains, which often involve deception and uncertainty, the promotion of phenotypic diversity has become of increasing interest. Divergent exploration of the phenotypic feature space has been shown to avoid convergence towards local optima and to provide diverse sets of solutions to a given objective. Novelty Search (NS) and the more recent Multi-dimensional Archive of Phenotypic Elites (MAP-Elites) are two state-of-the-art algorithms which utilise divergent phenotypic search. In this thesis, the individual merits and weaknesses of these algorithms are built upon in order to further develop the study of divergent phenotypic search within ER. An observation that the diverse range of individuals produced through the optimisation of novelty will likely contain solutions to multiple independent objectives is utilised to develop Multiple Assessment Directed Novelty Search (MADNS). The MADNS algorithm is introduced as an extension to NS for the simultaneous optimisation of multiple independent objectives, and is shown to become more effective than NS as the size of the state space increases. The central contribution of this thesis is the introduction of a novel algorithm for rapid and thorough divergent search of low dimensional phenotypic landscapes. The Spatial, Hierarchical, Illuminated NeuroEvolution (SHINE) algorithm differs from previous divergent search algorithms, in that it utilises a tree structure for the maintenance and selection of potential candidates.
    Unlike previous approaches, SHINE iteratively focusses upon sparsely visited areas of the phenotypic landscape without the computationally expensive distance comparison required by NS; rather, the sparseness of the area within the landscape where a potential solution resides is inferred through its depth within the tree. Experimental results in a range of domains show that SHINE significantly outperforms NS and MAP-Elites in both performance and exploration.