2,366 research outputs found

    Multiagent Learning Through Indirect Encoding

    Designing a system of multiple, heterogeneous agents that cooperate to achieve a common goal is a difficult task, but it is also a common real-world problem. Multiagent learning addresses this problem by training the team to cooperate through a learning algorithm. However, most traditional approaches treat multiagent learning as a combination of multiple single-agent learning problems. This perspective leads to many inefficiencies in learning such as the problem of reinvention, whereby fundamental skills and policies that all agents should possess must be rediscovered independently for each team member. For example, in soccer, all the players know how to pass and kick the ball, but a traditional algorithm has no way to share such vital information because it has no way to relate the policies of agents to each other. In this dissertation, a new approach to multiagent learning that seeks to address these issues is presented. This approach, called multiagent HyperNEAT, represents teams as a pattern of policies rather than individual agents. The main idea is that an agent’s location within a canonical team layout (such as a soccer team at the start of a game) tends to dictate its role within that team, called the policy geometry. For example, as soccer positions move from goal to center they become more offensive and less defensive, a concept that is compactly represented as a pattern. The first major contribution of this dissertation is a new method for evolving neural network controllers called HyperNEAT, which forms the foundation of the second contribution and primary focus of this work, multiagent HyperNEAT. Multiagent learning in this dissertation is investigated in predator-prey, room-clearing, and patrol domains, providing a real-world context for the approach.
Interestingly, because the teams in multiagent HyperNEAT are represented as patterns, they can scale up to an infinite number of multiagent policies that can be sampled from the policy geometry as needed. Thus, the third contribution is a method for teams trained with multiagent HyperNEAT to dynamically scale their size without further learning. Fourth, the capabilities to both learn and scale in multiagent HyperNEAT are compared to the traditional multiagent SARSA(λ) approach in a comprehensive study. The fifth contribution is a method for efficiently learning and encoding multiple policies for each agent on a team to facilitate learning in multi-task domains. Finally, because there is significant interest in practical applications of multiagent learning, multiagent HyperNEAT is tested in a real-world military patrolling application with actual Khepera III robots. The ultimate goal is to provide a new perspective on multiagent learning and to demonstrate the practical benefits of training heterogeneous, scalable multiagent teams through generative encoding.

    R-HybrID: Evolution of agent controllers with a hybridisation of indirect and direct encodings

    Neuroevolution, the optimisation of artificial neural networks (ANNs) through evolutionary computation, is a promising approach to the synthesis of controllers for autonomous agents. Traditional neuroevolution approaches employ direct encodings, which are limited in their ability to evolve complex or large-scale controllers because each ANN parameter is independently optimised. Indirect encodings, on the other hand, facilitate scalability because each gene can be reused multiple times to construct the ANN, but are biased towards regularity and can become ineffective when irregularity is required. To address such limitations, we introduce a novel algorithm called R-HybrID. In R-HybrID, controllers have both indirectly encoded and directly encoded structure. Because the portion of structure following a specific encoding is under evolutionary control, R-HybrID can automatically find an appropriate encoding combination for a given task. We assess the performance of R-HybrID in three tasks: (i) a high-dimensional visual discrimination task that requires geometric principles to be evolved, (ii) a challenging benchmark for modular robotics, and (iii) a memory task that has proven difficult for current algorithms because it requires effectively accumulating neural structure for cognitive behaviour to emerge. Our results show that R-HybrID consistently outperforms three state-of-the-art neuroevolution algorithms, and effectively evolves complex controllers and behaviours.

    Evolvability signatures of generative encodings: beyond standard performance benchmarks

    Evolutionary robotics is a promising approach to autonomously synthesize machines with abilities that resemble those of animals, but the field suffers from a lack of strong foundations. In particular, evolutionary systems are currently assessed solely by the fitness score their evolved artifacts can achieve for a specific task, whereas such fitness-based comparisons provide limited insights about how the same system would evaluate on different tasks, and its adaptive capabilities to respond to changes in fitness (e.g., from damages to the machine, or in new situations). To counter these limitations, we introduce the concept of "evolvability signatures", which picture the post-mutation statistical distribution of both behavior diversity (how different are the robot behaviors after a mutation?) and fitness values (how different is the fitness after a mutation?). We tested the relevance of this concept by evolving controllers for hexapod robot locomotion using five different genotype-to-phenotype mappings (direct encoding, generative encoding of open-loop and closed-loop central pattern generators, generative encoding of neural networks, and single-unit pattern generators (SUPG)). We observed a predictive relationship between the evolvability signature of each encoding and the number of generations required by hexapods to adapt from incurred damages. Our study also reveals that, across the five investigated encodings, the SUPG scheme achieved the best evolvability signature, and was always foremost in recovering an effective gait following robot damages. Overall, our evolvability signatures neatly complement existing task-performance benchmarks, and pave the way for stronger foundations for research in evolutionary robotics. Comment: 24 pages with 12 figures in the main text, and 4 supplementary figures. Accepted at Information Sciences journal (in press). Supplemental videos are available online; see http://goo.gl/uyY1R

    Probabilistic Meta-Representations Of Neural Networks

    Existing Bayesian treatments of neural networks are typically characterized by weak prior and approximate posterior distributions according to which all the weights are drawn independently. Here, we consider a richer prior distribution in which units in the network are represented by latent variables, and the weights between units are drawn conditionally on the values of the collection of those variables. This allows rich correlations between related weights, and can be seen as realizing a function prior with a Bayesian complexity regularizer ensuring simple solutions. We illustrate the resulting meta-representations and representations, elucidating the power of this prior. Comment: presented at UAI 2018 Uncertainty In Deep Learning Workshop (UDL AUG. 2018)

    Evolutionary robotics: model or design?

    In this paper, I review recent work in evolutionary robotics (ER), and discuss the perspectives and future directions of the field. First, I propose to draw a crisp distinction between studies that exploit ER as a design methodology on the one hand, and studies that instead use ER as a modeling tool to better understand phenomena observed in biology on the other. Such a distinction is not always obvious in the literature, however. It is my conviction that ER would profit from an explicit commitment to one or the other approach. Indeed, I believe that the constraints imposed by the specific approach would guide the experimental design and the analysis of the results obtained, therefore reducing arbitrary choices and promoting the adoption of principled methods that are common practice in the target domain, be it within engineering or the life sciences. Additionally, this would improve dissemination and the impact of ER studies on other disciplines, leading to the establishment of ER as a valid tool either for design or modeling purposes.