50 research outputs found

    Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space

    We focus on the challenge of finding a diverse collection of quality solutions on complex continuous domains. While quality diversity (QD) algorithms like Novelty Search with Local Competition (NSLC) and MAP-Elites are designed to generate a diverse range of solutions, these algorithms require a large number of evaluations for exploration of continuous spaces. Meanwhile, variants of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) are among the best-performing derivative-free optimizers in single-objective continuous domains. This paper proposes a new QD algorithm called Covariance Matrix Adaptation MAP-Elites (CMA-ME). Our new algorithm combines the self-adaptation techniques of CMA-ES with archiving and mapping techniques for maintaining diversity in QD. Results from experiments based on standard continuous optimization benchmarks show that CMA-ME finds better-quality solutions than MAP-Elites; similarly, results on the strategic game Hearthstone show that CMA-ME finds strategies of both higher overall quality and broader diversity than either CMA-ES or MAP-Elites. Overall, CMA-ME more than doubles the performance of MAP-Elites on standard QD performance metrics. These results suggest that QD algorithms augmented with operators from state-of-the-art optimization algorithms can yield high-performing methods for simultaneously exploring and optimizing continuous search spaces, with significant applications to design, testing, and reinforcement learning, among other domains.
    Comment: Accepted to GECCO 2020
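
    To make the core mechanic concrete, here is a minimal, self-contained Python sketch of the improvement-ranking idea behind CMA-ME: a Gaussian emitter samples candidates, and the samples are ranked by how much they improve the MAP-Elites archive rather than by raw fitness. The objective, behavior descriptor, and all constants are illustrative placeholders, and full covariance adaptation is omitted; the authors' pyribs library provides a complete implementation.

        # Minimal sketch of CMA-ME's improvement ranking; not the authors' code.
        import numpy as np

        DIM, GRID, BATCH = 10, 20, 36
        archive = {}  # behavior cell -> (fitness, solution)

        def evaluate(x):
            # Placeholder objective: negated sphere function (higher is better).
            return -np.sum(x ** 2)

        def behavior(x):
            # Placeholder 2-D behavior descriptor: bin the first two coordinates.
            bins = np.clip(((x[:2] + 1) / 2 * GRID).astype(int), 0, GRID - 1)
            return tuple(bins)

        mean, sigma = np.zeros(DIM), 0.5
        for _ in range(1000):
            sols = mean + sigma * np.random.randn(BATCH, DIM)
            improvements = []
            for x in sols:
                f, cell = evaluate(x), behavior(x)
                old = archive.get(cell, (-np.inf, None))[0]
                if f > old:
                    archive[cell] = (f, x)
                # Rank by archive improvement; filling a new cell counts fully.
                improvements.append(f - old if np.isfinite(old) else f)
            # Move the sampling mean toward the best improvers (full CMA-ME
            # also adapts the covariance matrix and restarts emitters).
            elite = sols[np.argsort(improvements)[-BATCH // 4:]]
            mean = elite.mean(axis=0)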

    Analysis of gameplay strategies in hearthstone: a data science approach

    In recent years, games have been a popular test bed for AI research, and the presence of Collectible Card Games (CCGs) in that space is still increasing. One such CCG used for both competitive/casual play and AI research is Hearthstone, a two-player adversarial game where each player seeks to implement one of several gameplay strategies to defeat their opponent by reducing the opponent's Health points to zero. Although some open-source simulators exist, the methodologies some of them use for simulated agents create opponents with a relatively low skill level. Using evolutionary algorithms, this thesis seeks to evolve agents with a higher skill level than those implemented in one such simulator, SabberStone. New benchmarks are proposed using supervised learning techniques to predict gameplay strategies from game data, and using unsupervised learning techniques to discover and visualize patterns that may be used in player modeling to differentiate gameplay strategies.
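
    As a hedged illustration of the supervised-learning benchmark described above (the thesis's actual features and models are not listed here), a strategy-prediction baseline might look like the following scikit-learn sketch; all feature names, labels, and data are hypothetical placeholders.

        # Hypothetical strategy-prediction baseline; not the thesis's pipeline.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import classification_report
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        # Placeholder per-game features: [avg. card cost, cards/turn,
        # damage/turn, game length], with synthetic labels for illustration.
        X = rng.random((500, 4))
        y = np.where(X[:, 2] > 0.6, "aggro",
                     np.where(X[:, 0] > 0.5, "control", "combo"))

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                                  random_state=0)
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        clf.fit(X_tr, y_tr)
        print(classification_report(y_te, clf.predict(X_te)))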

    Multi-Robot Coordination and Layout Design for Automated Warehousing

    With the rapid progress in Multi-Agent Path Finding (MAPF), researchers have studied how MAPF algorithms can be deployed to coordinate hundreds of robots in large automated warehouses. While most works try to improve the throughput of such warehouses by developing better MAPF algorithms, we focus on improving the throughput by optimizing the warehouse layout. We show that, even with state-of-the-art MAPF algorithms, commonly used human-designed layouts can lead to congestion for warehouses with large numbers of robots and thus have limited scalability. We extend existing automatic scenario generation methods to optimize warehouse layouts. Results show that our optimized warehouse layouts (1) reduce traffic congestion and thus improve throughput, (2) improve the scalability of the automated warehouses by doubling the number of robots in some cases, and (3) are capable of generating layouts with user-specified diversity measures. We include the source code at: https://github.com/lunjohnzhang/warehouse_env_gen_public
    Comment: Accepted to the International Joint Conference on Artificial Intelligence (IJCAI), 2023. The paper can be found in the IJCAI 2023 proceedings at https://www.ijcai.org/proceedings/2023/061
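
    For intuition, here is a minimal, hedged Python sketch of layout optimization in the spirit of this abstract: a MAP-Elites-style loop over binary grid layouts, archived by a user-chosen diversity measure. The throughput function below is a stand-in (the paper scores layouts by running MAPF simulations), and this is not the authors' actual algorithm.

        # MAP-Elites-style layout search with placeholder evaluation.
        import numpy as np

        rng = np.random.default_rng(0)
        H, W = 8, 12                  # illustrative grid (1 = shelf, 0 = aisle)
        archive = {}                  # measure value -> (throughput, layout)

        def simulate_throughput(layout):
            # Stand-in objective; real evaluation would run a MAPF
            # simulation with hundreds of robots on this layout.
            return -abs(layout.mean() - 0.3)

        def measure(layout):
            # User-specified diversity measure, e.g. number of shelf tiles.
            return int(layout.sum())

        def try_insert(layout):
            f, m = simulate_throughput(layout), measure(layout)
            if m not in archive or f > archive[m][0]:
                archive[m] = (f, layout)

        try_insert(np.zeros((H, W), dtype=int))
        for _ in range(2000):
            keys = list(archive)
            parent = archive[keys[rng.integers(len(keys))]][1]
            child = parent.copy()
            child[rng.integers(H), rng.integers(W)] ^= 1  # flip one tile
            try_insert(child)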

    Artificial intelligence in co-operative games with partial observability

    This thesis investigates Artificial Intelligence in co-operative games that feature Partial Observability. Most video games feature a combination of co-operation and Partial Observability. Co-operative games are games that feature a team of at least two agents that must achieve a shared goal of some kind. Partial Observability is the restriction of how much of an environment an agent can observe. The research performed in this thesis examines the challenge of creating Artificial Intelligence for co-operative games that feature Partial Observability. The main contributions are: a demonstration that Monte-Carlo Tree Search outperforms Genetic Algorithm based agents in solving co-operative problems without communication; the creation of a co-operative Partial Observability competition promoting Artificial Intelligence research, together with an investigation of the effect of varying Partial Observability on Artificial Intelligence; and the creation of a high-performing Monte-Carlo Tree Search agent for the game Hanabi that uses agent modelling to reason about other players.
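
    Since the contributions centre on Monte-Carlo Tree Search, a generic UCT skeleton of the kind such agents build on is sketched below; the Game interface is a hypothetical stand-in, and a Hanabi agent would add belief tracking and agent modelling on top of this loop.

        # Generic UCT skeleton; the Game interface is a hypothetical stand-in.
        import math, random

        class Node:
            def __init__(self, state, parent=None, action=None):
                self.state, self.parent, self.action = state, parent, action
                self.children, self.visits, self.value = [], 0, 0.0

        def uct_search(game, root_state, iters=1000, c=1.4):
            root = Node(root_state)
            for _ in range(iters):
                node = root
                # Selection: descend via UCB1 while fully expanded.
                while node.children and len(node.children) == len(game.actions(node.state)):
                    node = max(node.children,
                               key=lambda n: n.value / n.visits
                               + c * math.sqrt(math.log(node.visits) / n.visits))
                # Expansion: add one untried action.
                if not game.is_terminal(node.state):
                    tried = {ch.action for ch in node.children}
                    action = random.choice(
                        [a for a in game.actions(node.state) if a not in tried])
                    node.children.append(
                        Node(game.step(node.state, action), node, action))
                    node = node.children[-1]
                # Simulation: random rollout to a terminal state.
                state = node.state
                while not game.is_terminal(state):
                    state = game.step(state, random.choice(game.actions(state)))
                reward = game.reward(state)
                # Backpropagation.
                while node:
                    node.visits += 1
                    node.value += reward
                    node = node.parent
            return max(root.children, key=lambda n: n.visits).action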

    Arbitrarily Scalable Environment Generators via Neural Cellular Automata

    We study the problem of generating arbitrarily large environments to improve the throughput of multi-robot systems. Prior work proposes Quality Diversity (QD) algorithms as an effective method for optimizing the environments of automated warehouses. However, these approaches optimize only relatively small environments, falling short when it comes to replicating real-world warehouse sizes. The challenge arises from the exponential increase in the search space as the environment size increases. Additionally, the previous methods have only been tested with up to 350 robots in simulations, while practical warehouses could host thousands of robots. In this paper, instead of optimizing environments, we propose to optimize Neural Cellular Automata (NCA) environment generators via QD algorithms. We train a collection of NCA generators with QD algorithms in small environments and then generate arbitrarily large environments from the generators at test time. We show that NCA environment generators maintain consistent, regularized patterns regardless of environment size, significantly enhancing the scalability of multi-robot systems in two different domains with up to 2,350 robots. Additionally, we demonstrate that our method scales a single-agent reinforcement learning policy to arbitrarily large environments with similar patterns. We include the source code at: https://github.com/lunjohnzhang/warehouse_env_gen_nca_public
    Comment: Accepted to Advances in Neural Information Processing Systems (NeurIPS), 2023
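
    A short sketch may clarify why NCA generators scale to arbitrary sizes: the learned update is a local rule applied identically to every cell, so the same rule runs on any grid. In the hedged NumPy/SciPy sketch below, a random convolution kernel stands in for the weights that the paper trains with QD algorithms.

        # Toy NCA generator; the kernel stands in for learned weights.
        import numpy as np
        from scipy.signal import convolve2d

        rng = np.random.default_rng(0)
        kernel = rng.normal(size=(3, 3))   # stand-in for trained NCA weights

        def nca_generate(h, w, steps=30):
            state = rng.random((h, w))
            for _ in range(steps):
                # Each step applies the same local rule to every cell.
                pre = convolve2d(state, kernel, mode="same", boundary="wrap")
                state = 1.0 / (1.0 + np.exp(-pre))   # sigmoid squashing
            return (state > 0.5).astype(int)          # threshold to a tile map

        small = nca_generate(16, 16)      # training-size environment
        large = nca_generate(128, 128)    # same rule on a much larger grid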