Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space
We focus on the challenge of finding a diverse collection of quality
solutions on complex continuous domains. While quality diversity (QD)
algorithms like Novelty Search with Local Competition (NSLC) and MAP-Elites are
designed to generate a diverse range of solutions, these algorithms require a
large number of evaluations for exploration of continuous spaces. Meanwhile,
variants of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) are
among the best-performing derivative-free optimizers in single-objective
continuous domains. This paper proposes a new QD algorithm called Covariance
Matrix Adaptation MAP-Elites (CMA-ME). Our new algorithm combines the
self-adaptation techniques of CMA-ES with archiving and mapping techniques for
maintaining diversity in QD. Results from experiments based on standard
continuous optimization benchmarks show that CMA-ME finds better-quality
solutions than MAP-Elites; similarly, results on the strategic game Hearthstone
show that CMA-ME finds both a higher overall quality and broader diversity of
strategies than both CMA-ES and MAP-Elites. Overall, CMA-ME more than doubles
the performance of MAP-Elites using standard QD performance metrics. These
results suggest that QD algorithms augmented by operators from state-of-the-art
optimization algorithms can yield high-performing methods for simultaneously
exploring and optimizing continuous search spaces, with significant
applications to design, testing, and reinforcement learning among other
domains.
Comment: Accepted to GECCO 202
Analysis of gameplay strategies in Hearthstone: a data science approach
In recent years, games have been a popular test bed for AI research, and the presence of Collectible Card Games (CCGs) in that space is still increasing. One such CCG for both competitive/casual play and AI research is Hearthstone, a two-player adversarial game where each player seeks to implement one of several gameplay strategies to defeat their opponent by reducing the opponent's Health points to zero. Although some open source simulators exist, some of their methodologies for simulated agents create opponents with a relatively low skill level. Using evolutionary algorithms, this thesis seeks to evolve agents with a higher skill level than those implemented in one such simulator, SabberStone. New benchmarks are proposed using supervised learning techniques to predict gameplay strategies from game data, and using unsupervised learning techniques to discover and visualize patterns that may be used in player modeling to differentiate gameplay strategies.
Multi-Robot Coordination and Layout Design for Automated Warehousing
With the rapid progress in Multi-Agent Path Finding (MAPF), researchers have
studied how MAPF algorithms can be deployed to coordinate hundreds of robots in
large automated warehouses. While most works try to improve the throughput of
such warehouses by developing better MAPF algorithms, we focus on improving the
throughput by optimizing the warehouse layout. We show that, even with
state-of-the-art MAPF algorithms, commonly used human-designed layouts can lead
to congestion for warehouses with large numbers of robots and thus have limited
scalability. We extend existing automatic scenario generation methods to
optimize warehouse layouts. Results show that our optimized warehouse layouts
(1) reduce traffic congestion and thus improve throughput, (2) improve the
scalability of the automated warehouses by doubling the number of robots in
some cases, and (3) can be generated with user-specified
diversity measures. We include the source code at:
https://github.com/lunjohnzhang/warehouse_env_gen_public
Comment: Accepted to the International Joint Conference on Artificial Intelligence
(IJCAI), 2023. The paper can be found in the IJCAI 2023 proceedings at
https://www.ijcai.org/proceedings/2023/061
Artificial intelligence in co-operative games with partial observability
This thesis investigates Artificial Intelligence in co-operative games that feature Partial Observability. Most video games feature a combination of co-operation and Partial Observability. Co-operative games feature a team of at least two agents that must achieve a shared goal. Partial Observability is a restriction on how much of an environment an agent can observe. The research performed in this thesis examines the challenge of creating Artificial Intelligence for co-operative games that feature Partial Observability. The main contributions are: the finding that Monte-Carlo Tree Search outperforms Genetic Algorithm based agents in solving co-operative problems without communication; the creation of a co-operative Partial Observability competition promoting Artificial Intelligence research; an investigation of the effect of varying Partial Observability on Artificial Intelligence; and the creation of a high-performing Monte-Carlo Tree Search agent for the game Hanabi that uses agent modelling to reason about other players.
Arbitrarily Scalable Environment Generators via Neural Cellular Automata
We study the problem of generating arbitrarily large environments to improve
the throughput of multi-robot systems. Prior work proposes Quality Diversity
(QD) algorithms as an effective method for optimizing the environments of
automated warehouses. However, these approaches optimize only relatively small
environments, falling short when it comes to replicating real-world warehouse
sizes. The challenge arises from the exponential increase in the search space
as the environment size increases. Additionally, the previous methods have only
been tested with up to 350 robots in simulations, while practical warehouses
could host thousands of robots. In this paper, instead of optimizing
environments, we propose to optimize Neural Cellular Automata (NCA) environment
generators via QD algorithms. We train a collection of NCA generators with QD
algorithms in small environments and then generate arbitrarily large
environments from the generators at test time. We show that NCA environment
generators maintain consistent, regularized patterns regardless of environment
size, significantly enhancing the scalability of multi-robot systems in two
different domains with up to 2,350 robots. Additionally, we demonstrate that
our method scales a single-agent reinforcement learning policy to arbitrarily
large environments with similar patterns. We include the source code at
https://github.com/lunjohnzhang/warehouse_env_gen_nca_public.
Comment: Accepted to Advances in Neural Information Processing Systems
(NeurIPS), 202