97 research outputs found

    MAP-Elites for noisy domains by adaptive sampling


    When Are We Done with Games?


    Deep learning for video game playing

    In this article, we review recent Deep Learning advances in the context of how they have been applied to play different types of video games such as first-person shooters, arcade games, and real-time strategy games. We analyze the unique requirements that different game genres pose to a deep learning system and highlight important open challenges in applying these machine learning methods to video games, such as general game playing, dealing with extremely large decision spaces, and sparse rewards.

    Learning macromanagement in StarCraft from replays using deep learning

    The real-time strategy game StarCraft has proven to be a challenging environment for artificial intelligence techniques, and as a result, current state-of-the-art solutions consist of numerous hand-crafted modules. In this paper, we show how macromanagement decisions in StarCraft can be learned directly from game replays using deep learning. Neural networks are trained on 789,571 state-action pairs extracted from 2,005 replays of highly skilled players, achieving top-1 and top-3 error rates of 54.6% and 22.9% in predicting the next build action. By integrating the trained network into UAlbertaBot, an open source StarCraft bot, the system significantly outperforms the game's built-in Terran bot and plays competitively against UAlbertaBot with a fixed rush strategy. To our knowledge, this is the first time macromanagement tasks have been learned directly from replays in StarCraft. While the best hand-crafted strategies are still the state of the art, the deep network approach can express a wide range of different strategies, so improving the network's performance further with deep reinforcement learning is a promising avenue for future research. Ultimately this approach could lead to strong StarCraft bots that are less reliant on hard-coded strategies. (8 pages; to appear in the proceedings of the IEEE Conference on Computational Intelligence and Games, CIG 2017.)
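    The build-order prediction described above amounts to supervised classification over replay data. The sketch below is a minimal, hypothetical version of such a setup in PyTorch; the feature size, layer widths, and number of build actions are assumptions for illustration, not the architecture reported in the paper.

```python
# Hedged sketch: a classifier mapping an encoded game state to the next build
# action, trained on (state, action) pairs extracted from replays. Sizes are
# illustrative assumptions.
import torch
import torch.nn as nn

NUM_STATE_FEATURES = 128   # assumed size of the encoded game-state vector
NUM_BUILD_ACTIONS = 60     # assumed number of distinct build actions

model = nn.Sequential(
    nn.Linear(NUM_STATE_FEATURES, 256),
    nn.ReLU(),
    nn.Linear(256, 256),
    nn.ReLU(),
    nn.Linear(256, NUM_BUILD_ACTIONS),  # logits over the next build action
)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(states: torch.Tensor, actions: torch.Tensor) -> float:
    """One supervised step on a batch of (state, next-build-action) pairs."""
    optimizer.zero_grad()
    logits = model(states)            # (batch, NUM_BUILD_ACTIONS)
    loss = loss_fn(logits, actions)   # actions: (batch,) integer class labels
    loss.backward()
    optimizer.step()
    return loss.item()
```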

    Automated Curriculum Learning by Rewarding Temporally Rare Events

    Reward shaping allows reinforcement learning (RL) agents to accelerate learning by receiving additional reward signals. However, these signals can be difficult to design manually, especially for complex RL tasks. We propose a simple and general approach that determines the reward of pre-defined events by their rarity alone. Here events become less rewarding as they are experienced more often, which encourages the agent to continually explore new types of events as it learns. The adaptiveness of this reward function results in a form of automated curriculum learning that does not have to be specified by the experimenter. We demonstrate that this Rarity of Events (RoE) approach enables the agent to succeed in challenging VizDoom scenarios without access to the extrinsic reward from the environment. Furthermore, the results demonstrate that RoE learns a more versatile policy that adapts well to critical changes in the environment. Rewarding events based on their rarity could help in many unsolved RL environments that are characterized by sparse extrinsic rewards but a plethora of known event types.
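    The core of the approach is an intrinsic reward that shrinks as an event type is observed more often. The sketch below is a simplified, hypothetical version using running counts per event type; the paper's actual formulation (for example, how occurrences are aggregated over episodes) may differ.

```python
# Hedged sketch of the Rarity-of-Events idea: reward each pre-defined event
# inversely to how often it has been seen, so frequent events fade out and the
# agent keeps seeking novel ones. Simple running counts stand in for the
# paper's aggregation scheme.
from collections import defaultdict

class RarityOfEventsReward:
    """Intrinsic reward that decays for each event type as it is seen more often."""
    def __init__(self):
        self.counts = defaultdict(int)

    def reward(self, events_this_step):
        """events_this_step: iterable of event-type names observed this step."""
        total = 0.0
        for event in events_this_step:
            self.counts[event] += 1
            total += 1.0 / self.counts[event]  # rarer events yield larger reward
        return total

# Usage: replace (or augment) the environment's extrinsic reward signal.
roe = RarityOfEventsReward()
print(roe.reward(["kill", "pickup_ammo"]))   # 2.0 (both event types are new)
print(roe.reward(["kill"]))                  # 0.5 (kill has now been seen twice)
```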

    Blood Bowl: The Next Board Game Challenge for AI


    Deep Neuroevolution of Recurrent and Discrete World Models

    Neural architectures inspired by our own human cognitive system, such as the recently introduced world models, have been shown to outperform traditional deep reinforcement learning (RL) methods in a variety of different domains. Instead of the relatively simple architectures employed in most RL experiments, world models rely on multiple different neural components that are responsible for visual information processing, memory, and decision-making. However, so far the components of these models have had to be trained separately and through a variety of specialized training methods. This paper demonstrates the surprising finding that models with the same precise parts can instead be trained efficiently end-to-end through a genetic algorithm (GA), reaching comparable performance to the original world model by solving a challenging car racing task. An analysis of the evolved visual and memory systems indicates that they include an effective representation similar to that of the system trained through gradient descent. Additionally, in contrast to gradient descent methods that struggle with discrete variables, GAs also work directly with such representations, opening up opportunities for classical planning in latent space. This paper adds further evidence of the effectiveness of deep neuroevolution for tasks that require the intricate orchestration of multiple components in complex heterogeneous architectures.
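    Training such a heterogeneous model end-to-end with a GA reduces to treating all component weights as one flat parameter vector and evolving it directly. The sketch below illustrates that idea with a simple elitist GA; the parameter count, population settings, and placeholder fitness function are assumptions, and evaluate() would in practice roll out the full vision-memory-controller model in the environment.

```python
# Hedged sketch: evolve a single flattened parameter vector covering every
# component of the model with an elitist GA and Gaussian mutation. All
# settings are illustrative, not those of the paper.
import numpy as np

PARAM_DIM = 10_000      # assumed total number of model parameters
POP_SIZE = 64
ELITE = 8
SIGMA = 0.02            # mutation standard deviation

def evaluate(params: np.ndarray) -> float:
    """Placeholder fitness: in practice, unroll the full model (vision +
    memory + controller) with these parameters, e.g. on the car racing task,
    and return the episode return."""
    return -float(np.sum(params ** 2))  # dummy objective so the sketch runs

population = [np.random.randn(PARAM_DIM) * 0.1 for _ in range(POP_SIZE)]
for generation in range(100):
    fitness = [evaluate(p) for p in population]
    elite_idx = np.argsort(fitness)[-ELITE:]          # keep the fittest
    elites = [population[i] for i in elite_idx]
    # Next generation: the elites plus Gaussian-mutated copies of them.
    population = list(elites)
    while len(population) < POP_SIZE:
        parent = elites[np.random.randint(ELITE)]
        population.append(parent + SIGMA * np.random.randn(PARAM_DIM))
```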

    Online evolution for multi-action adversarial games

    We present Online Evolution, a novel method for playing turn-based multi-action adversarial games. Such games, which include most strategy games, have extremely high branching factors because each turn consists of multiple actions. In Online Evolution, an evolutionary algorithm is used to evolve the combination of atomic actions that make up a single move, with a state evaluation function used for fitness. We implement Online Evolution for the turn-based multi-action game Hero Academy and compare it with a standard Monte Carlo Tree Search implementation as well as two types of greedy algorithms. Online Evolution is shown to outperform these methods by a large margin. This shows that evolutionary planning on the level of a single move can be very effective for this sort of problem.
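    A minimal sketch of Online Evolution for a single turn follows, assuming a hypothetical game interface with legal_actions(state), apply(state, action), and evaluate(state); the genome encoding, operators, and parameter settings are illustrative rather than those of the original implementation.

```python
# Hedged sketch of Online Evolution for one turn: a genome is the ordered list
# of atomic actions making up a move; fitness is a heuristic evaluation of the
# state reached after applying them. The game interface is assumed.
import random

ACTIONS_PER_TURN = 5
POP_SIZE = 100
GENERATIONS = 50
MUTATION_RATE = 0.2

def random_move(state, game):
    """Sample a random legal sequence of atomic actions for this turn."""
    move, s = [], state
    for _ in range(ACTIONS_PER_TURN):
        a = random.choice(game.legal_actions(s))
        move.append(a)
        s = game.apply(s, a)
    return move

def fitness(state, move, game):
    """Heuristic value of the state reached by applying the move, if legal."""
    s = state
    for a in move:
        if a not in game.legal_actions(s):
            return float("-inf")       # crossover/mutation produced an illegal move
        s = game.apply(s, a)
    return game.evaluate(s)

def online_evolution(state, game):
    pop = [random_move(state, game) for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        pop.sort(key=lambda m: fitness(state, m, game), reverse=True)
        survivors = pop[: POP_SIZE // 2]
        children = []
        for _ in range(POP_SIZE - len(survivors)):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, ACTIONS_PER_TURN)   # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < MUTATION_RATE:
                child[random.randrange(ACTIONS_PER_TURN)] = random.choice(
                    game.legal_actions(state))
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda m: fitness(state, m, game))
```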

    Algorithms for Adaptive Game-playing Agents


    Playing Multi-Action Adversarial Games: Online Evolutionary Planning versus Tree Search

    We address the problem of playing turn-based multi-action adversarial games, which include many strategy games with extremely high branching factors as players take multiple actions each turn. This leads to the breakdown of standard tree search methods, including Monte Carlo Tree Search (MCTS), as they become unable to reach a sufficient depth in the game tree. In this paper we introduce Online Evolutionary Planning (OEP) to address this challenge, which searches for combinations of actions to perform during a single turn guided by a fitness function that evaluates the quality of a particular state. We compare OEP to different MCTS variations that constrain the exploration to deal with the high branching factor in the turn-based multi-action game Hero Academy. While the constrained MCTS variations outperform the vanilla MCTS implementation by a large margin, OEP is able to search the space of plans more efficiently than any of the tested tree search methods, as it has a relative advantage when the number of actions per turn increases.
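    To see why multi-action turns overwhelm standard tree search, note that a turn made of K atomic actions chosen from roughly A candidates yields on the order of A^K distinct moves. The snippet below works through that arithmetic with purely illustrative numbers, not measurements from Hero Academy.

```python
# Hedged back-of-the-envelope on per-turn branching factor growth: with ~A
# legal atomic actions at each step and K actions per turn, the number of
# possible moves grows roughly as A**K. Numbers are illustrative assumptions.
A = 30  # assumed legal atomic actions available at each step
for K in (1, 3, 5):
    print(f"{K} action(s) per turn -> ~{A**K:,} possible moves")
# 1 action(s) per turn -> ~30 possible moves
# 3 action(s) per turn -> ~27,000 possible moves
# 5 action(s) per turn -> ~24,300,000 possible moves
```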