Biasing MCTS with Features for General Games
This paper proposes using a linear function approximator, rather than a deep
neural network (DNN), to bias a Monte Carlo tree search (MCTS) player for
general games. This is unlikely to match the potential raw playing strength of
DNNs, but has advantages in terms of generality, interpretability and resources
(time and hardware) required for training. Features describing local patterns
are used as inputs. The features are formulated in such a way that they are
easily interpretable and applicable to a wide range of general games, and might
encode simple local strategies. We gradually create new features during the
same self-play training process used to learn feature weights. We evaluate the
playing strength of an MCTS player biased by learnt features against a standard
upper confidence bounds for trees (UCT) player in multiple different board
games, and demonstrate significantly improved playing strength in the majority
of them after a small number of self-play training games.
Comment: Accepted at IEEE CEC 2019, Special Session on Games. Copyright of
final version held by IEE
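To make the idea in this abstract concrete, here is a minimal sketch of a linear function approximator over local features producing a move distribution that biases MCTS selection. The abstract does not give the formulas, so everything here is an assumption: the binary feature encoding, the softmax over linear scores, the PUCT-style selection rule, and all names (`linear_softmax_policy`, `biased_selection_score`, the example weights) are hypothetical.

```python
import math


def linear_softmax_policy(weights, move_features):
    """Turn linear feature scores into a probability per legal move.

    Assumption: each move is described by the indices of the binary
    features that are active for it, so its score is a sum of weights.
    """
    logits = [sum(weights[i] for i in active) for active in move_features]
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - highest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def biased_selection_score(q_value, prior, parent_visits, child_visits, c=1.0):
    """PUCT-style score (an assumed rule, not taken from the abstract).

    The learnt prior scales the exploration bonus, so moves favoured by
    the features are tried earlier in the search.
    """
    return q_value + c * prior * math.sqrt(parent_visits) / (1 + child_visits)


# Hypothetical weights for three features and three legal moves.
weights = [0.5, -0.2, 1.0]
move_features = [[0], [1], [0, 2]]
priors = linear_softmax_policy(weights, move_features)
```

Because each move's score is a plain sum of weights, a single weight can be read directly as how strongly its feature pushes the search towards a move, which is the interpretability advantage the abstract mentions.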
Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates
In recent years, state-of-the-art game-playing agents often involve policies
that are trained in self-playing processes where Monte Carlo tree search (MCTS)
algorithms and trained policies iteratively improve each other. The strongest
results have been obtained when policies are trained to mimic the search
behaviour of MCTS by minimising a cross-entropy loss. Because MCTS, by design,
includes an element of exploration, policies trained in this manner are also
likely to exhibit a similar extent of exploration. In this paper, we are
interested in learning policies for a project with future goals including the
extraction of interpretable strategies, rather than state-of-the-art
game-playing performance. For these goals, we argue that such an extent of
exploration is undesirable, and we propose a novel objective function for
training policies that are not exploratory. We derive a policy gradient
expression for maximising this objective function, which can be estimated using
MCTS value estimates, rather than MCTS visit counts. We empirically evaluate
various properties of resulting policies, in a variety of board games.
Comment: Accepted at the IEEE Conference on Games (CoG) 201
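The baseline this abstract contrasts itself with, training a policy to mimic MCTS by minimising a cross-entropy loss, can be sketched as follows. This is an illustrative sketch only: it assumes the target distribution is the normalised root visit counts and a softmax policy over logits, and the function names and numbers are hypothetical. The paper's own proposed objective is not reproduced here.

```python
import math


def softmax(logits):
    """Convert raw policy logits into a probability distribution."""
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def visit_count_targets(visit_counts):
    """Normalise MCTS visit counts into a target distribution."""
    total = sum(visit_counts)
    return [n / total for n in visit_counts]


def cross_entropy(targets, policy):
    """Cross-entropy H(targets, policy); zero-probability targets are skipped."""
    return -sum(t * math.log(p) for t, p in zip(targets, policy) if t > 0)


# Hypothetical root statistics for three legal moves.
visit_counts = [60, 30, 10]
targets = visit_count_targets(visit_counts)

# An untrained policy with equal logits is uniform over the moves.
uniform_policy = softmax([0.0, 0.0, 0.0])
loss = cross_entropy(targets, uniform_policy)
```

Since visit counts include the moves MCTS tried only to explore, a policy that minimises this loss inherits that exploration, which is the property the abstract argues is undesirable for extracting interpretable strategies.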
Automated Game Design Learning
While general game playing is an active field of research, the learning of
game design has tended to be either a secondary goal of such research or it has
been solely the domain of humans. We propose a field of research, Automated
Game Design Learning (AGDL), with the direct purpose of learning game designs
directly through interaction with games in the mode that most people experience
games: via play. We detail existing work that touches the edges of this field,
describe current successful projects in AGDL and the theoretical foundations
that enable them, point to promising applications enabled by AGDL, and discuss
next steps for this exciting area of study. The key moves of AGDL are to use
game programs as the ultimate source of truth about their own design, and to
make these design properties available to other systems and avenues of inquiry.
Comment: 8 pages, 2 figures. Accepted for CIG 201
Generating Levels That Teach Mechanics
The automatic generation of game tutorials is a challenging AI problem. While
it is possible to generate annotations and instructions that explain to the
player how the game is played, this paper focuses on generating a gameplay
experience that introduces the player to a game mechanic. It evolves small
levels for the Mario AI Framework that can only be beaten by an agent that
knows how to perform specific actions in the game. It uses variations of a
perfect A* agent that are limited in various ways, such as not being able to
jump high or see enemies, to test how failing to do certain actions can stop
the player from beating the level.
Comment: 8 pages, 7 figures, PCG Workshop at FDG 2018, 9th International
Workshop on Procedural Content Generation (PCG2018
Generating Instructions in a 3D Game Environment: Efficiency or Entertainment?
The GIVE Challenge was designed for the evaluation of natural language generation (NLG) systems. It involved the automatic generation of instructions for users in a 3D environment. In this paper we introduce two NLG systems that we developed for this challenge. One system focused on generating optimally helpful instructions while the other focused on entertainment. We used the data gathered in the Challenge to compare the efficiency and entertainment value of both systems. We found a clear difference in efficiency, but were unable to prove that one system was more entertaining than the other. This could be explained by the fact that the set-up and evaluation methods of the GIVE Challenge were not aimed at entertainment.
Helping AI to Play Hearthstone: AAIA'17 Data Mining Challenge
This paper summarizes the AAIA'17 Data Mining Challenge: Helping AI to Play
Hearthstone, which was held between March 23 and May 15, 2017, at the Knowledge
Pit platform. We briefly describe the scope and background of this competition
in the context of a more general project related to the development of an AI
engine for video games, called Grail. We also discuss the outcomes of this
challenge and demonstrate how predictive models for the assessment of a player's
winning chances can be utilized in the construction of an intelligent agent for
playing Hearthstone. Finally, we show a few selected machine learning
approaches for modeling state and action values in Hearthstone. We provide
evaluation for a few promising solutions that may be used to create more
advanced types of agents, especially in conjunction with Monte Carlo Tree
Search algorithms.
Comment: Federated Conference on Computer Science and Information Systems,
Prague (FedCSIS-2017) (Prague, Czech Republic