139 research outputs found
Biasing MCTS with Features for General Games
This paper proposes using a linear function approximator, rather than a deep neural network (DNN), to bias a Monte Carlo tree search (MCTS) player for general games. This is unlikely to match the potential raw playing strength of DNNs, but has advantages in generality, interpretability, and the resources (time and hardware) required for training. Features describing local patterns are used as inputs. The features are formulated in such a way that they are easily interpretable, applicable to a wide range of general games, and may encode simple local strategies. We gradually create new features during the same self-play training process used to learn the feature weights. We evaluate the playing strength of an MCTS player biased by learnt features against a standard upper confidence bounds for trees (UCT) player in multiple different board games, and demonstrate significantly improved playing strength in the majority of them after a small number of self-play training games.
Comment: Accepted at IEEE CEC 2019, Special Session on Games. Copyright of final version held by IEEE.
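A minimal sketch of the core idea, assuming a PUCT-style selection rule, binary local-pattern features, and a softmax over the linear scores (the node fields and mixing scheme below are illustrative assumptions, not the paper's exact formulation):

    import math

    def softmax(xs):
        m = max(xs)
        es = [math.exp(x - m) for x in xs]
        s = sum(es)
        return [e / s for e in es]

    def linear_score(features, weights):
        # Linear function approximator standing in for a DNN policy head:
        # a dot product of learnt weights with binary pattern features.
        return sum(w * f for w, f in zip(weights, features))

    def select_child(node, weights, c_puct=2.0):
        # PUCT-style selection biased by the linear evaluator's output.
        priors = softmax([linear_score(ch.features, weights)
                          for ch in node.children])

        def score(ch, p):
            q = ch.value_sum / ch.visits if ch.visits else 0.0
            u = c_puct * p * math.sqrt(node.visits) / (1 + ch.visits)
            return q + u

        return max(zip(node.children, priors),
                   key=lambda cp: score(*cp))[0]

Here `node` is assumed to carry `visits`, `value_sum`, `children`, and per-child `features`; the learnt weights are shared across the whole tree.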
Fast Evolutionary Adaptation for Monte Carlo Tree Search
This paper describes a new adaptive Monte Carlo Tree Search (MCTS) algorithm that uses evolution to rapidly optimise its performance. An evolutionary algorithm serves as a source of control parameters that modify the behaviour of each iteration (i.e. each simulation or roll-out) of the MCTS algorithm; in this paper we largely restrict this to modifying the behaviour of the random default policy, though it can also be applied to modify the tree policy.
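A rough sketch of such a per-iteration loop, assuming a rollout policy parameterised by a weight vector and a simple (1+1)-style evolution strategy as the source of control parameters (the mutation scheme and function names are illustrative assumptions):

    import random

    def mutate(weights, sigma=0.1):
        # The EA proposes new rollout-policy parameters by Gaussian perturbation.
        return [w + random.gauss(0.0, sigma) for w in weights]

    def evolve_rollout_policy(root, simulate, weights, iterations=1000):
        # `simulate` runs one MCTS iteration whose random default policy is
        # replaced by a policy parameterised by the candidate weights, and
        # returns the simulation result from the root player's perspective.
        best, best_fit = weights, float("-inf")
        for _ in range(iterations):
            candidate = mutate(best)
            # The simulation result doubles as the candidate's fitness,
            # so adaptation happens online, inside the search itself.
            fit = simulate(root, candidate)
            if fit > best_fit:
                best, best_fit = candidate, fit
        return best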
A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
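For reference, the core selection rule the surveyed algorithm family builds on is UCB1 applied at every tree node (UCT); a minimal sketch, with the exploration constant left as a tunable parameter:

    import math

    def uct_value(child_value_sum, child_visits, parent_visits, c=1.414):
        # UCB1: mean reward plus an exploration bonus that shrinks as the
        # child is visited more often relative to its parent.
        if child_visits == 0:
            return float("inf")  # unvisited children are tried first
        mean = child_value_sum / child_visits
        return mean + c * math.sqrt(math.log(parent_visits) / child_visits)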
Warm-Start AlphaZero Self-Play Search Enhancements
Recently, AlphaZero has achieved landmark results in deep reinforcement learning by providing a single self-play architecture that learned three different games at superhuman level. AlphaZero is a large and complicated system with many parameters, and success requires much compute power and fine-tuning. Reproducing results in other games is a challenge, and many researchers are looking for ways to improve results while reducing computational demands. AlphaZero's design is purely based on self-play and makes no use of labeled expert data or domain-specific enhancements; it is designed to learn from scratch. We propose a novel approach to deal with this cold-start problem by employing simple search enhancements at the beginning phase of self-play training, namely Rollout, Rapid Action Value Estimate (RAVE), dynamically weighted combinations of these with the neural network, and Rolling Horizon Evolutionary Algorithms (RHEA). Our experiments indicate that most of these enhancements improve the performance of their baseline player in three different (small) board games, with the RAVE-based variants in particular playing strongly.
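A minimal sketch of the warm-start idea, assuming a linear decay from the classic estimators (Rollout/RAVE) toward the network's value as training progresses; the schedule and names are illustrative assumptions, not the paper's exact weighting:

    def warm_start_value(enhancement_value, network_value, iteration, warmup=50):
        # Early in self-play training the untrained network is unreliable,
        # so the classic estimator carries most of the weight; the mix
        # shifts linearly toward the network over the first `warmup`
        # training iterations.
        w = max(0.0, 1.0 - iteration / warmup)
        return w * enhancement_value + (1.0 - w) * network_value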
The 2016 Two-Player GVGAI Competition
This paper showcases the setting and results of the first Two-Player General Video Game AI competition, which ran in 2016 at the IEEE World Congress on Computational Intelligence and the IEEE Conference on Computational Intelligence and Games. The challenges for the general game AI agents are expanded in this track from the single-player version, looking at direct player interaction in both competitive and cooperative environments of various types and degrees of difficulty. The focus is on the agents not only handling multiple problems, but also having to account for another intelligent entity in the game, which is expected to work towards its own goal (winning the game) and will possibly interact with the first agent in a more engaging way than the environment or any non-player character would. The top competition entries are analyzed in detail and the performance of all agents is compared across the four sets of games. The results validate the competition system in assessing generality, as well as showing Monte Carlo Tree Search continuing to dominate by winning the overall Championship. However, this approach is closely followed by Rolling Horizon Evolutionary Algorithms, employed by the winner of the second leg of the contest.
On Monte Carlo Tree Search and Reinforcement Learning
Fuelled by successes in Computer Go, Monte Carlo tree search (MCTS) has achieved widespread adoption within the games community. Its links to traditional reinforcement learning (RL) methods have been outlined in the past; however, the use of RL techniques within tree search has not yet been thoroughly studied. In this paper we re-examine in depth this close relation between the two fields; our goal is to improve the cross-awareness between the two communities. We show that a straightforward adaptation of RL semantics within tree search can lead to a wealth of new algorithms, of which traditional MCTS is only one variant. We confirm that planning methods inspired by RL, in conjunction with online search, demonstrate encouraging results on several classic board games and in arcade video game competitions, where our algorithm recently ranked first. Our study promotes a unified view of learning, planning, and search.
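One concrete instance of importing RL semantics into tree search is replacing the Monte Carlo averaging backup with a bootstrapped, TD(lambda)-style update along the visited path; a minimal sketch, assuming node objects with `visits` and `value` fields (the step-size schedule is an illustrative assumption):

    def td_backup(path, reward, gamma=1.0, lam=0.9):
        # `path` lists the nodes visited this simulation, root first.
        # The return is propagated from the leaf back towards the root,
        # partially bootstrapping from each node's current value estimate
        # instead of using the raw outcome alone (lam=1, gamma=1 recovers
        # the plain Monte Carlo backup of standard MCTS).
        g = reward
        for node in reversed(path):
            node.visits += 1
            alpha = 1.0 / node.visits            # MC-style step size
            node.value += alpha * (g - node.value)
            # Interpolate between the bootstrapped estimate and the return.
            g = (1.0 - lam) * node.value + lam * gamma * g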
Action Guidance with MCTS for Deep Reinforcement Learning
Deep reinforcement learning has achieved great successes in recent years; however, one main challenge is sample inefficiency. In this paper, we focus on how to use action guidance by means of a non-expert demonstrator to improve sample efficiency in a domain with sparse, delayed, and possibly deceptive rewards: the recently proposed multi-agent benchmark of Pommerman. We propose a new framework in which even a non-expert simulated demonstrator, e.g., a planning algorithm such as Monte Carlo tree search with a small number of rollouts, can be integrated within asynchronous distributed deep reinforcement learning methods. Compared to a vanilla deep RL algorithm, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.
Comment: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19). arXiv admin note: substantial text overlap with arXiv:1904.05759, arXiv:1812.0004
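A minimal sketch of one way such guidance can enter the learner's update, assuming an auxiliary cross-entropy term toward the demonstrator's action added to the policy-gradient loss (the weighting and names are illustrative assumptions, not the paper's exact framework):

    import math

    def guided_policy_loss(action_probs, taken_action, advantage,
                           demonstrator_action, aux_weight=0.5):
        # Standard policy-gradient term for the action the agent itself took.
        pg_loss = -math.log(action_probs[taken_action]) * advantage
        # Auxiliary imitation term: cross-entropy toward the action chosen
        # by the (possibly non-expert) shallow-MCTS demonstrator in the
        # same state.
        imitation_loss = -math.log(action_probs[demonstrator_action])
        return pg_loss + aux_weight * imitation_loss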