554 research outputs found

A Survey of Monte Carlo Tree Search Methods

Author: Browne Cameron B
Colton Simon
Cowling Peter I
Lucas Simon M
Perez Diego
Powley Edward
Rohlfshagen Philipp
Samothrakis Spyridon
Tavener Stephen
Whitehouse Daniel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work
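As a rough illustration of the core loop the survey refers to (selection, expansion, simulation, backpropagation), here is a minimal sketch using the common UCB1-based variant (UCT). It is not taken from the paper: the toy game (a Nim variant where players remove 1 to 3 stones and whoever takes the last stone wins), the `Node` class, and the exploration constant `c=1.4` are all assumptions chosen for brevity.

```python
import math
import random

class Node:
    """Hypothetical tree node for a toy Nim game (take 1-3 stones, last stone wins)."""
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left for the player to move
        self.parent = parent
        self.move = move          # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node

def ucb1(child, parent_visits, c=1.4):
    # Average reward plus an exploration bonus (c is an assumed constant).
    return child.wins / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def rollout(stones, rng):
    """Random playout; True if the player to move at `stones` ends up winning."""
    start_player_moves, start_player_wins = True, False
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            start_player_wins = start_player_moves
        start_player_moves = not start_player_moves
    return start_player_wins

def mcts(root_stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded and non-terminal.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one unexplored child, if any.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: the player who moved into `node` wins iff the player to move loses.
        mover_wins = not rollout(node.stones, rng)
        # 4. Backpropagation: flip the perspective at every level on the way up.
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if mover_wins else 0.0
            mover_wins = not mover_wins
            node = node.parent
    return root

root = mcts(root_stones=5, iterations=2000)
best = max(root.children, key=lambda ch: ch.visits)  # recommend the most visited move
```

Recommending the most visited root child is one common convention; other choices, such as the child with the highest average reward, are also used.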

University of Essex Research Repository

CiteSeerX

Maastricht University Research Portal

Crossref

On Monte Carlo Tree Search and Reinforcement Learning

Author: Šter Branko
Samothrakis Spyridon
Vodopivec Tom
Publication venue: 'AI Access Foundation'
Publication date: 20/12/2017

Fuelled by successes in Computer Go, Monte Carlo tree search (MCTS) has achieved widespread adoption within the games community. Its links to traditional reinforcement learning (RL) methods have been outlined in the past; however, the use of RL techniques within tree search has not been thoroughly studied yet. In this paper we re-examine in depth this close relation between the two fields; our goal is to improve the cross-awareness between the two communities. We show that a straightforward adaptation of RL semantics within tree search can lead to a wealth of new algorithms, for which the traditional MCTS is only one of the variants. We confirm that planning methods inspired by RL in conjunction with online search demonstrate encouraging results on several classic board games and in arcade video game competitions, where our algorithm recently ranked first. Our study promotes a unified view of learning, planning, and search

University of Essex Research Repository

Crossref

Exploiting Game Decompositions in Monte Carlo Tree Search

Author: Hufschmitt Aline
Jouandeau Nicolas
Vittaut Jean-Noël
Publication venue: HAL CCSD
Publication date: 01/08/2019

In this paper, we propose a variation of the MCTS framework that searches several trees in order to exploit game decompositions. Our Multiple Tree MCTS (MT-MCTS) approach simultaneously builds multiple MCTS trees corresponding to the different sub-games and, like other MCTS algorithms, can evaluate moves while playing. We apply MT-MCTS to decomposed games in the General Game Playing framework. We present encouraging results on single-player games, showing that this approach is promising and opens new avenues for further research on exploiting decompositions. Complex compound games are solved from 2 times faster (Incredible) up to 25 times faster (Nonogram).

Decentralized Cooperative Planning for Automated Vehicles with Continuous Monte Carlo Tree Search

Author: Engelhorn Florian
Kurzer Karl
Zöllner J. Marius
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/09/2018

Urban traffic scenarios often require a high degree of cooperation between traffic participants to ensure safety and efficiency. Observing the behavior of others, humans infer whether or not others are cooperating. This work aims to extend the capabilities of automated vehicles, enabling them to cooperate implicitly in heterogeneous environments. Continuous actions allow for arbitrary trajectories and hence are applicable to a much wider class of problems than existing cooperative approaches with discrete action spaces. Based on cooperative modeling of other agents, Monte Carlo Tree Search (MCTS) in conjunction with Decoupled-UCT evaluates the action-values of each agent in a cooperative and decentralized way, respecting the interdependence of actions among traffic participants. The extension to continuous action spaces is addressed by incorporating novel MCTS-specific enhancements for efficient search space exploration. The proposed algorithm is evaluated under different scenarios, showing that the algorithm is able to achieve effective cooperative planning and generate solutions egocentric planning fails to identify

arXiv.org e-Print Archive

Crossref

Enhancing the Monte Carlo Tree Search Algorithm for Video Game Testing

Author: Ariyurek Sinan
Betin-Can Aysu
Surer Elif
Publication date: 01/03/2020

In this paper, we study the effects of several Monte Carlo Tree Search (MCTS) modifications for video game testing. Although MCTS modifications have been studied extensively in game playing, their impact on finding bugs has not been explored. In our previous study, we focused on bug finding: we introduced synthetic and human-like test goals and used them in Sarsa and MCTS agents to find bugs. In this study, we extend the MCTS agent with several modifications for game testing purposes. Furthermore, we present a novel tree reuse strategy. We experiment with these modifications by testing them on three testbed games, four levels each, that contain 45 bugs in total. We use the General Video Game Artificial Intelligence (GVG-AI) framework to create the testbed games and to collect 427 human tester trajectories. We analyze the proposed modifications in three parts: we evaluate their effects on the bug-finding performance of the agents, we measure their success under two different computational budgets, and we assess their effects on the human-likeness of the human-like agent. Our results show that MCTS modifications improve the bug-finding performance of the agents.

arXiv.org e-Print Archive

Crossref

OpenMETU (Middle East Technical University)

Guiding Monte Carlo tree searches with neural networks in the game of Go

Author: Ferreira Gonçalo Antunes Mendes
Publication venue: Instituto Superior de Engenharia de Lisboa
Publication date: 06/06/2016

A dissertation submitted in fulfillment of the requirements for the degree of Master in Computer Science and Computer Engineering.

Summary (translated from Portuguese): The board game Go is one of the few deterministic games in which computers still cannot consistently beat professional human players. In this work, two learning methods (a genetic algorithm and training by error backpropagation) are used to create neural networks capable of assisting a Monte Carlo tree search algorithm. The latter algorithm has been the most successful in the last decade of research on Go. Using a neural network is an approach that is undergoing a revival, following the recent successes of deep convolutional neural networks. These, however, require resources that are still often prohibitive. This work explores the impact of simpler neural networks and the production of software representative of the state of the art. To this end, it is complemented with techniques for Monte Carlo tree search, automatic knowledge acquisition, tree parallelism, optimization, and other problems that arise in computing applied to the game of Go. The resulting software, Matilda, is therefore the culmination of a set of experiments in this area.

Abstract: The game of Go remains one of the few deterministic perfect information games where computer players still struggle against professional human players. In this work two methods of deriving artificial neural networks (genetic evolution of symbiotic populations, and training of multilayer perceptron networks with backpropagation) are analyzed for the production of a neural network suitable for guiding a Monte Carlo tree search algorithm. This last family of algorithms has been the most successful in computer Go software in the last decade. Using a neural network to reduce the branching complexity of the search is an approach to the problem that is currently being revitalized by the application of deep convolutional neural networks (DCNNs). DCNNs, however, require computational resources that many computers still lack. This work explores the impact of simpler neural networks for the purpose of guiding Monte Carlo tree searches, and the production of a state-of-the-art computer Go program. To this end, several improvements to Monte Carlo tree search are also explored. The work is further built upon with considerations related to the parallelization of the search and the addition of other components necessary for competitive programs, such as time control mechanisms and opening books. Time considerations for playing against humans are also proposed for an extra psychological advantage. The final software, named Matilda, is not only the sum of a series of experimental parts surrounding Monte Carlo tree search applied to Go, but also an attempt at the strongest possible solution for shared memory systems.

Repositório Científico do Instituto Politécnico de Lisboa

Performance of the Parallelized Monte-Carlo Tree Search Approach for Dots and Boxes

Author: Agrawal Pranay
Ziegler Uta
Publication venue: Murray State's Digital Commons
Publication date: 08/11/2018

Monte-Carlo tree search (MCTS) is a method designed to solve difficult learning problems. MCTS performs random simulations from the current situation and stores the results in order to distinguish decisions based on their past success. MCTS then selects the best decision and repeats the process. Parallelizing MCTS means dividing the learning process among independent learners. Then, after a fixed number of simulations, the data is shared and combined. Past research has shown that this approach is faster than non-parallelized approaches. We therefore anticipated that the time saved by dividing the learning would outweigh the potential costs of redundant learning. Since it is often difficult to determine the effectiveness of algorithms in complex environments, it is sometimes more advantageous to develop strategies in simple environments, such as games, that can then be translated for use in broader real-life fields. In this project, we explored how controlling various resources affected the win-ratio performance in the game Dots and Boxes of a player trained through a parallelized Monte Carlo Tree Search approach. The factors that we manipulated included the number of simulations, the number of independent learners, the amount of information shared by these independent learners, and how frequently the independent learners share. The win-ratio performance was determined by dividing the number of wins by the total number of games. An algorithm is presented with our findings, along with details and results of our modified Monte-Carlo tree search implementation.
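The idea of independent learners whose statistics are shared and combined can be sketched as follows. This is not the authors' implementation: it uses a toy Nim game instead of Dots and Boxes, flat Monte Carlo sampling instead of a full tree, a single sharing round, and learners that run one after another rather than in parallel. All function names (`learner`, `combine`, `win_ratio`) are hypothetical.

```python
import random
from collections import Counter

def random_playout(stones, rng):
    """Toy Nim (take 1-3 stones, last stone wins); True if the player to move wins."""
    start_player_moves, start_player_wins = True, False
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            start_player_wins = start_player_moves
        start_player_moves = not start_player_moves
    return start_player_wins

def learner(stones, simulations, seed):
    """One independent learner: sample first moves at random and record the outcomes."""
    rng = random.Random(seed)
    wins, plays = Counter(), Counter()
    moves = [m for m in (1, 2, 3) if m <= stones]
    for _ in range(simulations):
        move = rng.choice(moves)
        plays[move] += 1
        # We win if the opponent, who moves next, loses the random playout.
        if not random_playout(stones - move, rng):
            wins[move] += 1
    return wins, plays

def combine(results):
    """Share and combine: sum the per-move statistics of all learners."""
    total_wins, total_plays = Counter(), Counter()
    for wins, plays in results:
        total_wins.update(wins)
        total_plays.update(plays)
    return total_wins, total_plays

def win_ratio(wins, games):
    """Number of wins over the number of total games."""
    return wins / games if games else 0.0

# Four learners with different seeds; each could run in its own process.
results = [learner(stones=5, simulations=500, seed=s) for s in range(4)]
wins, plays = combine(results)
best = max(plays, key=lambda m: win_ratio(wins[m], plays[m]))
```

Because the learners only exchange summed counts, the merge step is cheap, but learners may duplicate each other's work, which is the redundancy cost the abstract mentions.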

Murray State University