
    Monte-Carlo tree search with heuristic knowledge: A novel way in solving capturing and life and death problems in Go

    Monte-Carlo (MC) tree search is a comparatively new research field. Its effectiveness in searching large state spaces, such as the Go game tree, is well recognized in the computer Go community. This dissertation systematically investigates Go domain-specific heuristics and techniques, as well as domain-independent heuristics and techniques, in the context of MC tree search. Search extensions based on these heuristics and techniques can significantly improve the effectiveness and efficiency of MC tree search. Two major areas of investigation are addressed: I. the identification and use of effective heuristic knowledge for guiding MC simulations; II. the extension of the MC tree search algorithm with heuristics. Go, one of the most challenging board games for machines, serves as the test bed. The effectiveness of the MC tree search extensions is demonstrated through the performance of Go tactic problem solvers using these techniques. The main contributions of this dissertation are: 1. a heuristics-based Monte-Carlo tactic tree search framework that extends standard Monte-Carlo tree search; 2. a systematic investigation of (Go) knowledge-based heuristics to improve Monte-Carlo tactic tree search; 3. a demonstration that pattern learning is effective in improving Monte-Carlo tactic tree search; 4. domain-knowledge-independent tree search enhancements shown to be effective in improving Monte-Carlo tactic tree search performance; 5. a strong Go tactic solver, based on the proposed algorithms, that outperforms traditional game tree search algorithms. The techniques developed in this dissertation research can benefit other game domains and application fields.
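As a rough illustration of the general idea of guiding MC simulations with heuristic knowledge (not the dissertation's actual implementation), a playout policy can bias its random move choice by heuristic scores instead of sampling uniformly. The `heuristic` callback and all names here are hypothetical:

```python
import math
import random

def sample_move(moves, heuristic, temperature=1.0):
    """Pick one playout move, weighting candidates by a softmax over
    heuristic scores rather than sampling uniformly at random."""
    weights = [math.exp(heuristic(m) / temperature) for m in moves]
    return random.choices(moves, weights=weights, k=1)[0]
```

A lower `temperature` makes the playout follow the heuristic more greedily; `temperature -> infinity` recovers the uniform-random playouts of plain Monte-Carlo search.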

    Monte-Carlo Search in Games

    This paper implements and analyzes four algorithms for improving computer play of the board game Go. These algorithms use machine pattern learning to find better Monte-Carlo simulation policies for use with Monte-Carlo Tree Search. Two of the algorithms maximize individual move strength, and two minimize overall simulation error. All four are tested using UCT on 9x9 Go with 3x3 patterns.
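To make the 3x3-pattern idea concrete, here is a minimal sketch (my own illustration, not the paper's code) of how a learned pattern table might bias a simulation policy: each candidate point is keyed by its 3x3 neighbourhood, and a learned weight is looked up. The board representation and function names are assumptions:

```python
def pattern_key(board, x, y):
    """Encode the 3x3 neighbourhood around (x, y) as a 9-tuple.
    board maps (x, y) -> 'B', 'W', or '.'; off-board points read '#'."""
    return tuple(
        board.get((x + dx, y + dy), '#')
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    )

def policy_weight(board, x, y, pattern_weights, default=1.0):
    """Learned simulation-policy weight for playing at (x, y);
    patterns never seen during training fall back to a default."""
    return pattern_weights.get(pattern_key(board, x, y), default)
```

A playout policy would then sample moves in proportion to these weights; the learning algorithms in the paper differ in how the weight table is fitted.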

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
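The core selection step the survey covers is UCT, which applies the UCB1 bandit formula at each tree node: exploit the child's empirical win rate, plus an exploration bonus that shrinks as the child is visited more. A minimal sketch (illustrative, not any particular engine's code):

```python
import math

def uct_score(child_wins, child_visits, parent_visits, c=1.414):
    """UCB1 applied to tree search: win rate plus exploration bonus."""
    if child_visits == 0:
        return float('inf')  # unvisited children are tried first
    return (child_wins / child_visits
            + c * math.sqrt(math.log(parent_visits) / child_visits))

def select_child(children):
    """children: list of (wins, visits) pairs; returns the index of the
    child maximizing the UCT score."""
    parent_visits = sum(v for _, v in children)
    return max(range(len(children)),
               key=lambda i: uct_score(children[i][0], children[i][1],
                                       parent_visits))
```

The constant `c` trades off exploration against exploitation; `sqrt(2)` is the textbook value, and engines tune it empirically.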

    Guiding Monte Carlo tree searches with neural networks in the game of go

    A dissertation submitted in fulfillment of the requirements for the degree of Master in Computer Science and Computer Engineering.
    Abstract: The game of Go remains one of the few deterministic perfect-information games where computer players still struggle against professional human players. In this work, two methods of deriving artificial neural networks (genetic evolution of symbiotic populations, and training of multilayer perceptron networks with backpropagation) are analyzed for producing a neural network suitable for guiding a Monte Carlo tree search algorithm. This family of algorithms has been the most successful in computer Go software over the last decade. Using a neural network to reduce the branching complexity of the search is an approach currently being revitalized with the advent of deep convolutional neural networks (DCNNs). DCNNs, however, require computational resources that many computers still don't have. This work explores the impact of simpler neural networks for guiding Monte Carlo tree searches, and the production of a state-of-the-art computer Go program. To this end, several improvements to Monte Carlo tree searches are also explored. The work is further built upon with considerations related to parallelization of the search and the addition of other components necessary for competitive programs, such as time-control mechanisms and opening books. Time considerations for playing against humans are also proposed, for an extra psychological advantage. The final software, named Matilda, is not only the sum of a series of experimental parts surrounding Monte Carlo Tree Search applied to Go, but also an attempt at the strongest possible solution for shared-memory systems.

    Improving MCTS and Neural Network Communication in Computer Go

    In March 2016, AlphaGo, a computer Go program developed by Google DeepMind, won a 5-game match against Lee Sedol, one of the best Go players in the world. Its victory marks a major advance in the field of computer Go. However, much remains to be done. There is a gap between the computational power AlphaGo used in the match and the computational power available to the majority of computer users today. Further, the communication between two of the techniques used by AlphaGo, neural networks and Monte Carlo Tree Search, can be improved. We investigate four different approaches towards accomplishing this end, with a focus on methods that require minimal computational power. Each method shows promise and can be developed further

    On monte carlo tree search and reinforcement learning

    Fuelled by successes in computer Go, Monte Carlo tree search (MCTS) has achieved widespread adoption within the games community. Its links to traditional reinforcement learning (RL) methods have been outlined in the past; however, the use of RL techniques within tree search has not yet been thoroughly studied. In this paper we re-examine in depth this close relation between the two fields; our goal is to improve the cross-awareness between the two communities. We show that a straightforward adaptation of RL semantics within tree search can lead to a wealth of new algorithms, of which traditional MCTS is only one variant. We confirm that planning methods inspired by RL, in conjunction with online search, demonstrate encouraging results on several classic board games and in arcade video game competitions, where our algorithm recently ranked first. Our study promotes a unified view of learning, planning, and search.
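One way to see the MCTS-RL connection the paper develops: the standard Monte-Carlo backup (averaging playout returns at a node) is exactly reinforcement learning's incremental mean update with step size 1/n, and replacing that step size or target gives TD-style variants. A small sketch of both updates (my illustration, not the paper's notation):

```python
def backup_mc(value, visits, playout_return):
    """Monte-Carlo backup: running mean of playout returns.
    Equivalent to the RL update V <- V + a * (G - V) with a = 1/n."""
    visits += 1
    value += (playout_return - value) / visits
    return value, visits

def backup_td(value, reward, next_value, alpha=0.1, gamma=1.0):
    """TD(0)-style backup, one of the RL-flavoured variants a unified
    view of tree search admits in place of the plain average."""
    return value + alpha * (reward + gamma * next_value - value)
```

With `alpha = 1/n` and the full return as target, `backup_td` collapses back to `backup_mc`, which is the sense in which classic MCTS is one point in a larger family.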

    Improvements to MCTS Simulation Policies in Go

    Since its introduction in 2006, Monte-Carlo Tree Search has been a major breakthrough in computer Go. The performance of an MCTS engine is highly dependent on the quality of its simulations; despite this, simulations remain one of the most poorly understood aspects of MCTS. In this paper, we explore in depth the simulation policy of Pachi, an open-source computer Go agent. This research attempts to better understand how simulation policies affect the overall performance of MCTS, building on prior work in the field. Through this research we develop a deeper understanding of the underlying components of Pachi's simulation policy, which are common to many modern MCTS Go engines, and evaluate the metrics used to measure them.

    Beyond Monte Carlo Tree Search: Playing Go with Deep Alternative Neural Network and Long-Term Evaluation

    Monte Carlo tree search (MCTS) is extremely popular in computer Go; it determines each action through enormous numbers of simulations in a broad and deep search tree. However, human experts select most actions by pattern analysis and careful evaluation rather than brute-force search over millions of future interactions. In this paper, we propose a computer Go system that follows experts' way of thinking and playing. Our system consists of two parts. The first part is a novel deep alternative neural network (DANN) used to generate candidates for the next move. Compared with existing deep convolutional neural networks (DCNNs), DANN inserts a recurrent layer after each convolutional layer and stacks them in an alternating manner. We show that this setting can preserve more context of local features and their evolution, which is beneficial for move prediction. The second part is a long-term evaluation (LTE) module used to provide a reliable evaluation of candidates rather than a single probability from the move predictor. This is consistent with human experts' style of play, since they can foresee tens of moves ahead to give an accurate estimation of candidates. In our system, for each candidate, LTE calculates a cumulative reward after several future interactions, once local variations are settled. Combining the criteria from the two parts, our system determines the optimal choice of next move. For more comprehensive experiments, we introduce a new professional Go dataset (PGD) consisting of 253,233 professional records. Experiments on the GoGoD and PGD datasets show that DANN can substantially improve move-prediction performance over a pure DCNN. When combined with LTE, our system outperforms most relevant approaches and open engines based on MCTS.
    Comment: AAAI 201
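The LTE idea of scoring a candidate by the reward accumulated over a local continuation can be sketched, very loosely, as a discounted sum over the rewards observed until the local variation settles. This is only a stand-in for the paper's module; the function and its inputs are hypothetical:

```python
def long_term_evaluation(rewards, gamma=0.99):
    """Cumulative discounted reward over a local continuation: a rough
    stand-in for an LTE-style score, where `rewards` lists per-step
    rewards until the local variation is settled."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total
```

Candidates proposed by the move-prediction network would then be ranked by this score rather than by the network's single output probability.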

    A Controlled Searching of Game Trees

    Title: A Controlled Searching of Game Trees
    Author: Jan Vrba
    Department: Department of Theoretical Computer Science and Mathematical Logic, Faculty of Mathematics and Physics
    Supervisor: RNDr. Jan Hric
    Abstract: Monte-Carlo Tree Search is a search algorithm based on random Monte-Carlo playouts. Since it was first introduced in 2006, it has been successfully used in several areas, most notably for the game of Go. MCTS is intended mainly for problems whose state space is too large to be fully explored in reasonable time. Working with a large state space, together with the fact that expanding a node first tries every possible move once, leads to large memory requirements. This work explores the options a user has for regulating memory consumption based on the results of previous Monte-Carlo playouts.
    Keywords: MCTS, UCT, BMCTS, RAVE
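One common way to bound the memory MCTS consumes, in the spirit of the regulation this thesis studies (though not necessarily its exact scheme), is delayed expansion: a leaf only allocates child nodes after it has accumulated enough playouts. A minimal sketch with hypothetical names:

```python
class Node:
    """MCTS node that defers expansion until visited enough times,
    slowing the rate at which the tree consumes memory."""

    def __init__(self, moves):
        self.untried = list(moves)
        self.children = {}   # move -> Node
        self.visits = 0

    def maybe_expand(self, legal_moves_of, threshold=8):
        """Count a visit; allocate child nodes only once this node has
        reached `threshold` visits. Returns the children dict."""
        self.visits += 1
        if self.visits >= threshold and self.untried:
            for move in self.untried:
                self.children[move] = Node(legal_moves_of(move))
            self.untried = []
        return self.children
```

Raising `threshold` trades search depth for memory: nodes reached by few playouts never materialize their children at all.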