733 research outputs found

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work

    Unifying an Introduction to Artificial Intelligence Course through Machine Learning Laboratory Experiences

    This paper presents work on a collaborative project funded by the National Science Foundation that incorporates machine learning as a unifying theme to teach fundamental concepts typically covered in the introductory Artificial Intelligence courses. The project involves the development of an adaptable framework for the presentation of core AI topics. This is accomplished through the development, implementation, and testing of a suite of adaptable, hands-on laboratory projects that can be closely integrated into the AI course. Through the design and implementation of learning systems that enhance commonly-deployed applications, our model acknowledges that intelligent systems are best taught through their application to challenging problems. The goals of the project are to (1) enhance the student learning experience in the AI course, (2) increase student interest and motivation to learn AI by providing a framework for the presentation of the major AI topics that emphasizes the strong connection between AI and computer science and engineering, and (3) highlight the bridge that machine learning provides between AI technology and modern software engineering

    Machine Learning for k-in-a-row Type Games Using Random Forest and Genetic Algorithm

    The main objective of this thesis is to explore how effective and reasonable it is to combine several machine learning techniques in order to train artificial intelligence for k-in-a-row type games. The techniques under study are Random Forest (with decision trees), the Minimax algorithm, and a Genetic Algorithm. What makes the approach distinctive is that all of the intelligence is trained without human expert knowledge; the computer must acquire all the necessary information by itself. The main engine for training the AI is the Genetic Algorithm, in which a set of individuals is evolved towards better playing strength. In the evaluation step, the individuals compete against each other in a series of games; the results are recorded, and each individual's evaluation score is based on its performance in those games. During a game, the heuristic game tree search algorithm Minimax serves as the move advisor. Each competing individual has a Random Forest attached that is used as the heuristic function in Minimax. The key idea of the training is to evolve Random Forests that are as good as possible. This is achieved solely through evolutionary training, without any human expertise
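
The role of the heuristic function in the abstract above can be illustrated with a minimal sketch of depth-limited Minimax that takes its evaluation function as a parameter. This is not the thesis's implementation: the toy game (`toy_moves`, `toy_apply`, `toy_terminal`) and the hand-written `toy_evaluate` are hypothetical stand-ins, and in the thesis's setting the evaluation slot would presumably be filled by a trained Random Forest's prediction.

```python
def minimax(state, depth, maximizing, moves, apply_move, is_terminal, evaluate):
    """Depth-limited Minimax; `evaluate` is the pluggable heuristic."""
    # At the depth cutoff or at a terminal state, fall back to the heuristic.
    if depth == 0 or is_terminal(state):
        return evaluate(state)
    values = [
        minimax(apply_move(state, m), depth - 1, not maximizing,
                moves, apply_move, is_terminal, evaluate)
        for m in moves(state)
    ]
    return max(values) if maximizing else min(values)


# Hypothetical toy game: the state is an integer, a move either adds 1 or
# doubles it, and the game ends once the state reaches 10 or more.
def toy_moves(state):
    return ["inc", "double"]

def toy_apply(state, move):
    return state + 1 if move == "inc" else state * 2

def toy_terminal(state):
    return state >= 10

def toy_evaluate(state):
    # Stand-in heuristic (assumption): larger numbers favour the maximizer.
    return state


result = minimax(3, 2, True, toy_moves, toy_apply, toy_terminal, toy_evaluate)
print(result)
```

Passing the heuristic in as an argument keeps the search code unchanged when one evolved evaluator is swapped for another, which matches the setup described above, where each individual carries its own heuristic.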

    Lookahead Pathology in Monte-Carlo Tree Search

    Monte-Carlo Tree Search (MCTS) is an adversarial search paradigm that first found prominence with its success in the domain of computer Go. Early theoretical work established the game-theoretic soundness and convergence bounds for Upper Confidence bounds applied to Trees (UCT), the most popular instantiation of MCTS; however, there remain notable gaps in our understanding of how UCT behaves in practice. In this work, we address one such gap by considering the question of whether UCT can exhibit lookahead pathology -- a paradoxical phenomenon first observed in Minimax search where greater search effort leads to worse decision-making. We introduce a novel family of synthetic games that offer rich modeling possibilities while remaining amenable to mathematical analysis. Our theoretical and experimental results suggest that UCT is indeed susceptible to pathological behavior in a range of games drawn from this family
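
For readers unfamiliar with UCT, its selection step is commonly described with the UCB1 rule: pick the child that maximizes the mean reward plus an exploration bonus. The sketch below shows that rule only, under stated assumptions: the function names, the `(value_sum, visits)` tuple layout, and the default exploration constant `sqrt(2)` are illustrative choices, not taken from the paper.

```python
import math


def ucb1_score(value_sum, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of one child: mean reward plus an exploration bonus."""
    if visits == 0:
        # Unvisited children are tried first.
        return math.inf
    mean = value_sum / visits
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(children, parent_visits, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    `children` is a list of (value_sum, visits) tuples (assumed layout).
    """
    return max(
        range(len(children)),
        key=lambda i: ucb1_score(children[i][0], children[i][1], parent_visits, c),
    )


# Two children with equal visit counts: the one with the higher mean wins.
best = select_child([(9.0, 10), (1.0, 10)], parent_visits=20)
print(best)
```

With equal visit counts the exploration bonus is identical for both children, so the choice reduces to comparing mean rewards; the bonus matters only when visit counts differ.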

    A hybridisation technique for game playing using the upper confidence for trees algorithm with artificial neural networks

    In the domain of strategic game playing, statistical techniques such as the Upper Confidence for Trees (UCT) algorithm have become the norm, as they offer many benefits over classical algorithms. These benefits include requiring no game-specific strategic knowledge and time-scalable performance. UCT does not incorporate any strategic information specific to the game considered, but instead uses repeated sampling to effectively brute-force search through the game tree or search space. The lack of game-specific knowledge in UCT is thus both a benefit and a strategic disadvantage. Pattern recognition techniques, specifically Neural Networks (NN), were identified as a means of addressing the lack of game-specific knowledge in UCT. Through a novel hybridisation technique which combines UCT and trained NNs for pruning, the UCT-NN algorithm was derived. The NN component of UCT-NN was trained using a UCT self-play scheme to generate game-specific knowledge without the need to construct and manage game databases for training purposes. The UCT-NN algorithm is outlined for pruning in the game of Go-Moku as a candidate case study for this research. The UCT-NN algorithm contains three major parameters, which emerge from the UCT algorithm, the use of NNs, and the pruning schemes considered. Suitable methods for finding candidate values for these three parameters were outlined and applied to the game of Go-Moku on a 5 by 5 board. An empirical investigation of the playing performance of UCT-NN was conducted in comparison to UCT through three benchmarks. The benchmarks comprise a common randomly moving opponent, a common UCTmax player which is given a large amount of playing time, and a pair-wise tournament between UCT-NN and UCT. The results of the performance evaluation for 5 by 5 Go-Moku were promising, which prompted an evaluation on a larger 9 by 9 Go-Moku board. The results of both evaluations indicate that the time allocated to the UCT-NN algorithm directly affects its performance when compared to UCT. The UCT-NN algorithm generally performs better than UCT in games with very limited time constraints in all benchmarks considered, except when playing against a randomly moving player in 9 by 9 Go-Moku. In real-time and near-real-time Go-Moku games, UCT-NN provides statistically significant improvements compared to UCT. The findings of this research contribute to the realisation of applying game-specific knowledge to the UCT algorithm

    Search and planning under incomplete information : a study using Bridge card play

    This thesis investigates problem-solving in domains featuring incomplete information and multiple agents with opposing goals. In particular, we describe Finesse --- a system that forms plans for the problem of declarer play in the game of Bridge. We begin by examining the problem of search. We formalise a best defence model of incomplete information games in which equilibrium point strategies can be identified, and identify two specific problems that can affect algorithms in such domains. In Bridge, we show that the best defence model corresponds to the typical model analysed in expert texts, and examine search algorithms which overcome the problems we have identified. Next, we look at how planning algorithms can be made to cope with the difficulties of such domains. This calls for the development of new techniques for representing uncertainty and actions with disjunctive effects, for coping with an opposition, and for reasoning about compound actions. We tackle these problems with a..

    Nagging: A scalable, fault-tolerant, paradigm for distributed search

    This paper describes Nagging, a technique for parallelizing search in a heterogeneous distributed computing environment. Nagging exploits the speedup anomaly often observed when parallelizing problems by playing multiple reformulations of the problem or portions of the problem against each other. Nagging is both fault tolerant and robust to long message latencies. In this paper, we show how nagging can be used to parallelize several different algorithms drawn from the artificial intelligence literature, and describe how nagging can be combined with partitioning, the more traditional search parallelization strategy. We present a theoretical analysis of the advantage of nagging with respect to partitioning, and give empirical results obtained on a cluster of 64 processors that demonstrate nagging's effectiveness and scalability as applied to A* search, alpha-beta minimax game tree search, and the Davis-Putnam algorithm

    Proof for the Equivalence between Some Best-First Algorithms and Depth-First Algorithms for AND/OR Trees

    When we want to know whether a given game position (e.g. a chess endgame) is a win or a loss, solving the problem corresponds to searching an AND/OR tree. AND/OR-tree search is a method for obtaining a proof solution (win) or a disproof solution (loss) for such a problem. AO* is well known as a representative algorithm for finding a proof solution in an AND/OR tree; it uses only proof numbers. Allis developed pn-search, which uses both proof numbers and disproof numbers. Both are best-first algorithms. There was no efficient depth-first algorithm using (dis)proof numbers until Seo developed his original algorithm, which uses only proof numbers. Nagai recently developed PDS, a depth-first algorithm that uses both proof numbers and disproof numbers. In this paper, we prove the equivalence between AO*, a best-first algorithm, and Seo's depth-first algorithm, in the sense of expanding a certain kind of node. Furthermore, we prove the equivalence between pn-search, a best-first algorithm, and df-pn, a depth-first algorithm that we propose in this paper.
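
As background for the abstract above, the standard bottom-up definition of proof and disproof numbers on an AND/OR tree can be sketched as follows. This shows only how the two numbers are computed, not the search algorithms the paper compares; the dictionary-based node layout and the names `pn_dn` and `tree` are assumptions made for illustration.

```python
import math

INF = math.inf


def pn_dn(node):
    """Return (proof number, disproof number) for a node of an AND/OR tree.

    A node is a dict with a "kind" of "win", "loss", "unknown", "or", or
    "and"; internal nodes also carry a "children" list (assumed layout).
    """
    kind = node["kind"]
    if kind == "win":        # already proven
        return (0, INF)
    if kind == "loss":       # already disproven
        return (INF, 0)
    if kind == "unknown":    # unexpanded leaf
        return (1, 1)
    pairs = [pn_dn(child) for child in node["children"]]
    pns = [p for p, _ in pairs]
    dns = [d for _, d in pairs]
    if kind == "or":
        # Proving one child suffices; disproving requires all children.
        return (min(pns), sum(dns))
    # AND node: proving requires all children; disproving one suffices.
    return (sum(pns), min(dns))


tree = {
    "kind": "or",
    "children": [
        {"kind": "and", "children": [{"kind": "unknown"}, {"kind": "unknown"}]},
        {"kind": "unknown"},
    ],
}
numbers = pn_dn(tree)
print(numbers)
```

In this example the root's proof number is 1 because proving the single unexpanded leaf would prove the root, while its disproof number is 2 because both children would have to be disproven.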