A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
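The core loop the survey describes — selection by the UCB1 rule (child win rate plus c*sqrt(ln(parent visits)/child visits)), expansion, random simulation, and backpropagation — can be illustrated with a minimal sketch. The toy subtraction game, constants, and function names below are my own illustrative choices, not taken from the survey.

```python
import math
import random

# Toy game: players alternately take 1 or 2 stones; whoever takes the
# last stone wins. Losing positions are the multiples of 3.

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.visits, self.wins = [], 0, 0.0

def legal_moves(state):
    return [m for m in (1, 2) if m <= state]

def uct_best_move(root_state, iterations=2000, c=1.4):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: follow the UCB1 rule while the node is fully expanded.
        while node.children and len(node.children) == len(legal_moves(node.state)):
            node = max(node.children,
                       key=lambda ch: ch.wins / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. Expansion: add one untried move as a new child.
        untried = [m for m in legal_moves(node.state)
                   if m not in {ch.move for ch in node.children}]
        if untried:
            m = random.choice(untried)
            child = Node(node.state - m, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout; index 1 is the player who moved
        # into `node`, index 0 the player to move at `node`.
        state, last_mover, to_move = node.state, 1, 0
        while legal_moves(state):
            state -= random.choice(legal_moves(state))
            last_mover, to_move = to_move, 1 - to_move
        reward = 1.0 if last_mover == 1 else 0.0  # win for the mover into node
        # 4. Backpropagation: alternate the reward up the tree.
        while node is not None:
            node.visits += 1
            node.wins += reward
            node, reward = node.parent, 1.0 - reward
    return max(root.children, key=lambda ch: ch.visits).move
```

From 4 stones the only winning move is to take 1 (leaving a multiple of 3), and from 5 stones to take 2; with a few thousand iterations the visit counts concentrate on those moves.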
Unifying an Introduction to Artificial Intelligence Course through Machine Learning Laboratory Experiences
This paper presents work on a collaborative project funded by the National Science Foundation that incorporates machine learning as a unifying theme to teach fundamental concepts typically covered in introductory Artificial Intelligence courses. The project involves the development of an adaptable framework for the presentation of core AI topics. This is accomplished through the development, implementation, and testing of a suite of adaptable, hands-on laboratory projects that can be closely integrated into the AI course. Through the design and implementation of learning systems that enhance commonly deployed applications, our model acknowledges that intelligent systems are best taught through their application to challenging problems. The goals of the project are to (1) enhance the student learning experience in the AI course, (2) increase student interest and motivation to learn AI by providing a framework for the presentation of the major AI topics that emphasizes the strong connection between AI and computer science and engineering, and (3) highlight the bridge that machine learning provides between AI technology and modern software engineering.
Machine Learning for k-in-a-row Type Games Using Random Forest and Genetic Algorithm
The main objective of the thesis is to explore the viability of combining multiple machine learning techniques in order to train an artificial intelligence for k-in-a-row type games.
The techniques under observation are the following:
- Random Forest
- Minimax Algorithm
- Genetic Algorithm
The main engine for training the AI is the Genetic Algorithm, in which a population of individuals is evolved towards stronger computer play. In the evaluation step, the individuals compete against each other in a series of games; the results are recorded, and each individual's evaluation score is based on its performance in those games. During a game, the heuristic game-tree search algorithm Minimax is used as a move advisor. Each competing individual has an attached Random Forest that serves as the heuristic function in Minimax. The key idea of the training is to evolve Random Forests that play as well as possible. This is achieved without any human expertise, using solely evolutionary training.
Lookahead Pathology in Monte-Carlo Tree Search
Monte-Carlo Tree Search (MCTS) is an adversarial search paradigm that first
found prominence with its success in the domain of computer Go. Early
theoretical work established the game-theoretic soundness and convergence
bounds for Upper Confidence bounds applied to Trees (UCT), the most popular
instantiation of MCTS; however, there remain notable gaps in our understanding
of how UCT behaves in practice. In this work, we address one such gap by
considering the question of whether UCT can exhibit lookahead pathology -- a
paradoxical phenomenon first observed in Minimax search where greater search
effort leads to worse decision-making. We introduce a novel family of synthetic
games that offer rich modeling possibilities while remaining amenable to
mathematical analysis. Our theoretical and experimental results suggest that
UCT is indeed susceptible to pathological behavior in a range of games drawn
from this family.
A hybridisation technique for game playing using the upper confidence for trees algorithm with artificial neural networks
In the domain of strategic game playing, the use of statistical techniques such as the Upper Confidence for Trees (UCT) algorithm has become the norm, as they offer many benefits over classical algorithms. These benefits include requiring no game-specific strategic knowledge and time-scalable performance. UCT does not incorporate any strategic information specific to the game considered, but instead uses repeated sampling to effectively brute-force search through the game tree or search space. The lack of game-specific knowledge in UCT is thus both a benefit and a strategic disadvantage. Pattern recognition techniques, specifically Neural Networks (NNs), were identified as a means of addressing the lack of game-specific knowledge in UCT. Through a novel hybridisation technique which combines UCT and trained NNs for pruning, the UCT-NN algorithm was derived. The NN component of UCT-NN was trained using a UCT self-play scheme to generate game-specific knowledge without the need to construct and manage game databases for training purposes. The UCT-NN algorithm is outlined for pruning in the game of Go-Moku as a candidate case study for this research. The UCT-NN algorithm contains three major parameters, which emerge from the UCT algorithm, the use of NNs, and the pruning schemes considered. Suitable methods for finding candidate values for these three parameters were outlined and applied to the game of Go-Moku on a 5 by 5 board. An empirical investigation of the playing performance of UCT-NN was conducted in comparison to UCT through three benchmarks. The benchmarks comprise a common randomly moving opponent, a common UCTmax player which is given a large amount of playing time, and a pair-wise tournament between UCT-NN and UCT. The results of the performance evaluation for 5 by 5 Go-Moku were promising, which prompted an evaluation on a larger 9 by 9 Go-Moku board.
The results of both evaluations indicate that the time allocated to the UCT-NN algorithm directly affects its performance relative to UCT. The UCT-NN algorithm generally performs better than UCT under very limited time constraints in all benchmarks considered, except when playing against a randomly moving player in 9 by 9 Go-Moku. In real-time and near-real-time Go-Moku games, UCT-NN provides statistically significant improvements over UCT. The findings of this research contribute to the realisation of applying game-specific knowledge to the UCT algorithm.
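The pruning component of the hybridisation can be sketched as follows. Both `prune_moves` and `toy_scorer` are hypothetical stand-ins, not the dissertation's trained network; the sketch only illustrates how a move scorer narrows the set of candidates UCT is allowed to expand, trading breadth for speed.

```python
def prune_moves(state, legal_moves, scorer, keep=3):
    """Keep only the `keep` highest-scoring moves for tree expansion."""
    ranked = sorted(legal_moves, key=lambda m: scorer(state, m), reverse=True)
    return ranked[:keep]

# Toy scorer (stand-in for a trained NN): prefer central squares on a 5x5 board.
CENTER = (2, 2)
def toy_scorer(state, move):
    return -abs(move[0] - CENTER[0]) - abs(move[1] - CENTER[1])

moves = [(0, 0), (2, 2), (1, 2), (4, 4), (2, 3)]
print(prune_moves(None, moves, toy_scorer, keep=2))  # → [(2, 2), (1, 2)]
```

In the hybrid, a call like this would sit inside UCT's expansion step, so the random sampling budget is spent only on moves the network considers plausible.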
Search and planning under incomplete information: a study using Bridge card play
This thesis investigates problem-solving in domains featuring incomplete information and multiple agents with opposing goals. In particular, we describe Finesse --- a system that forms plans for the problem of declarer play in the game of Bridge. We begin by examining the problem of search. We formalise a best defence model of incomplete information games in which equilibrium point strategies can be identified, and identify two specific problems that can affect algorithms in such domains. In Bridge, we show that the best defence model corresponds to the typical model analysed in expert texts, and examine search algorithms which overcome the problems we have identified. Next, we look at how planning algorithms can be made to cope with the difficulties of such domains. This calls for the development of new techniques for representing uncertainty and actions with disjunctive effects, for coping with an opposition, and for reasoning about compound actions. We tackle these problems with a..
Nagging: A scalable, fault-tolerant paradigm for distributed search
This paper describes Nagging, a technique for parallelizing search in a heterogeneous distributed computing environment. Nagging exploits the speedup anomaly often observed when parallelizing problems by playing multiple reformulations of the problem, or portions of the problem, against each other. Nagging is both fault tolerant and robust to long message latencies. In this paper, we show how nagging can be used to parallelize several different algorithms drawn from the artificial intelligence literature, and describe how nagging can be combined with partitioning, the more traditional search parallelization strategy. We present a theoretical analysis of the advantage of nagging with respect to partitioning, and give empirical results obtained on a cluster of 64 processors that demonstrate nagging's effectiveness and scalability as applied to A* search, minimax game tree search, and the Davis-Putnam algorithm.
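The racing idea at the heart of nagging can be sketched, under strong simplifying assumptions, with Python threads: several reformulations of the same search run concurrently, and the first to finish wins. Here `search_variant` is a placeholder workload, not the paper's actual A*, minimax, or Davis-Putnam instances.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def search_variant(name, work):
    total = sum(range(work))   # placeholder for a real search reformulation
    return name, total

def nagging_race(variants):
    """Run all variants concurrently; return the first completed result."""
    with ThreadPoolExecutor(max_workers=len(variants)) as pool:
        futures = {pool.submit(search_variant, n, w) for n, w in variants}
        done, not_done = wait(futures, return_when=FIRST_COMPLETED)
        for f in not_done:
            # Request cancellation of the slower reformulations; ones that
            # have already started will finish but their results are ignored.
            f.cancel()
        return next(iter(done)).result()
```

The fault tolerance the paper describes falls out of this structure: losing a slow reformulation costs nothing, since only the first completed answer is used.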
Proof for the Equivalence between Some Best-First Algorithms and Depth-First Algorithms for AND/OR Trees
When we want to know whether a given game position (e.g. a chess endgame) is a win or a loss, figuring this out corresponds to searching an AND/OR tree. AND/OR-tree search is a method for obtaining a proof solution (win) or a disproof solution (loss) for such a problem. AO* is well known as a representative algorithm for finding a proof solution in an AND/OR tree; it uses only the idea of the proof number. In addition, Allis developed pn-search, which uses both the proof number and the disproof number. Both are best-first algorithms. There was no efficient depth-first algorithm using (dis)proof numbers until Seo developed his original algorithm, which uses only the proof number. More recently, Nagai developed PDS, a depth-first algorithm using both the proof number and the disproof number. In this paper, we give a proof of the equivalence between AO*, a best-first algorithm, and Seo's depth-first algorithm, in the sense of expanding a certain kind of node. Furthermore, we give a proof of the equivalence between pn-search, a best-first algorithm, and df-pn, a depth-first algorithm we propose in this paper.
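The proof and disproof numbers underlying all of these algorithms are defined recursively: a proven leaf has proof number 0 and disproof number infinity, a disproven leaf the reverse, and an unexpanded frontier leaf (1, 1); an OR node takes the minimum of its children's proof numbers and the sum of their disproof numbers, while an AND node swaps the roles. A small sketch on a hand-built tree (the tree itself is illustrative, not from the paper):

```python
INF = float("inf")

def pn_dn(node):
    """Return (proof number, disproof number) of an AND/OR tree node."""
    kind, payload = node
    if kind == "leaf":                  # payload: "win", "loss", or "unknown"
        if payload == "win":
            return 0, INF
        if payload == "loss":
            return INF, 0
        return 1, 1                     # unexpanded frontier node
    pns, dns = zip(*(pn_dn(c) for c in payload))
    if kind == "or":
        return min(pns), sum(dns)       # prove one child, disprove all
    return sum(pns), min(dns)           # AND: prove all, disprove one

tree = ("or", [
    ("and", [("leaf", "win"), ("leaf", "unknown")]),
    ("leaf", "unknown"),
])
print(pn_dn(tree))  # → (1, 2)
```

Best-first algorithms such as pn-search repeatedly expand a most-proving frontier node found by descending through minimal proof numbers at OR nodes and minimal disproof numbers at AND nodes; the depth-first variants discussed in the paper reach the same nodes without keeping the whole tree's priority ordering explicit.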