
    Scalable Parallel DFPN Search

    We present Scalable Parallel Depth-First Proof Number Search (SPDFPN), a new shared-memory parallel version of depth-first proof number search. Based on the serial DFPN 1+ε method of Pawlewicz and Lew, SPDFPN searches effectively even as the transposition table becomes almost full, and so can solve large problems. To assign jobs to threads, SPDFPN uses proof and disproof numbers together with two parameters. SPDFPN uses no domain-specific knowledge or heuristics, so it can be used in any domain. Our experiments show that SPDFPN scales well and performs well on hard problems. We tested SPDFPN on problems from the game of Hex. On a 24-core machine and a 4.2-hour single-thread task, parallel efficiency ranges from 0.8 on 4 threads to 0.74 on 16 threads. SPDFPN solved all previously intractable 9×9 Hex opening moves; the hardest opening took 111 days. Also, in 63 days, it solved one 10×10 Hex opening move. This is the first time a computer or human has solved a 10×10 Hex opening move.
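
    The abstract builds on the serial dfpn algorithm, which searches an AND/OR tree guided by proof and disproof numbers, and on the 1+ε threshold trick that keeps the search inside a subtree longer before returning to its parent. Below is a minimal single-threaded sketch of that underlying method in Python, using the standard φ/δ formulation; the game interface (children_of, proven, disproven), the ε value, and the exact threshold arithmetic are assumptions for illustration, and the paper's actual contribution, the parallel assignment of jobs to threads, is not shown.

```python
INF = 10**9
EPS = 0.25                  # slack factor epsilon; this value is an assumption
TT = {}                     # transposition table: position -> (phi, delta)

def phi_delta(pos):
    """Cached (phi, delta) for a position; unexpanded nodes start at (1, 1)."""
    return TT.get(pos, (1, 1))

def mid(pos, phi_t, delta_t, children_of, proven, disproven):
    """Search below `pos` until phi(pos) >= phi_t or delta(pos) >= delta_t.
    The phi/delta formulation folds OR and AND nodes into one rule:
        phi(n)   = min over children c of delta(c)
        delta(n) = sum over children c of phi(c)
    """
    if proven(pos):
        TT[pos] = (0, INF); return
    if disproven(pos):
        TT[pos] = (INF, 0); return
    while True:
        kids = children_of(pos)            # non-terminal => at least one child
        phis = [phi_delta(c)[0] for c in kids]
        deltas = [phi_delta(c)[1] for c in kids]
        phi, delta = min(deltas), min(sum(phis), INF)
        TT[pos] = (phi, delta)
        if phi >= phi_t or delta >= delta_t:
            return                         # threshold exceeded; back up
        best = min(range(len(kids)), key=deltas.__getitem__)
        second = min((d for i, d in enumerate(deltas) if i != best),
                     default=INF)
        # Classic dfpn uses second + 1 here; the 1+eps trick overshoots the
        # sibling bound, so control returns to the parent less often and the
        # transposition table is re-expanded less.
        child_delta_t = min(phi_t, int(second * (1 + EPS)) + 1)
        child_phi_t = delta_t - (delta - phis[best])
        mid(kids[best], child_phi_t, child_delta_t,
            children_of, proven, disproven)
```

    Calling mid(root, INF, INF, ...) then runs until the root is proved (phi reaches 0, delta reaches INF) or disproved (the reverse).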

    Multi-agent Monte Carlo go

    In this paper we propose a multi-agent version of UCT Monte Carlo Go. We use the emergent behavior of a large number of simple agents to improve the quality of the Monte Carlo simulations, increasing the strength of the artificial player as a whole. Instead of one agent playing against itself, different agents play in the simulation phase of the algorithm, leading to better exploration of the search space. We were able to significantly outperform Fuego, a top computer Go program. Emergent behavior seems to be the next step in computer Go development.
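
    The abstract describes replacing self-play with a pool of heterogeneous simple agents during the simulation phase of UCT. A minimal sketch of that structure follows; the two example agents and the game interface (legal_moves, play, captures, winner) are hypothetical stand-ins, since the abstract does not describe the actual agents used.

```python
import random

def random_agent(state, moves):
    """Simplest possible agent: a uniformly random legal move."""
    return random.choice(moves)

def capture_agent(state, moves):
    """Slightly less naive agent: prefer capturing moves when available."""
    captures = [m for m in moves if state.captures(m)]
    return random.choice(captures) if captures else random.choice(moves)

AGENT_POOL = [random_agent, capture_agent]

def multi_agent_playout(state):
    """Run one Monte-Carlo simulation to the end of the game, drawing a
    (possibly different) agent from the pool at every move, so the
    playouts below a UCT leaf are more diverse than pure self-play."""
    while not state.is_terminal():
        moves = state.legal_moves()
        agent = random.choice(AGENT_POOL)
        state = state.play(agent(state, moves))
    return state.winner()
```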

    Monte-Carlo tree search with heuristic knowledge: A novel way in solving capturing and life and death problems in Go

    Monte-Carlo (MC) tree search is a new research field. Its effectiveness in searching large state spaces, such as the Go game tree, is well recognized in the computer Go community. Go domain-specific heuristics and techniques, as well as domain-independent heuristics and techniques, are systematically investigated in the context of MC tree search in this dissertation. The search extensions based on these heuristics and techniques can significantly improve the effectiveness and efficiency of MC tree search. Two major areas of investigation are addressed in this dissertation research: I. the identification and use of effective heuristic knowledge in guiding the MC simulations; II. the extension of the MC tree search algorithm with heuristics. Go, the most challenging board game for machines, serves as the test bed. The effectiveness of the MC tree search extensions is demonstrated through the performance of Go tactic problem solvers using these techniques. The main contributions of this dissertation include: 1. a heuristics-based Monte-Carlo tactic tree search framework is proposed to extend the standard Monte-Carlo tree search; 2. (Go) knowledge-based heuristics are systematically investigated to improve the Monte-Carlo tactic tree search; 3. pattern learning is demonstrated to be effective in improving the Monte-Carlo tactic tree search; 4. domain-knowledge-independent tree search enhancements are shown to be effective in improving Monte-Carlo tactic tree search performance; 5. a strong Go tactic solver based on the proposed algorithms outperforms traditional game tree search algorithms. The techniques developed in this dissertation research can benefit other game domains and application fields.
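
    The dissertation's first theme, using heuristic knowledge to guide the MC simulations, can be illustrated with a short sketch: instead of sampling playout moves uniformly, moves are drawn with probability proportional to a heuristic score. The heuristic itself (patterns, capture urgency, and so on) is a placeholder here, not the dissertation's actual knowledge.

```python
import random

def biased_playout_move(state, moves, heuristic):
    """Pick a simulation move by roulette-wheel selection: a move's chance
    of being played is proportional to its heuristic score, so knowledge
    steers the playout without making it deterministic."""
    weights = [max(heuristic(state, m), 1e-9) for m in moves]  # keep > 0
    return random.choices(moves, weights=weights, k=1)[0]
```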

    On forward pruning in game-tree search

    Ph.D. thesis (Doctor of Philosophy).

    Intelligence artificielle et optimisation avec parallélisme [Artificial intelligence and optimization with parallelism]

    This document is devoted to artificial intelligence and optimization. Part I is devoted to having fun with high-level ideas and introduces the subject. Part II covers Monte-Carlo Tree Search, a recent and powerful tool for sequential decision making; other tools for sequential decision making are discussed only briefly, and the complexity of sequential decision making is reviewed. Part III discusses optimization, with a particular focus on robust optimization and especially evolutionary optimization. Part IV presents some machine learning tools useful in everyday life, such as supervised learning and active learning. The conclusion (Part V) returns to fun and high-level ideas. We discuss Monte-Carlo Tree Search, UCT, evolutionary algorithms, and other AI tricks and techniques; the emphasis is on parallelization.
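
    A recurring point of the abstract is that both MCTS and evolutionary optimization parallelize well. As a minimal sketch of the evolutionary side (not taken from the document itself), here is a toy (1, λ) evolution strategy whose λ fitness evaluations, being independent, are farmed out to worker processes; the fitness function and all parameter values are placeholders.

```python
import random
from concurrent.futures import ProcessPoolExecutor

def sphere(x):
    """Placeholder fitness to minimize; stands in for a real objective."""
    return sum(v * v for v in x)

def evolve(dim=10, lam=16, sigma=0.3, generations=100):
    """Toy (1, lambda) evolution strategy with parallel fitness evaluation."""
    parent = [random.gauss(0, 1) for _ in range(dim)]
    with ProcessPoolExecutor() as pool:
        for _ in range(generations):
            offspring = [[p + random.gauss(0, sigma) for p in parent]
                         for _ in range(lam)]
            # the evaluations are independent, so this is the natural
            # place to spend extra cores
            fits = list(pool.map(sphere, offspring))
            parent = offspring[min(range(lam), key=fits.__getitem__)]
    return parent

if __name__ == "__main__":     # guard required for process-based pools
    print(sphere(evolve()))
```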

    Application of temporal difference learning and supervised learning in the game of Go.

    by Horace Wai-Kit Chan. Thesis (M.Phil.), Chinese University of Hong Kong, 1996. Includes bibliographical references (leaves 109-112). Contents: Chapter 1, Introduction; Chapter 2, Background (definitions, the state of the art of computer Go, a framework for computer Go); Chapter 3, Application of TD in Game Playing (reinforcement learning and TD learning, TD learning and game playing, design of this research); Chapter 4, Deriving a New Updating Rule to Apply TD Learning in Multi-layer Perceptrons (derivation of the TD(λ) learning rule for MLPs, training algorithm); Chapter 5, Experiments (Experiment 1: training an evaluation function for 7×7 Go by TD(λ) with self-play; Experiment 2: training an evaluation function for 9×9 Go by TD(λ) learning from human games; Experiment 3: life status determination in the Go endgame); Chapter 6, Conclusions; Appendix A, An Introduction to Go; Appendix B, Mathematical Model of Connectivity; Bibliography.
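
    Chapter 4 of the thesis derives a TD(λ) updating rule for a multi-layer perceptron. Below is a minimal sketch of the same rule for a linear value function, where the gradient of the value with respect to the weights is just the feature vector; for the thesis's MLP one would substitute the backpropagated gradient. The feature map, step size, and λ are placeholders.

```python
def td_lambda_update(states, reward, w, features, alpha=0.1, lam=0.7):
    """One TD(lambda) pass over a finished self-play game.
    states:  positions s_0 .. s_T in the order they were played
    reward:  terminal outcome (e.g. 1.0 for a win, 0.0 for a loss)
    w:       weight vector of a linear value function V(s) = w . features(s)
    """
    def value(s):
        return sum(wi * xi for wi, xi in zip(w, features(s)))

    e = [0.0] * len(w)                              # eligibility trace
    for t in range(len(states)):
        x = features(states[t])
        # the target is the next state's value, or the reward at the end
        target = reward if t == len(states) - 1 else value(states[t + 1])
        delta = target - value(states[t])           # TD error
        e = [lam * ei + xi for ei, xi in zip(e, x)] # decay trace, add gradient
        w = [wi + alpha * delta * ei for wi, ei in zip(w, e)]
    return w
```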

    Proceedings of the 18th Irish Conference on Artificial Intelligence and Cognitive Science

    These proceedings contain the papers accepted for publication at AICS-2007, the 18th Annual Conference on Artificial Intelligence and Cognitive Science, held at the Technological University Dublin, Ireland, from 29 to 31 August 2007. AICS is the annual conference of the Artificial Intelligence Association of Ireland (AIAI).