7 research outputs found

    Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups

    Monte Carlo Tree Search (MCTS) has improved the performance of game engines in domains such as Go, Hex, and general game playing. MCTS has been shown to outperform classic alpha-beta search in games where good heuristic evaluations are difficult to obtain. In recent years, combining ideas from traditional minimax search in MCTS has been shown to be advantageous in some domains, such as Lines of Action, Amazons, and Breakthrough. In this paper, we propose a new way to use heuristic evaluations to guide the MCTS search by storing the two sources of information, estimated win rates and heuristic evaluations, separately. Rather than using the heuristic evaluations to replace the playouts, our technique backs them up implicitly during the MCTS simulations. These minimax values are then used to guide future simulations. We show that using implicit minimax backups leads to stronger play performance in Kalah, Breakthrough, and Lines of Action.
    Comment: 24 pages, 7 figures, 9 tables; expanded version of a paper presented at the IEEE Conference on Computational Intelligence and Games (CIG) 2014.
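The separate bookkeeping the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `Node` fields, the negamax sign convention, and the blending weight `alpha` are assumptions chosen to show the idea of backing up a heuristic minimax value alongside the usual playout statistics.

```python
class Node:
    def __init__(self, heuristic_value):
        self.visits = 0
        self.total_reward = 0.0               # accumulates playout outcomes
        self.minimax_value = heuristic_value  # implicit minimax estimate
        self.children = []

def backup(path, reward):
    """Back up a playout reward along the selection path, and refresh
    each node's implicit minimax value from its children."""
    for node in reversed(path):
        node.visits += 1
        node.total_reward += reward
        if node.children:
            # negamax convention: children's values are negated so the
            # max is taken from the player-to-move's point of view
            node.minimax_value = max(-c.minimax_value for c in node.children)

def combined_value(node, alpha=0.5):
    """Blend the simulation win rate with the implicit minimax value;
    the blended score can then guide child selection."""
    q = node.total_reward / node.visits if node.visits else 0.0
    return (1 - alpha) * q + alpha * node.minimax_value
```

The key point is that neither statistic overwrites the other: playout outcomes and heuristic minimax values are stored side by side and only combined at selection time.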

    Game Solving with Online Fine-Tuning

    Game solving is a similar, yet more difficult task than mastering a game. Solving a game typically means finding the game-theoretic value (the outcome given optimal play), and optionally a full strategy to follow in order to achieve that outcome. The AlphaZero algorithm has demonstrated super-human level play, and its powerful policy and value predictions have also served as heuristics in game solving. However, to solve a game and obtain a full strategy, a winning response must be found for all possible moves by the losing player. This includes very poor lines of play from the losing side, which the AlphaZero self-play process will never encounter. AlphaZero-based heuristics can be highly inaccurate when evaluating these out-of-distribution positions, which occur throughout the entire search. To address this issue, this paper investigates applying online fine-tuning while searching and proposes two methods to learn tailor-designed heuristics for game solving. Our experiments show that using online fine-tuning can solve a series of challenging 7x7 Killall-Go problems, using only 23.54% of the computation time required by the baseline without online fine-tuning. Results suggest that the savings scale with problem size. Our method can further be extended to any tree search algorithm for problem solving. Our code is available at https://rlg.iis.sinica.edu.tw/papers/neurips2023-online-fine-tuning-solver.
    Comment: Accepted by the 37th Conference on Neural Information Processing Systems (NeurIPS 2023).

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
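As a concrete anchor for the core algorithm this survey covers, the standard UCT child-selection rule (UCB1 applied to tree nodes) can be sketched in a few lines. The function name and signature here are illustrative, not taken from any particular implementation.

```python
import math

def ucb1(parent_visits, child_visits, child_reward, c=math.sqrt(2)):
    """UCB1 score used by UCT to pick a child during selection:
    an exploitation term (mean reward) plus an exploration bonus
    that shrinks as the child accumulates visits."""
    if child_visits == 0:
        return float("inf")  # unvisited children are tried first
    exploit = child_reward / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore
```

At each step of selection, the child maximizing this score is descended into; the constant `c` trades off exploration against exploitation.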

    Monte-Carlo tree search enhancements for one-player and two-player domains


    Deep learning and computer chess (part 2)

    The dominant approach to computer chess has typically been the use of Minimax-based chess engines. In recent years, Monte Carlo Tree Search (MCTS) game engines have seen success with the advent of AlphaZero and Leela Chess Zero. However, there is still much to explore regarding the use of MCTS in the domain of chess. This paper evaluates the efficacy of an MCTS-based engine in chess. On top of the base MCTS engine, several enhancements were proposed and implemented, including early playout termination, progressive bias, progressive unpruning, decisive moves, epsilon-greedy search, score bounded Monte-Carlo tree search, and root parallelization. Each enhancement was implemented in stages, and its performance was measured by comparing it to the model from the previous stage. It was determined that early playout termination, progressive unpruning, score bounded search, and root parallelization were effective in improving the playing strength of the engine. However, decisive moves and epsilon-greedy search negatively impacted the engine's performance. From the results, it appears that it is possible to adapt an MCTS model to the realm of chess through the aid of several enhancements, such that it can compete with the traditional Minimax approach, with much room for improvement available.
    Bachelor of Engineering (Computer Science)
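Two of the enhancements named above, epsilon-greedy playouts and early playout termination, can be combined in a single playout routine. This is a hedged sketch: the `state` interface (`is_terminal()`, `result()`, `legal_moves()`, `apply()`) and the `evaluate` heuristic are assumed names, not the thesis's actual API.

```python
import random

def playout(state, evaluate, epsilon=0.1, max_depth=20):
    """Epsilon-greedy playout with early termination.
    With probability epsilon a random move is played (explore);
    otherwise the move with the best heuristic score is played (greedy).
    If no terminal state is reached within max_depth plies, the playout
    is cut short and the heuristic evaluation is returned instead."""
    for _ in range(max_depth):
        if state.is_terminal():
            return state.result()
        moves = state.legal_moves()
        if random.random() < epsilon:
            move = random.choice(moves)                                # explore
        else:
            move = max(moves, key=lambda m: evaluate(state.apply(m)))  # greedy
        state = state.apply(move)
    # early playout termination: fall back to the heuristic score
    return evaluate(state)
```

Early termination trades playout accuracy for speed, which the abstract reports as a net win in chess, while the epsilon-greedy bias (which it reports as harmful here) makes playouts less random but risks systematic blind spots.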