7 research outputs found
Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups
Monte Carlo Tree Search (MCTS) has improved the performance of game engines
in domains such as Go, Hex, and general game playing. MCTS has been shown to
outperform classic alpha-beta search in games where good heuristic evaluations
are difficult to obtain. In recent years, combining ideas from traditional
minimax search with MCTS has been shown to be advantageous in some domains, such
as Lines of Action, Amazons, and Breakthrough. In this paper, we propose a new
way to use heuristic evaluations to guide the MCTS search by storing the two
sources of information, estimated win rates and heuristic evaluations,
separately. Rather than using the heuristic evaluations to replace the
playouts, our technique backs them up implicitly during the MCTS simulations.
These minimax values are then used to guide future simulations. We show that
using implicit minimax backups leads to stronger play performance in Kalah,
Breakthrough, and Lines of Action.
Comment: 24 pages, 7 figures, 9 tables; expanded version of a paper presented at the IEEE Conference on Computational Intelligence and Games (CIG) 2014.
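The backup idea described in the abstract can be sketched compactly. This is a minimal illustration assuming a negamax convention; the Node fields, the heuristic range, and the mixing weight `alpha` are assumptions chosen for illustration, not the paper's exact implementation:

```python
# Sketch of implicit minimax backups: each node keeps a Monte Carlo win-rate
# estimate AND a heuristic minimax value that is refreshed with the minimax
# rule during the same backup pass. Negamax convention: rewards and heuristic
# values are in [-1, 1] from the perspective of the player who moved into the node.

class Node:
    def __init__(self, heuristic, parent=None):
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0   # sum of simulation rewards
        self.tau = heuristic      # implicit minimax value, seeded by a heuristic

def backup(leaf, reward):
    """Back up one simulation result and refresh the implicit minimax values."""
    node = leaf
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        if node.children:
            # Implicit minimax backup: best negated child value (negamax form).
            node.tau = max(-ch.tau for ch in node.children)
        reward = -reward          # flip perspective for the parent
        node = node.parent

def combined_value(node, alpha=0.4):
    """Selection score mixing the win rate with the minimax-backed heuristic."""
    q = node.total_reward / node.visits if node.visits else 0.0
    return (1 - alpha) * q + alpha * node.tau
```

In this formulation the combined value would replace the plain win-rate term inside the selection formula, so the minimax-backed heuristic guides future simulations alongside the sampled win rates.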
Game Solving with Online Fine-Tuning
Game solving is a task similar to, yet more difficult than, mastering a game.
Solving a game typically means finding the game-theoretic value (the outcome
given optimal play) and, optionally, a full strategy to follow in order to
achieve that outcome. The AlphaZero algorithm has demonstrated super-human level play,
and its powerful policy and value predictions have also served as heuristics in
game solving. However, to solve a game and obtain a full strategy, a winning
response must be found for all possible moves by the losing player. This
includes very poor lines of play from the losing side, which the AlphaZero
self-play process will never encounter. AlphaZero-based heuristics can be highly
inaccurate when evaluating these out-of-distribution positions, which occur
throughout the entire search. To address this issue, this paper investigates
applying online fine-tuning while searching and proposes two methods to learn
tailor-designed heuristics for game solving. Our experiments show that using
online fine-tuning can solve a series of challenging 7x7 Killall-Go problems,
using only 23.54% of computation time compared to the baseline without online
fine-tuning. Results suggest that the savings scale with problem size. Our
method can further be extended to any tree search algorithm for problem
solving. Our code is available at
https://rlg.iis.sinica.edu.tw/papers/neurips2023-online-fine-tuning-solver.
Comment: Accepted by the 37th Conference on Neural Information Processing Systems (NeurIPS 2023).
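The core loop of online fine-tuning during solving can be sketched generically: whenever the solver proves a position's exact value, that position becomes a training example and the heuristic is updated immediately, improving later evaluations of similar out-of-distribution positions. The linear "network", feature map, and learning rate below are all illustrative assumptions, not the paper's method:

```python
# Hypothetical sketch: fine-tune a heuristic online from solver-proven values.

def features(position):
    # Toy feature map: position is a tuple of numbers, plus a bias term.
    return list(position) + [1.0]

def predict(weights, position):
    # Heuristic evaluation as a linear model (stand-in for a value network).
    return sum(w * x for w, x in zip(weights, features(position)))

def fine_tune(weights, position, proven_value, lr=0.05):
    """One SGD step pulling the heuristic toward the solver-proven value."""
    err = predict(weights, position) - proven_value
    return [w - lr * err * x for w, x in zip(weights, features(position))]

# Usage: a solver would call fine_tune each time it proves a subtree's value.
weights = [0.0, 0.0, 0.0]
pos = (1.0, -1.0)              # a recurring out-of-distribution position
for _ in range(200):
    weights = fine_tune(weights, pos, proven_value=1.0)
print(round(predict(weights, pos), 2))  # → 1.0
```

The point of the sketch is the feedback loop: search produces exact labels, and the heuristic is adapted to them while the search is still running, rather than being frozen after self-play training.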
A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
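The core algorithm surveyed above (selection, expansion, simulation, backpropagation with UCT selection) can be sketched on a toy game. The subtraction game, the exploration constant, and the iteration budget are illustrative assumptions, not from the survey:

```python
import math
import random

# Minimal UCT sketch on a toy game: players alternately take 1 or 2 counters
# and whoever takes the last counter wins.

class Node:
    def __init__(self, pile, player, parent=None, move=None):
        self.pile, self.player = pile, player    # counters left, side to move
        self.parent, self.move = parent, move
        self.children = []
        self.visits, self.wins = 0, 0.0          # wins for the player who moved INTO this node

def uct_child(node, c=1.4):
    # Unvisited children get priority; otherwise apply the UCT formula.
    for ch in node.children:
        if ch.visits == 0:
            return ch
    return max(node.children, key=lambda ch:
               ch.wins / ch.visits + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(pile, player):
    # Random playout; returns the winner (whoever takes the last counter).
    while True:
        pile -= random.choice([m for m in (1, 2) if m <= pile])
        if pile == 0:
            return player
        player = 1 - player

def mcts(pile, player, iters=3000):
    root = Node(pile, player)
    for _ in range(iters):
        node = root
        while node.children:                     # 1. selection
            node = uct_child(node)
        if node.pile > 0:                        # 2. expansion
            for m in (1, 2):
                if m <= node.pile:
                    node.children.append(Node(node.pile - m, 1 - node.player, node, m))
            node = uct_child(node)
        if node.pile == 0:                       # 3. simulation (terminal: mover into node won)
            winner = 1 - node.player
        else:
            winner = rollout(node.pile, node.player)
        while node is not None:                  # 4. backpropagation
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

random.seed(0)
print(mcts(4, player=0))  # the winning move here is to take 1, leaving a pile of 3
```

Final move selection by visit count (rather than win rate) is a common robustness choice, since visit counts are less noisy at low sample sizes.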
Deep learning and computer chess (part 2)
The dominant approach to computer chess has typically been through the use of
Minimax-based chess engines. In recent years, Monte Carlo Tree Search (MCTS) game
engines have seen success, with the advent of AlphaZero and Leela Chess Zero. However,
there is still much to explore regarding the use of MCTS in the domain of chess.
This paper evaluates the efficacy of an MCTS-based engine in chess. On
top of the base MCTS engine, several enhancements were proposed and implemented,
including early playout termination, progressive bias, progressive unpruning, decisive moves,
epsilon-greedy search, score-bounded Monte Carlo tree search, and root parallelization. Each
enhancement was implemented in stages, and the performance of the enhancement was
measured by comparing it to the model from the previous stage.
It was determined that early playout termination, progressive unpruning, score
bounded search, and root parallelization were effective in improving the playing strength of
the engine. However, decisive moves and epsilon-greedy search negatively impacted the
engine’s performance.
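One of the enhancements found effective above, early playout termination, can be sketched as: cut the random playout after a fixed number of plies and fall back to a cheap static evaluation. The toy game, its interface, and the cutoff are assumptions for illustration, not the thesis's chess engine:

```python
import random

# Sketch of early playout termination on a toy game: players alternately
# remove 1 or 2 counters; taking the last counter wins. All scores are from
# player 0's perspective.

class PileState:
    def __init__(self, pile, to_move=0):
        self.pile, self.to_move = pile, to_move
    def is_over(self):
        return self.pile == 0
    def result(self):
        # The player who just moved took the last counter and won.
        return 1.0 if self.to_move == 1 else -1.0
    def legal_moves(self):
        return [m for m in (1, 2) if m <= self.pile]
    def play(self, m):
        return PileState(self.pile - m, 1 - self.to_move)
    def evaluate(self):
        # Cheap static evaluation: piles divisible by 3 are lost for the mover.
        good_for_mover = self.pile % 3 != 0
        sign = 1.0 if self.to_move == 0 else -1.0
        return sign * (0.5 if good_for_mover else -0.5)

def playout(state, max_moves=10, rng=random):
    """Random playout that terminates early after max_moves plies."""
    for _ in range(max_moves):
        if state.is_over():
            return state.result()    # exact game result
        state = state.play(rng.choice(state.legal_moves()))
    return state.evaluate()          # heuristic estimate instead of playing on
```

The trade-off is characteristic: shorter playouts sacrifice exactness for many more simulations per second, which pays off when the static evaluation is reasonably informative, as it is in chess.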
From the results, it appears that it is possible to adapt an MCTS model to the realm of
chess, through the aid of several enhancements, such that it can compete with the
traditional Minimax approach, with much room for improvement still available.
Bachelor of Engineering (Computer Science)