143 research outputs found
MCTS-minimax hybrids with state evaluations
Monte-Carlo Tree Search (MCTS) has been found to show weaker play than minimax-based search in some tactical game domains. In order to combine the tactical strength of minimax and the strategic strength of MCTS, MCTS-minimax hybrids have been proposed in prior work. This arti
Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups
Monte Carlo Tree Search (MCTS) has improved the performance of game engines
in domains such as Go, Hex, and general game playing. MCTS has been shown to
outperform classic alpha-beta search in games where good heuristic evaluations
are difficult to obtain. In recent years, combining ideas from traditional
minimax search in MCTS has been shown to be advantageous in some domains, such
as Lines of Action, Amazons, and Breakthrough. In this paper, we propose a new
way to use heuristic evaluations to guide the MCTS search by storing the two
sources of information, estimated win rates and heuristic evaluations,
separately. Rather than using the heuristic evaluations to replace the
playouts, our technique backs them up implicitly during the MCTS simulations.
These minimax values are then used to guide future simulations. We show that
using implicit minimax backups leads to stronger play performance in Kalah,
Breakthrough, and Lines of Action.
Comment: 24 pages, 7 figures, 9 tables, expanded version of paper presented at
IEEE Conference on Computational Intelligence and Games (CIG) 2014 conferenc
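The idea in the abstract above, keeping playout win rates and heuristic evaluations in separate fields and backing the heuristic values up minimax-style during simulations, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and is not taken from the paper's code: the `Node` class, the mixing weight `alpha`, the exploration constant `c`, and the convention that rewards and heuristic values are both expressed from the maximizing player's point of view on a comparable scale.

```python
import math


class Node:
    """Hypothetical tree node that stores the two information sources separately."""

    def __init__(self, heuristic, max_to_move, parent=None):
        self.heuristic = heuristic      # static evaluation, from Max's point of view
        self.max_to_move = max_to_move  # True if the maximizing player moves here
        self.parent = parent
        self.children = []
        self.visits = 0
        self.reward_sum = 0.0           # accumulated playout rewards (Max's view)
        self.implicit = heuristic       # implicit minimax value, starts at the heuristic
        if parent is not None:
            parent.children.append(self)

    def win_rate(self):
        return self.reward_sum / self.visits if self.visits else 0.0


def backup(leaf, reward):
    """Propagate a playout reward and refresh the implicit minimax values."""
    node = leaf
    while node is not None:
        node.visits += 1
        node.reward_sum += reward
        if node.children:
            pick = max if node.max_to_move else min
            node.implicit = pick(child.implicit for child in node.children)
        node = node.parent


def selection_score(parent, child, alpha=0.4, c=0.7):
    """UCT-style score mixing win rate and implicit minimax value (assumed form)."""
    sign = 1.0 if parent.max_to_move else -1.0
    mixed = (1 - alpha) * child.win_rate() + alpha * child.implicit
    explore = c * math.sqrt(math.log(parent.visits) / child.visits)
    return sign * mixed + explore
```

In this sketch, setting `alpha=0` reduces the score to plain UCT on win rates, while larger values let the backed-up heuristic values steer selection more strongly.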
Classification of techniques for improving Monte-Carlo tree search that are oriented to the specifics of the method
Based on information from various sources about Monte-Carlo tree search (MCTS), this article proposes a refined classification structure and a first version of a classification of techniques for improving the basic MCTS implementation. At present, this version covers only purely theoretical techniques for improving the stages of the general MCTS scheme that are oriented to the specifics of how the method works. The proposed classification is expected to be extensible in future work and usable for systematizing knowledge about MCTS and identifying new opportunities for its improvement
Hybrid Minimax-MCTS and Difficulty Adjustment for General Game Playing
Board games are a great source of entertainment for all ages, as they create
a competitive and engaging environment, as well as stimulating learning and
strategic thinking. It is common for digital versions of board games, like any
other type of digital game, to offer the option to select the difficulty of
the game. This is usually done by customizing the search parameters of the AI
algorithm. However, this approach cannot be extended to General Game Playing
agents, as different games might require different parametrization for each
difficulty level. In this paper, we present a general approach to implement an
artificial intelligence opponent with difficulty levels for zero-sum games,
together with a proposal for a Minimax-MCTS hybrid algorithm, which combines the
minimax search process with GGP aspects of MCTS. This approach was tested in
our mobile application LoBoGames, an extensible board games platform, that is
intended to have a broad catalog of games, with an emphasis on accessibility:
the platform is friendly to visually-impaired users, and is compatible with
more than 92% of Android devices. The tests in this work indicate that both
the hybrid Minimax-MCTS and the new difficulty adjustment system are promising
GGP approaches that could be expanded in future work
Action Guidance with MCTS for Deep Reinforcement Learning
Deep reinforcement learning has achieved great successes in recent years;
however, one main challenge is sample inefficiency. In this paper, we focus
on how to use action guidance by means of a non-expert demonstrator to improve
sample efficiency in a domain with sparse, delayed, and possibly deceptive
rewards: the recently-proposed multi-agent benchmark of Pommerman. We propose a
new framework where even a non-expert simulated demonstrator, e.g., planning
algorithms such as Monte Carlo tree search with a small number of rollouts, can be
integrated within asynchronous distributed deep reinforcement learning methods.
Compared to a vanilla deep RL algorithm, our proposed methods both learn faster
and converge to better policies on a two-player mini version of the Pommerman
game.
Comment: AAAI Conference on Artificial Intelligence and Interactive Digital
Entertainment (AIIDE'19). arXiv admin note: substantial text overlap with
arXiv:1904.05759, arXiv:1812.0004
Application of the Monte-Carlo Tree Search to Multi-Action Turn-Based Games with Hidden Information
Traditional search algorithms struggle when applied to complex multi-action turn-based games. The introduction of hidden information further increases domain complexity. The Monte-Carlo Tree Search (MCTS) algorithm has previously been applied to multi-action turn-based games, but not multi-action turn-based games with hidden information. This thesis compares several Monte Carlo Tree Search (MCTS) extensions (Determinized/Perfect Information Monte Carlo, Multi-Observer Information Set MCTS, and Belief State MCTS) in TUBSTAP, an open-source multi-action turn-based game, modified to include hidden information via fog-of-war
Epaminondas: Exploring Combat Tactics
Epaminondas is a two-person, zero-sum strategy game that combines long-term strategic play with highly tactical move sequences. The game has two unique features that make it stand out from other games. The first feature is the creation of phalanxes, which are groups of pieces that can move as a whole unit. As the number of pieces in a phalanx increases, the mobility and capturing power of the phalanx also increase. The second feature differs from many other strategy games: when a player makes a crossing, a winning move in the game, the second player has an opportunity to respond. This paper presents strategies and heuristics used in a Min-Max Alpha-Beta agent that plays at a novice level. Furthermore, it defines the state-space and game-tree complexities for Epaminondas. Finally, a new version of MCTS is implemented that uses the Alpha-Beta heuristic function during node selection to guide MCTS to more promising areas of the search tree. Additionally, in an effort to overcome the MCTS tactical weakness, the MCTS player switches to Alpha-Beta search once the game reaches 15 turns. Results show that the added heuristic value and the switch to Alpha-Beta for endgame play positively impact the performance of MCTS, surpassing novice Alpha-Beta win ratios at certain time intervals
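The switch from MCTS to Alpha-Beta described in this abstract can be sketched as a simple dispatch on the turn counter. This is an illustration under stated assumptions rather than the thesis implementation: the toy game tree (nested lists whose leaves are heuristic values), the function names `alpha_beta` and `pick_search`, and the reading of "reaches 15 turns" as `turn >= 15` are all hypothetical.

```python
def alpha_beta(tree, maximizing=True, alpha=float("-inf"), beta=float("inf")):
    """Minimax with alpha-beta pruning over a toy tree.

    A leaf is a number (heuristic value); an inner node is a list of subtrees.
    """
    if not isinstance(tree, list):
        return tree
    best = float("-inf") if maximizing else float("inf")
    for child in tree:
        value = alpha_beta(child, not maximizing, alpha, beta)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break  # the remaining siblings cannot change the result
    return best


def pick_search(turn, mcts_search, alpha_beta_search, switch_turn=15):
    """Return the search routine to use on this turn (assumed threshold rule)."""
    return alpha_beta_search if turn >= switch_turn else mcts_search
```

For example, `alpha_beta([[3, 5], [2, 9]])` evaluates a two-ply tree with the maximizing player at the root, and `pick_search(turn, run_mcts, run_alpha_beta)` would hand control to the second routine from turn 15 onward, where `run_mcts` and `run_alpha_beta` stand in for whatever search functions an agent provides.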