56 research outputs found
MCTS-minimax hybrids with state evaluations
Monte-Carlo Tree Search (MCTS) has been found to show weaker play than minimax-based search in some tactical game domains. In order to combine the tactical strength of minimax and the strategic strength of MCTS, MCTS-minimax hybrids have been proposed in prior work. …
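One common family of such hybrids replaces the random rollout with a shallow, depth-limited minimax search from the expanded leaf and backs up its value as the simulation result. The sketch below is our illustration of that idea, not necessarily the article's exact method; game trees are toy nested lists with numeric leaves, and `heuristic` stands in for a domain evaluation function.

```python
# Hedged sketch of an MCTS-minimax hybrid building block (assumed form):
# a depth-limited minimax used as a leaf evaluator in place of a rollout.
# Internal nodes are lists of children; leaves are numeric values from the
# maximizing player's point of view.

def shallow_minimax(node, depth, maximizing, heuristic):
    """Depth-limited minimax; at the depth cutoff, fall back to a
    heuristic evaluation of the (non-terminal) node."""
    if not isinstance(node, list):      # terminal leaf: exact value
        return node
    if depth == 0:                      # cutoff: static evaluation
        return heuristic(node)
    values = [shallow_minimax(child, depth - 1, not maximizing, heuristic)
              for child in node]
    return max(values) if maximizing else min(values)
```

In a hybrid MCTS, this value would replace the rollout result that is backpropagated through the tree, trading simulation speed for tactical accuracy.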
Explainable search
Search-based AI agents are state of the art in many challenging sequential decision-making domains. However, contemporary approaches lack the ability to explain, summarize, or visualize their plans and decisions, and how they are derived from traversing complex spaces of possible futures, contingencies, and eventualities, spanned by the available actions of the agent. This limits human trust in high-stakes scenarios, as well as effective human-AI collaboration. …
Novelty and MCTS
Novelty search has become a popular technique in different fields such as evolutionary computing, classical AI planning, and deep reinforcement learning. Searching for novelty instead of, or in addition to, directly maximizing the search objective, aims at avoiding dead ends and local minima, and overall improving exploration. We propose and test the integration of novelty into Monte Carlo Tree Search (MCTS), a state-of-the-art framework for online RL planning, by linearly combining value estimates …
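A minimal sketch of one way such a linear combination could enter MCTS selection, assuming (this is our illustration, not the paper's formula) that a state's novelty decays with how often its features have been seen, and that the exploitation term blends the mean value estimate with this novelty bonus:

```python
# Hedged sketch: UCB1-style selection where the value term is
# (1 - w) * Q + w * novelty. The feature-count novelty measure and the
# weight w are assumptions for illustration.

import math
from collections import Counter

feature_counts = Counter()  # how often each state feature has been visited

def novelty(features):
    """Higher for states whose features have rarely been seen so far."""
    if not features:
        return 0.0
    return sum(1.0 / (1 + feature_counts[f]) for f in features) / len(features)

def selection_score(q_value, features, visits, parent_visits, w=0.3, c=1.4):
    """UCB1 score whose exploitation term linearly blends Q and novelty."""
    blended = (1 - w) * q_value + w * novelty(features)
    return blended + c * math.sqrt(math.log(parent_visits) / visits)
```

A child maximizing this score is selected during tree descent; `feature_counts` would be updated as states are visited, so the novelty bonus fades for well-explored regions.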
ME-MCTS: Online generalization by combining multiple value estimators
This paper addresses the challenge of online generalization in tree search. We propose Multiple Estimator Monte Carlo Tree Search (ME-MCTS), with a two-fold contribution: first, we introduce a formalization of online generalization that can represent existing techniques such as “history heuristics”, “RAVE”, or “OMA” – contextual action value estimators or abstractors that generalize across specific contexts. Second, we incorporate recent advances in estimator averaging that enable guiding search by combining the online action value estimates of any number of such abstractors or similar types of action value estimators. Unlike previous work, which usually proposed a single abstractor for either the selection or the rollout phase of MCTS simulations, our approach focuses on the combination of multiple estimators and applies them to all move choices in MCTS simulations. As the MCTS tree itself is just another value estimator – unbiased, but without abstraction – this blurs the traditional distinction between action choices inside and outside of the MCTS tree. Experiments with three abstractors in four board games show significant improvements of ME-MCTS over MCTS using only a single abstractor, both for MCTS with random rollouts as well as for MCTS with static evaluation functions. While we used deterministic, fully observable games, ME-MCTS naturally extends to more challenging settings.
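The core idea of guiding search with several estimators can be sketched as follows. The sample-count-weighted average below is our simplified stand-in for the estimator-averaging schemes the abstract refers to, not ME-MCTS itself; each estimator (e.g. the tree node, a RAVE table, a history heuristic) contributes its mean value and its number of samples.

```python
# Hedged sketch: combine several action value estimates into one guiding
# value by weighting each estimate by its sample count. This is a simple
# illustration of estimator averaging, not the paper's exact scheme.

def combine_estimates(estimates):
    """estimates: list of (mean_value, sample_count) pairs, e.g. from the
    MCTS tree node, a RAVE table, and a history heuristic."""
    total = sum(n for _, n in estimates)
    if total == 0:
        return 0.0  # no information from any estimator yet
    return sum(v * n for v, n in estimates) / total
```

With such a combined value in place of the plain tree estimate, the same scoring rule can be applied to move choices both inside and outside the tree, as the abstract describes.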
Guiding multiplayer MCTS by focusing on yourself
In n-player sequential move games, the second root-player move only appears at tree depth n + 1. Depending on n and the available search time, tree search techniques can struggle to expand the game tree deeply enough to find multi-move plans for the root player, which is often more important for strategic play than considering every possible opponent move in between. The minimax-based Paranoid search and BRS+ algorithms currently achieve state-of-the-art performance, especially at short time settings, by using a generally incorrect opponent model.
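The Paranoid opponent model mentioned above assumes all opponents form a coalition that minimizes the root player's score, reducing the n-player game to a two-player minimax search. A minimal sketch, with toy game trees as nested lists whose numeric leaves give the root player's score:

```python
# Hedged sketch of the Paranoid reduction: only the root player maximizes;
# every opponent is treated as part of a single minimizing coalition.
# Internal nodes are lists of children; leaves are the root player's score.

def paranoid_value(node, player, root_player, n_players):
    """Two-player minimax over an n-player tree under the Paranoid model."""
    if not isinstance(node, list):      # leaf: root player's score
        return node
    next_player = (player + 1) % n_players
    values = [paranoid_value(child, next_player, root_player, n_players)
              for child in node]
    return max(values) if player == root_player else min(values)
```

The model is "generally incorrect" in the abstract's sense because real opponents each maximize their own score rather than jointly minimizing the root player's, but the reduction enables deep alpha-beta-style search.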
Towards explainable MCTS
Monte-Carlo Tree Search (MCTS) is a family of sampling-based search algorithms widely used for online planning in sequential decision-making domains, and at the heart of many recent breakthroughs in AI. Understanding the behavior of MCTS agents is non-trivial for developers and users, as it results from often large and complex search trees, consisting of many simulated possible futures, their evaluations, and relationships to each other. …
- …