Monte-Carlo tree search with heuristic knowledge: A novel way in solving capturing and life and death problems in Go
Monte-Carlo (MC) tree search is a new research field. Its effectiveness in searching large state spaces, such as the Go game tree, is well recognized in the computer Go community. Go domain-specific heuristics and techniques as well as domain-independent heuristics and techniques are systematically investigated in the context of the MC tree search in this dissertation. The search extensions based on these heuristics and techniques can significantly improve the effectiveness and efficiency of the MC tree search.
Two major areas of investigation are addressed in this dissertation research: (I) the identification and use of effective heuristic knowledge in guiding the MC simulations, and (II) the extension of the MC tree search algorithm with heuristics. Go, the most challenging board game for machines, serves as the test bed. The effectiveness of the MC tree search extensions is demonstrated through the performance of Go tactic problem solvers using these techniques.
The main contributions of this dissertation include:
1. A heuristics-based Monte-Carlo tactic tree search framework is proposed to extend the standard Monte-Carlo tree search.
2. (Go) Knowledge-based heuristics are systematically investigated to improve the Monte-Carlo tactic tree search.
3. Pattern learning is demonstrated to be effective in improving the Monte-Carlo tactic tree search.
4. Domain-knowledge-independent tree search enhancements are shown to be effective in improving Monte-Carlo tactic tree search performance.
5. A strong Go tactic solver based on the proposed algorithms outperforms traditional game tree search algorithms.
The techniques developed in this dissertation research can benefit other game domains and application fields.
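The first contribution above, guiding MC simulations with heuristic knowledge, can be sketched as biasing playout move selection toward heuristically promising moves instead of choosing uniformly at random. This is a minimal illustration, not the dissertation's actual framework; the `heuristic` scoring function and all names here are hypothetical.

```python
import random

def heuristic_playout_move(state, legal_moves, heuristic, rng=None):
    """Pick one playout move, sampled with probability proportional to
    its heuristic score rather than uniformly at random. A heuristic
    that scores capturing or life-and-death moves highly steers the
    Monte-Carlo simulations toward tactically relevant lines."""
    rng = rng or random.Random()
    # Clamp scores to a small positive floor so every legal move
    # keeps a nonzero selection probability.
    scores = [max(heuristic(state, m), 1e-9) for m in legal_moves]
    r = rng.random() * sum(scores)
    acc = 0.0
    for move, s in zip(legal_moves, scores):
        acc += s
        if r <= acc:
            return move
    return legal_moves[-1]  # guard against floating-point drift
```

With a heuristic that strongly favors one move, the playout samples that move most of the time while still occasionally exploring the alternatives.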
Monte Carlo Forest Search: UNSAT Solver Synthesis via Reinforcement Learning
We introduce Monte Carlo Forest Search (MCFS), an offline algorithm for
automatically synthesizing strong tree-search solvers for proving
\emph{unsatisfiability} on given distributions, leveraging ideas from the Monte
Carlo Tree Search (MCTS) algorithm that led to breakthroughs in AlphaGo. The
crucial difference between proving unsatisfiability and existing applications
of MCTS is that policies produce trees rather than paths. Rather than finding
a good path (solution) within a tree, the search problem becomes searching for
a small proof tree within a forest of candidate proof trees. We introduce two
key ideas to adapt to this setting. First, we estimate tree size with paths,
via the unbiased approximation from Knuth (1975). Second, we query a strong
solver at a user-defined depth rather than learning a policy across the whole
tree, in order to focus our policy search on early decisions, which offer the
greatest potential for reducing tree size. We then present MCFS-SAT, an
implementation of MCFS for learning branching policies for solving the Boolean
satisfiability (SAT) problem that required many modifications from AlphaGo. We
matched or improved performance over a strong baseline on two well-known SAT
distributions (\texttt{sgen}, \texttt{random}). Notably, we improved running
time by 9\% on \texttt{sgen} over the \texttt{kcnfs} solver and even further
over the strongest UNSAT solver from the 2021 SAT competition.
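The first key idea above, estimating tree size from single paths, follows Knuth's (1975) estimator: walk one random root-to-leaf path, multiplying the branching factors seen along the way; the running sum of these products is an unbiased estimate of the total node count. A minimal sketch (the `children` callback and function names are illustrative, not from the paper's implementation):

```python
import random

def knuth_tree_size(children, root, rng=None):
    """Knuth (1975) tree-size estimation: descend one uniformly random
    root-to-leaf path. At each node, multiply the accumulated weight by
    the branching factor; summing the weights over the path yields an
    unbiased estimator of the number of nodes in the tree."""
    rng = rng or random.Random()
    estimate = 1.0  # counts the root
    weight = 1.0    # inverse probability of reaching the current node
    node = root
    while True:
        kids = children(node)
        if not kids:
            return estimate
        weight *= len(kids)
        estimate += weight
        node = rng.choice(kids)
```

For a uniform tree every path yields the same product, so the estimate is exact; for irregular trees the estimates vary per path but average to the true size.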
Hybridizing Constraint Programming and Monte-Carlo Tree Search: Application to the Job Shop problem
Constraint Programming (CP) solvers classically explore the solution space using tree search-based heuristics. Monte-Carlo Tree-Search (MCTS), a tree-search based method aimed at sequential decision making under uncertainty, simultaneously estimates the reward associated to the sub-trees, and gradually biases the exploration toward the most promising regions. This paper examines the tight combination of MCTS and CP on the job shop problem (JSP). The contribution is twofold. Firstly, a reward function compliant with the CP setting is proposed. Secondly, a biased MCTS node-selection rule based on this reward is proposed, that is suitable in a multiple-restarts context. Its integration within the Gecode constraint solver is shown to compete with JSP-specific CP approaches on difficult JSP instances.
Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups
Monte Carlo Tree Search (MCTS) has improved the performance of game engines
in domains such as Go, Hex, and general game playing. MCTS has been shown to
outperform classic alpha-beta search in games where good heuristic evaluations
are difficult to obtain. In recent years, combining ideas from traditional
minimax search in MCTS has been shown to be advantageous in some domains, such
as Lines of Action, Amazons, and Breakthrough. In this paper, we propose a new
way to use heuristic evaluations to guide the MCTS search by storing the two
sources of information, estimated win rates and heuristic evaluations,
separately. Rather than using the heuristic evaluations to replace the
playouts, our technique backs them up implicitly during the MCTS simulations.
These minimax values are then used to guide future simulations. We show that
using implicit minimax backups leads to stronger play performance in Kalah,
Breakthrough, and Lines of Action.
Comment: 24 pages, 7 figures, 9 tables, expanded version of paper presented at
the IEEE Conference on Computational Intelligence and Games (CIG) 2014 conference.
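The scheme described in this abstract keeps two statistics per node, an estimated win rate from playouts and an implicitly backed-up minimax value from heuristic evaluations, and blends them during selection. A minimal sketch, assuming a negamax convention and a tunable blending weight `alpha` (the class layout and names here are illustrative, not the paper's code):

```python
import math

class Node:
    def __init__(self, heuristic_value=0.0):
        self.visits = 0
        self.wins = 0.0
        # Implicit minimax value, initialized from a static heuristic
        # evaluation and refreshed by backups during search.
        self.minimax = heuristic_value
        self.children = []

def backup_minimax(node):
    """Implicit minimax backup: a node's value becomes the maximum of
    its children's negated values (negamax), propagated alongside the
    usual win-rate statistics after each simulation."""
    if node.children:
        node.minimax = max(-ch.minimax for ch in node.children)

def select(node, alpha=0.4, c=1.0):
    """Selection uses a blend of the two information sources:
    (1 - alpha) * win rate + alpha * minimax value, plus the usual
    UCB1 exploration bonus."""
    def score(ch):
        q = ch.wins / max(ch.visits, 1)
        blended = (1 - alpha) * q + alpha * ch.minimax
        return blended + c * math.sqrt(
            math.log(max(node.visits, 1)) / max(ch.visits, 1))
    return max(node.children, key=score)
```

Setting `alpha = 0` recovers plain MCTS selection; larger values lean harder on the heuristic minimax information.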
A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
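The core algorithm surveyed here most commonly pairs MCTS with the UCB1 selection rule (the UCT variant): each child is scored by its mean reward plus an exploration bonus that shrinks as the child is visited. A minimal sketch of that selection score (names are illustrative):

```python
import math

class Stats:
    """Per-child statistics maintained by the tree search."""
    def __init__(self, visits=0, total=0.0):
        self.visits = visits  # times this child was selected
        self.total = total    # cumulative reward from simulations

def uct(parent_visits, child, c=1.4):
    """UCB1 applied to trees (UCT): exploitation term (mean reward)
    plus an exploration term weighted by constant c."""
    if child.visits == 0:
        return float("inf")  # always try unvisited children first
    exploit = child.total / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore
```

Selection repeatedly descends to the child maximizing this score; the other three MCTS phases (expansion, simulation, backpropagation) update `visits` and `total` along the traversed path.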
The 2014 International Planning Competition: Progress and Trends
We review the 2014 International Planning Competition (IPC-2014), the eighth
in a series of competitions starting in 1998. IPC-2014 was held in three separate
parts to assess state-of-the-art in three prominent areas of planning research: the
deterministic (classical) part (IPCD), the learning part (IPCL), and the probabilistic
part (IPPC). Each part evaluated planning systems in ways that pushed the edge of
existing planner performance by introducing new challenges, novel tasks, or both.
The competition again surpassed its predecessor in the number of competitors,
highlighting the competition's central role in shaping the landscape of ongoing
developments in evaluating planning systems.
Online algorithms for POMDPs with continuous state, action, and observation spaces
Online solvers for partially observable Markov decision processes have been
applied to problems with large discrete state spaces, but continuous state,
action, and observation spaces remain a challenge. This paper begins by
investigating double progressive widening (DPW) as a solution to this
challenge. However, we prove that this modification alone is not sufficient
because the belief representations in the search tree collapse to a single
particle causing the algorithm to converge to a policy that is suboptimal
regardless of the computation time. This paper proposes and evaluates two new
algorithms, POMCPOW and PFT-DPW, that overcome this deficiency by using
weighted particle filtering. Simulation results show that these modifications
allow the algorithms to be successful where previous approaches fail.
Comment: Added Multilane section.
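The weighted particle filtering that both proposed algorithms rely on keeps the belief as a set of state particles with importance weights: each observation rescales every particle's weight by its observation likelihood, preventing the belief from collapsing to a single particle. A minimal sketch of that update step (the `obs_likelihood` callback is an assumed model interface, not the paper's API):

```python
def reweight(particles, weights, obs_likelihood):
    """Weighted particle-filter belief update: scale each particle's
    weight by the likelihood of the latest observation under that
    particle's state, then renormalize so the weights sum to one."""
    scaled = [w * obs_likelihood(p) for p, w in zip(particles, weights)]
    total = sum(scaled)
    if total == 0.0:
        raise ValueError("observation has zero likelihood under all particles")
    return [w / total for w in scaled]
```

A full POMCPOW/PFT-DPW implementation would combine this with progressive widening on actions and observations; the update above is only the belief-representation piece.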