A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work
Assessing the Potential of Classical Q-learning in General Game Playing
After the recent groundbreaking results of AlphaGo and AlphaZero, we have
seen strong interest in deep reinforcement learning and artificial general
intelligence (AGI) in game playing. However, deep learning is
resource-intensive and the theory is not yet well developed. For small games,
simple classical table-based Q-learning might still be the algorithm of choice.
General Game Playing (GGP) provides a good testbed for reinforcement learning
to research AGI. Q-learning is one of the canonical reinforcement learning
methods, and has been used by (Banerjee & Stone, IJCAI 2007) in GGP. In this
paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe,
Connect Four, Hex; source code: https://github.com/wh1992v/ggp-rl), to
allow comparison to Banerjee et al. We find that Q-learning converges to a
high win rate in GGP. For the ϵ-greedy strategy, we propose a first
enhancement, the dynamic ϵ algorithm. In addition, inspired by (Gelly &
Silver, ICML 2007) we combine online search (Monte Carlo Search) to
enhance offline learning, and propose QM-learning for GGP. Both enhancements
improve the performance of classical Q-learning. In this work, GGP allows us to
show that classical table-based Q-learning, if augmented by appropriate
enhancements, can perform well in small games.
Comment: arXiv admin note: substantial text overlap with arXiv:1802.0594
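As a rough illustration of the table-based Q-learning with an ϵ-greedy strategy described in the abstract above, the following Python sketch may help. The abstract does not say how the dynamic ϵ is computed, so the linear decay in dynamic_epsilon is an assumption, as are all function names, parameters and default values (alpha, gamma, start, end). It is not the authors' implementation.

```python
import random
from collections import defaultdict


def dynamic_epsilon(episode, total_episodes, start=1.0, end=0.05):
    """Hypothetical schedule (an assumption): decay epsilon linearly
    from `start` at the first episode to `end` at the last one."""
    frac = min(episode / max(total_episodes - 1, 1), 1.0)
    return start + (end - start) * frac


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise the
    action with the highest current Q-value for this state."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Standard one-step tabular Q-learning update.
    A terminal next state is represented by an empty `next_actions`."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


# Tiny usage example on made-up states and actions.
q_table = defaultdict(float)          # unseen (state, action) pairs start at 0
rng = random.Random(0)
eps = dynamic_epsilon(episode=0, total_episodes=100)
move = epsilon_greedy(q_table, "start", ["left", "right"], eps, rng)
q_update(q_table, "start", move, reward=1.0, next_state="end", next_actions=[])
```

Early episodes explore almost uniformly (ϵ near 1), while later episodes mostly exploit the learned table; any monotone schedule could be swapped in for the linear one.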
Shallow decision-making analysis in General Video Game Playing
The General Video Game AI competitions have been the testing ground for
several techniques for game playing, such as evolutionary computation
techniques, tree search algorithms, and hyper-heuristic-based or knowledge-based
algorithms. So far the metrics used to evaluate the performance of agents have
been win ratio, game score and length of games. In this paper we provide a
wider set of metrics and a comparison method for evaluating and comparing
agents. The metrics and the comparison method give shallow introspection into
the agent's decision-making process, and they can be applied to any agent
regardless of its algorithmic nature. In this work, the metrics and the
comparison method are used to measure the impact of the terms that compose a
tree policy of an MCTS-based agent, compared with several baseline agents. The
results clearly show how promising such a general approach is and how it can be
useful for understanding the behaviour of an AI agent, in particular how the
comparison with baseline agents can help in understanding the shape of the
agent's decision landscape. The presented metrics and comparison method
represent a step toward more descriptive ways of logging and analysing agents'
behaviours
Exploiting Game Decompositions in Monte Carlo Tree Search
In this paper, we propose a variation of the MCTS framework that searches several trees to exploit game decompositions. Our Multiple Tree MCTS (MT-MCTS) approach simultaneously builds multiple MCTS trees corresponding to the different sub-games and, like MCTS algorithms, allows moves to be evaluated while playing. We apply MT-MCTS to decomposed games in the General Game Playing framework. We present encouraging results on single-player games showing that this approach is promising and opens new avenues for further research in the domain of decomposition exploitation. Complex compound games are solved from 2 times faster (Incredible) up to 25 times faster (Nonogram)
Enhancing automated red teaming with Monte Carlo Tree Search
This study investigated novel Automated Red Teaming methods that support replanning. Traditional Automated Red Teaming (ART) approaches usually use evolutionary computing methods to evolve plans through simulations. A drawback of these approaches is the inability to change a team’s strategy partway through a simulation. This study focussed on a Monte Carlo Tree Search (MCTS) method in an ART environment that supports replanning to lead to better strategy decisions and a higher average score