3,820 research outputs found
A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work
Helping AI to Play Hearthstone: AAIA'17 Data Mining Challenge
This paper summarizes the AAIA'17 Data Mining Challenge: Helping AI to Play
Hearthstone which was held between March 23, and May 15, 2017 at the Knowledge
Pit platform. We briefly describe the scope and background of this competition
in the context of a more general project related to the development of an AI
engine for video games, called Grail. We also discuss the outcomes of this
challenge and demonstrate how predictive models for the assessment of player's
winning chances can be utilized in a construction of an intelligent agent for
playing Hearthstone. Finally, we show a few selected machine learning
approaches for modeling state and action values in Hearthstone. We provide
evaluation for a few promising solutions that may be used to create more
advanced types of agents, especially in conjunction with Monte Carlo Tree
Search algorithms.Comment: Federated Conference on Computer Science and Information Systems,
Prague (FedCSIS-2017) (Prague, Czech Republic
Biasing MCTS with Features for General Games
This paper proposes using a linear function approximator, rather than a deep
neural network (DNN), to bias a Monte Carlo tree search (MCTS) player for
general games. This is unlikely to match the potential raw playing strength of
DNNs, but has advantages in terms of generality, interpretability and resources
(time and hardware) required for training. Features describing local patterns
are used as inputs. The features are formulated in such a way that they are
easily interpretable and applicable to a wide range of general games, and might
encode simple local strategies. We gradually create new features during the
same self-play training process used to learn feature weights. We evaluate the
playing strength of an MCTS player biased by learnt features against a standard
upper confidence bounds for trees (UCT) player in multiple different board
games, and demonstrate significantly improved playing strength in the majority
of them after a small number of self-play training games.Comment: Accepted at IEEE CEC 2019, Special Session on Games. Copyright of
final version held by IEE
Ensemble decision systems for general video game playing
Ensemble Decision Systems offer a unique form of decision making that allows
a collection of algorithms to reason together about a problem. Each individual
algorithm has its own inherent strengths and weaknesses, and often it is
difficult to overcome the weaknesses while retaining the strengths. Instead of
altering the properties of the algorithm, the Ensemble Decision System augments
the performance with other algorithms that have complementing strengths. This
work outlines different options for building an Ensemble Decision System as
well as providing analysis on its performance compared to the individual
components of the system with interesting results, showing an increase in the
generality of the algorithms without significantly impeding performance.Comment: 8 Pages, Accepted at COG201
Improved Reinforcement Learning with Curriculum
Humans tend to learn complex abstract concepts faster if examples are
presented in a structured manner. For instance, when learning how to play a
board game, usually one of the first concepts learned is how the game ends,
i.e. the actions that lead to a terminal state (win, lose or draw). The
advantage of learning end-games first is that once the actions which lead to a
terminal state are understood, it becomes possible to incrementally learn the
consequences of actions that are further away from a terminal state - we call
this an end-game-first curriculum. Currently the state-of-the-art machine
learning player for general board games, AlphaZero by Google DeepMind, does not
employ a structured training curriculum; instead learning from the entire game
at all times. By employing an end-game-first training curriculum to train an
AlphaZero inspired player, we empirically show that the rate of learning of an
artificial player can be improved during the early stages of training when
compared to a player not using a training curriculum.Comment: Draft prior to submission to IEEE Trans on Games. Changed paper
slightl
Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar
Automatic machine learning is an important problem in the forefront of
machine learning. The strongest AutoML systems are based on neural networks,
evolutionary algorithms, and Bayesian optimization. Recently AlphaD3M reached
state-of-the-art results with an order of magnitude speedup using reinforcement
learning with self-play. In this work we extend AlphaD3M by using a pipeline
grammar and a pre-trained model which generalizes from many different datasets
and similar tasks. Our results demonstrate improved performance compared with
our earlier work and existing methods on AutoML benchmark datasets for
classification and regression tasks. In the spirit of reproducible research we
make our data, models, and code publicly available.Comment: ICML Workshop on Automated Machine Learnin
- …