Beyond Monte Carlo Tree Search: Playing Go with Deep Alternative Neural Network and Long-Term Evaluation
Monte Carlo tree search (MCTS) is extremely popular in computer Go which
determines each action by enormous simulations in a broad and deep search tree.
However, human experts select most actions by pattern analysis and careful
evaluation rather than brute search of millions of future interactions. In this
paper, we propose a computer Go system that follows experts' way of thinking and
playing. Our system consists of two parts. The first part is a novel deep
alternative neural network (DANN) used to generate candidates of next move.
Compared with existing deep convolutional neural network (DCNN), DANN inserts
recurrent layer after each convolutional layer and stacks them in an
alternative manner. We show such setting can preserve more contexts of local
features and its evolutions which are beneficial for move prediction. The
second part is a long-term evaluation (LTE) module used to provide a reliable
evaluation of candidates rather than a single probability from move predictor.
This is consistent with human experts' nature of playing since they can foresee
tens of steps to give an accurate estimation of candidates. In our system, for
each candidate, LTE calculates a cumulative reward after several future
interactions when local variations are settled. Combining criteria from the two
parts, our system determines the optimal choice of next move. For more
comprehensive experiments, we introduce a new professional Go dataset (PGD),
consisting of 253233 professional records. Experiments on GoGoD and PGD
datasets show the DANN can substantially improve performance of move prediction
over pure DCNN. When combining LTE, our system outperforms most relevant
approaches and open engines based on MCTS.
Comment: AAAI 201
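The abstract above says that LTE computes a cumulative reward over several future interactions and that the system combines this with the move predictor's output. It does not say how the two criteria are combined. The following is a minimal sketch under stated assumptions: the names `cumulative_reward` and `choose_move`, the optional `discount` factor, and the weighted-sum rule with `weight` are all hypothetical and are not taken from the paper.

```python
def cumulative_reward(rewards, discount=1.0):
    """Sum the rewards of a simulated continuation.

    `discount` is an assumption; with the default of 1.0 this is a plain sum.
    """
    total = 0.0
    for t, r in enumerate(rewards):
        total += (discount ** t) * r
    return total


def choose_move(candidates, weight=0.5):
    """Pick the candidate with the best combined score.

    `candidates` is a list of (move, prior_probability, future_rewards).
    The weighted sum below is a hypothetical combination rule; the abstract
    only states that criteria from the two parts are combined.
    """
    def score(candidate):
        _, prior, rewards = candidate
        return weight * prior + (1.0 - weight) * cumulative_reward(rewards)

    return max(candidates, key=score)[0]


# Toy usage: "B" has a lower prior but a better long-term evaluation.
candidates = [
    ("A", 0.6, [0.0, 0.1]),
    ("B", 0.3, [0.5, 0.5]),
]
best = choose_move(candidates)
```

In this toy example the move with the lower predictor probability is selected because its evaluated continuation is stronger, which is the kind of trade-off the abstract attributes to combining the two parts.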
Helping AI to Play Hearthstone: AAIA'17 Data Mining Challenge
This paper summarizes the AAIA'17 Data Mining Challenge: Helping AI to Play
Hearthstone which was held between March 23, and May 15, 2017 at the Knowledge
Pit platform. We briefly describe the scope and background of this competition
in the context of a more general project related to the development of an AI
engine for video games, called Grail. We also discuss the outcomes of this
challenge and demonstrate how predictive models for the assessment of player's
winning chances can be utilized in a construction of an intelligent agent for
playing Hearthstone. Finally, we show a few selected machine learning
approaches for modeling state and action values in Hearthstone. We provide
evaluation for a few promising solutions that may be used to create more
advanced types of agents, especially in conjunction with Monte Carlo Tree
Search algorithms.
Comment: Federated Conference on Computer Science and Information Systems,
Prague (FedCSIS-2017) (Prague, Czech Republic
Improved Reinforcement Learning with Curriculum
Humans tend to learn complex abstract concepts faster if examples are
presented in a structured manner. For instance, when learning how to play a
board game, usually one of the first concepts learned is how the game ends,
i.e. the actions that lead to a terminal state (win, lose or draw). The
advantage of learning end-games first is that once the actions which lead to a
terminal state are understood, it becomes possible to incrementally learn the
consequences of actions that are further away from a terminal state - we call
this an end-game-first curriculum. Currently the state-of-the-art machine
learning player for general board games, AlphaZero by Google DeepMind, does not
employ a structured training curriculum, instead learning from the entire game
at all times. By employing an end-game-first training curriculum to train an
AlphaZero inspired player, we empirically show that the rate of learning of an
artificial player can be improved during the early stages of training when
compared to a player not using a training curriculum.
Comment: Draft prior to submission to IEEE Trans on Games. Changed paper
slightl
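One way to picture the end-game-first curriculum described in the abstract above is as a window over each recorded game that starts at the terminal state and widens as training proceeds. This is only a sketch: the linear schedule and the names `curriculum_window` and `select_positions` are assumptions for illustration, not the authors' method.

```python
def curriculum_window(step, total_steps, max_game_len):
    """Number of final positions eligible for training at this step.

    Hypothetical linear schedule: starts at 1 (terminal positions only)
    and grows to the full game length by the last step.
    """
    window = (step + 1) * max_game_len // total_steps
    return max(1, min(max_game_len, window))


def select_positions(game, window):
    """Keep only the last `window` positions, i.e. those nearest the end."""
    return game[-window:]


# Toy usage: positions are represented by their move index.
game = list(range(10))
early = select_positions(game, curriculum_window(0, 10, 10))
middle = select_positions(game, curriculum_window(4, 10, 10))
late = select_positions(game, curriculum_window(9, 10, 10))
```

Early in training only positions adjacent to the terminal state are used; by the end the whole game is used, which matches the baseline of learning from the entire game at all times.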
Building Machines That Learn and Think Like People
Recent progress in artificial intelligence (AI) has renewed interest in
building systems that learn and think like people. Many advances have come from
using deep neural networks trained end-to-end in tasks such as object
recognition, video games, and board games, achieving performance that equals or
even beats humans in some respects. Despite their biological inspiration and
performance achievements, these systems differ from human intelligence in
crucial ways. We review progress in cognitive science suggesting that truly
human-like learning and thinking machines will have to reach beyond current
engineering trends in both what they learn, and how they learn it.
Specifically, we argue that these machines should (a) build causal models of
the world that support explanation and understanding, rather than merely
solving pattern recognition problems; (b) ground learning in intuitive theories
of physics and psychology, to support and enrich the knowledge that is learned;
and (c) harness compositionality and learning-to-learn to rapidly acquire and
generalize knowledge to new tasks and situations. We suggest concrete
challenges and promising routes towards these goals that can combine the
strengths of recent neural network advances with more structured cognitive
models.
Comment: In press at Behavioral and Brain Sciences. Open call for commentary
proposals (until Nov. 22, 2016).
https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/information/calls-for-commentary/open-calls-for-commentar
Deep Decision Trees for Discriminative Dictionary Learning with Adversarial Multi-Agent Trajectories
With the explosion in the availability of spatio-temporal tracking data in
modern sports, there is an enormous opportunity to better analyse, learn and
predict important events in adversarial group environments. In this paper, we
propose a deep decision tree architecture for discriminative dictionary
learning from adversarial multi-agent trajectories. We first build up a
hierarchy for the tree structure by adding each layer and performing feature
weight based clustering in the forward pass. We then fine tune the player role
weights using back propagation. The hierarchical architecture ensures the
interpretability and the integrity of the group representation. The resulting
architecture is a decision tree, with leaf-nodes capturing a dictionary of
multi-agent group interactions. Due to the ample volume of data available, we
focus on soccer tracking data, although our approach can be used in any
adversarial multi-agent domain. We present applications of the proposed method for
simulating soccer games as well as evaluating and quantifying team strategies.
Comment: To appear in 4th International Workshop on Computer Vision in Sports
(CVsports) at CVPR 201