Finding Competitive Network Architectures Within a Day Using UCT
The design of neural network architectures for a new data set is a laborious
task which requires human deep learning expertise. In order to make deep
learning available for a broader audience, automated methods for finding a
neural network architecture are vital. Recently proposed methods can already
achieve human-expert-level performance. However, these methods have run times
of months or even years of GPU computing time, ignoring the hardware constraints
faced by many researchers and companies. We propose the use of Monte Carlo
planning in combination with two different UCT (upper confidence bound applied
to trees) derivations to search for network architectures. We adapt the UCT
algorithm to the needs of network architecture search by proposing two ways of
sharing information between different branches of the search tree. In an
empirical study, we demonstrate that this method finds
competitive networks for MNIST, SVHN and CIFAR-10 in just a single GPU day.
Extending the search time to five GPU days, we outperform human-designed
architectures and competitors that consider the same types of layers.
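As a rough illustration of the selection rule that UCT builds on, the sketch below scores child nodes with the standard UCB1 formula (mean reward plus an exploration bonus). The function names, the tuple layout, and the exploration constant `c` are assumptions made for this example; the sketch does not implement the two UCT derivations or the information sharing between branches that the abstract mentions.

```python
import math

def ucb1_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value of one child; unvisited children score +inf so they are tried first."""
    if visits == 0:
        return math.inf
    mean = total_reward / visits
    # Exploration bonus shrinks as this child accumulates visits.
    return mean + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children):
    """children: list of (name, total_reward, visits) tuples (hypothetical layout).

    Returns the name of the child with the highest UCB1 score.
    """
    parent_visits = sum(visits for _, _, visits in children)
    best = max(children, key=lambda ch: ucb1_score(ch[1], ch[2], parent_visits))
    return best[0]
```

In an architecture search, each child could stand for one candidate layer choice and the reward for a validation accuracy; that mapping is likewise an assumption of this sketch.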
Beyond Monte Carlo Tree Search: Playing Go with Deep Alternative Neural Network and Long-Term Evaluation
Monte Carlo tree search (MCTS) is extremely popular in computer Go, where it
determines each action by running enormous numbers of simulations in a broad and deep search tree.
However, human experts select most actions by pattern analysis and careful
evaluation rather than brute-force search of millions of future interactions. In this
paper, we propose a computer Go system that follows experts' way of thinking and
playing. Our system consists of two parts. The first part is a novel deep
alternative neural network (DANN) used to generate candidates for the next move.
Compared with the existing deep convolutional neural network (DCNN), DANN inserts
a recurrent layer after each convolutional layer and stacks them in an
alternating manner. We show that this setting preserves more context of local
features and their evolution, which is beneficial for move prediction. The
second part is a long-term evaluation (LTE) module used to provide a reliable
evaluation of candidates rather than a single probability from the move predictor.
This is consistent with human experts' nature of playing, since they can foresee
tens of steps to give an accurate estimation of candidates. In our system, for
each candidate, LTE calculates a cumulative reward after several future
interactions when local variations are settled. Combining criteria from the two
parts, our system determines the optimal choice of the next move. For more
comprehensive experiments, we introduce a new professional Go dataset (PGD),
consisting of 253,233 professional records. Experiments on the GoGoD and PGD
datasets show that DANN can substantially improve move prediction performance
over pure DCNN. When combined with LTE, our system outperforms most relevant
approaches and open engines based on MCTS.
Comment: AAAI 201
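The idea of scoring each candidate by a cumulative reward over several future interactions, and then combining that with the move predictor's probability, might be sketched as follows. Everything here is hypothetical: the discount factor, the weighted-sum combination, the `weight` parameter, and the data layout are assumptions for illustration, since the abstract does not specify how the two criteria are combined.

```python
def cumulative_reward(rewards, discount=1.0):
    """Sum of per-step rewards over a lookahead, optionally discounted (assumed form)."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (discount ** step) * reward
    return total

def pick_move(candidates, weight=0.5):
    """candidates: dict mapping move -> (prior_probability, future_rewards).

    Combines the predictor's prior with the long-term value by a weighted sum,
    which is an assumed rule, not the one used in the paper.
    """
    def score(move):
        prior, rewards = candidates[move]
        return weight * prior + (1 - weight) * cumulative_reward(rewards)
    return max(candidates, key=score)
```

With this rule, a move with a lower prior can still be chosen when its lookahead rewards are high enough, which is the behaviour the long-term evaluation is meant to allow.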
Department of Computer Science and Engineering
Recently, deep reinforcement learning (DRL) algorithms have shown superhuman performance in simulated game domains. In practice, sample efficiency is also one of the most important measures of a model's performance. In environments with large search spaces (e.g. continuous action spaces) especially, it is critical for achieving state-of-the-art performance.
In this thesis, we design a model that is applicable to multi-end games in continuous space with high sample efficiency. A multi-end game consists of several sub-games that are independent of each other but affect the result of the game through rules of its domain. We verify the algorithm in a simulated curling environment.
Helping AI to Play Hearthstone: AAIA'17 Data Mining Challenge
This paper summarizes the AAIA'17 Data Mining Challenge: Helping AI to Play
Hearthstone, which was held between March 23 and May 15, 2017, at the Knowledge
Pit platform. We briefly describe the scope and background of this competition
in the context of a more general project related to the development of an AI
engine for video games, called Grail. We also discuss the outcomes of this
challenge and demonstrate how predictive models for the assessment of a player's
winning chances can be utilized in the construction of an intelligent agent for
playing Hearthstone. Finally, we show a few selected machine learning
approaches for modeling state and action values in Hearthstone. We provide
evaluation for a few promising solutions that may be used to create more
advanced types of agents, especially in conjunction with Monte Carlo Tree
Search algorithms.
Comment: Federated Conference on Computer Science and Information Systems,
Prague (FedCSIS-2017) (Prague, Czech Republic