Beyond Monte Carlo Tree Search: Playing Go with Deep Alternative Neural Network and Long-Term Evaluation

Author: Gao Wen
Wang Jinzhuo
Wang Ronggang
Wang Wenmin
Publication date: 12/02/2017

Monte Carlo tree search (MCTS) is extremely popular in computer Go, where it determines each action by running an enormous number of simulations in a broad and deep search tree. However, human experts select most actions by pattern analysis and careful evaluation rather than brute-force search of millions of future interactions. In this paper, we propose a computer Go system that follows experts' way of thinking and playing. Our system consists of two parts. The first part is a novel deep alternative neural network (DANN) used to generate candidates for the next move. Compared with existing deep convolutional neural networks (DCNN), DANN inserts a recurrent layer after each convolutional layer and stacks them in an alternating manner. We show that this setting can preserve more context of local features and their evolution, which is beneficial for move prediction. The second part is a long-term evaluation (LTE) module used to provide a reliable evaluation of candidates rather than a single probability from the move predictor. This is consistent with human experts' nature of playing, since they can foresee tens of steps to give an accurate estimation of candidates. In our system, for each candidate, LTE calculates a cumulative reward after several future interactions, when local variations are settled. Combining criteria from the two parts, our system determines the optimal choice of next move. For more comprehensive experiments, we introduce a new professional Go dataset (PGD), consisting of 253233 professional records. Experiments on the GoGoD and PGD datasets show that DANN can substantially improve move prediction performance over pure DCNN. When combined with LTE, our system outperforms most relevant approaches and open engines based on MCTS. Comment: AAAI 201
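
The abstract above says that LTE computes a cumulative reward per candidate and that the criteria from the two parts are combined, but it does not give the formula. The following is only a minimal sketch under stated assumptions: the discounted sum, the linear blend, and the names `cumulative_reward`, `choose_move`, `weight`, and `discount` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def cumulative_reward(rewards: List[float], discount: float = 1.0) -> float:
    """Discounted sum of per-step rewards along one simulated continuation."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (discount ** step) * reward
    return total


def choose_move(priors: Dict[str, float],
                rollouts: Dict[str, List[float]],
                weight: float = 0.5,
                discount: float = 0.9) -> str:
    """Pick the candidate with the best blend of prior and long-term score.

    ASSUMPTION: the linear blend and `weight` are illustrative only; the
    abstract states that the two criteria are combined, not how.
    """
    def score(move: str) -> float:
        long_term = cumulative_reward(rollouts[move], discount)
        return (1.0 - weight) * priors[move] + weight * long_term

    return max(priors, key=score)


# Hypothetical candidates: move-predictor probabilities and per-step rewards.
priors = {"D4": 0.6, "Q16": 0.4}
rollouts = {"D4": [0.0, 0.0, 0.0], "Q16": [0.0, 1.0, 1.0]}
best = choose_move(priors, rollouts)
```

In this toy input the predictor prefers "D4", while the long-term score favours "Q16", so the blended choice depends on `weight`.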

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Recurrent Models of Visual Attention

Author: Alex Graves
Google Deepmind
Koray Kavukcuoglu
Nicolas Heess
Volodymyr Mnih
Publication date: 24/06/2014

Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels. We present a novel recurrent neural network model that is capable of extracting information from an image or video by adaptively selecting a sequence of regions or locations and only processing the selected regions at high resolution. Like convolutional neural networks, the proposed model has a degree of translation invariance built in, but the amount of computation it performs can be controlled independently of the input image size. While the model is non-differentiable, it can be trained using reinforcement learning methods to learn task-specific policies. We evaluate our model on several image classification tasks, where it significantly outperforms a convolutional neural network baseline on cluttered images, and on a dynamic visual control problem, where it learns to track a simple object without an explicit training signal for doing so.
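
To illustrate why processing only a selected region decouples cost from image size, here is a minimal sketch of extracting one fixed-size patch around a chosen location. The helper `extract_glimpse`, the zero padding at the borders, and the single-resolution patch are assumptions made for this example, not details given in the abstract.

```python
from typing import List

Image = List[List[float]]


def extract_glimpse(image: Image, center_row: int, center_col: int,
                    size: int) -> Image:
    """Return a size x size patch around the given centre.

    Pixels that fall outside the image are filled with 0.0 (an assumed
    padding choice). The loops run size * size times, so the work does
    not grow with the dimensions of `image`.
    """
    half = size // 2
    height = len(image)
    width = len(image[0]) if height else 0
    patch = []
    for r in range(center_row - half, center_row - half + size):
        row = []
        for c in range(center_col - half, center_col - half + size):
            inside = 0 <= r < height and 0 <= c < width
            row.append(image[r][c] if inside else 0.0)
        patch.append(row)
    return patch


# Hypothetical 10 x 10 image whose pixel value encodes its position.
image = [[float(r * 10 + c) for c in range(10)] for r in range(10)]
corner = extract_glimpse(image, center_row=0, center_col=0, size=3)
middle = extract_glimpse(image, center_row=5, center_col=5, size=3)
```

A recurrent controller would pick the next `(center_row, center_col)` from what it has seen so far; that part is omitted here.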

arXiv.org e-Print Archive

CiteSeerX

Deep Sequential Neural Network

Author: Denoyer Ludovic
Gallinari Patrick
Publication date: 02/10/2014

Neural networks sequentially build high-level features through their successive layers. We propose here a new neural network model where each layer is associated with a set of candidate mappings. When an input is processed, at each layer, one mapping among these candidates is selected according to a sequential decision process. The resulting model is structured according to a DAG-like architecture, so that a path from the root to a leaf node defines a sequence of transformations. Instead of considering global transformations, as in classical multilayer networks, this model allows us to learn a set of local transformations. It is thus able to process data with different characteristics through specific sequences of such local transformations, increasing the expressive power of this model w.r.t. a classical multilayer network. The learning algorithm is inspired by policy gradient techniques from the reinforcement learning domain and is used here instead of the classical back-propagation-based gradient descent techniques. Experiments on different datasets show the relevance of this approach.
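
The per-layer selection described above can be sketched as sampling one candidate mapping at each layer and recording the log-probability of the resulting path, which is the quantity a policy gradient estimator would weight by a reward. This is a simplified illustration, not the authors' model: the names `forward` and `selector_scores` are hypothetical, inputs are scalars, and the selection scores are fixed per layer rather than computed from the input.

```python
import math
import random
from typing import Callable, List, Tuple

Mapping = Callable[[float], float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x: float,
            layers: List[List[Mapping]],
            selector_scores: List[List[float]],
            rng: random.Random) -> Tuple[float, List[int], float]:
    """Sample one mapping per layer and apply it.

    Returns the output, the chosen path (one index per layer), and the
    log-probability of that path under the selection distribution.
    """
    path: List[int] = []
    log_prob = 0.0
    for candidates, scores in zip(layers, selector_scores):
        probs = softmax(scores)
        choice = rng.choices(range(len(candidates)), weights=probs)[0]
        x = candidates[choice](x)
        path.append(choice)
        log_prob += math.log(probs[choice])
    return x, path, log_prob


# Two layers with two hypothetical candidate mappings each.
layers = [
    [lambda v: v + 1.0, lambda v: v * 2.0],
    [lambda v: v - 3.0, lambda v: v * v],
]
selector_scores = [[0.0, 0.0], [0.0, 0.0]]  # uniform selection (assumed)
output, path, log_prob = forward(2.0, layers, selector_scores,
                                 random.Random(0))
```

With uniform scores every root-to-leaf path is equally likely; training would adjust the scores so that paths leading to lower loss are sampled more often.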

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot

Manifold-Based Robot Motion Generation

Author: Kobayashi Yuichi
Publication venue: IntechOpen
Publication date: 05/11/2018

To make an autonomous robot system more adaptive to human-centered environments, it is effective to let the robot collect sensor values by itself and build a controller to reach a desired configuration autonomously. Multiple sensors are often available to estimate the state of the robot, but they pose two problems: (1) the sensing ranges of the sensors might not overlap with each other and (2) the sensor variables can be redundant with respect to the original state space. Regarding the first problem, a local coordinate definition based on a sensor value and its extension to the unobservable region is presented. This technique helps the robot to estimate the sensor variable outside of its observation range and to integrate regions of two sensors that do not overlap. As a solution to the second problem, a grid-based estimation of a lower-dimensional subspace is presented. This manifold estimation gives the robot a compact representation, and thus the proposed motion generation method can be applied to the redundant sensor system. In the case of image feature spaces with a high-dimensional sensor signal, a manifold estimation-based mapping, known as locally linear embedding (LLE), was applied to estimate the distance between the robot body and an obstacle.

IntechOpen