Multiple Overlapping Tiles for Contextual Monte Carlo Tree Search

Author: E.A. Sherstov
G. Chaslot
J. Pearl
L. Kocsis
P. Auer
R. Sutton
T. Lai
Publication venue: HAL CCSD
Publication date: 01/01/2010

Monte Carlo Tree Search is a recent algorithm that has achieved a growing number of successes in various domains. We propose an improvement of the Monte Carlo part of the algorithm by modifying the simulations depending on the context. The modification is based on a reward function learned on a tiling of the space of Monte Carlo simulations. The tiling is done by regrouping the Monte Carlo simulations where two moves have been selected by one player. We show that it is very efficient by experimenting on the game of Havannah.
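The tiling idea in this abstract can be illustrated with a small sketch. It assumes that a tile is identified by an unordered pair of moves played by the same player in one simulation, and that the learned reward of a tile is the mean outcome of the simulations falling into it. The class name `TileStats`, its methods, and the default value of 0.5 for unseen tiles are hypothetical; the abstract does not say how the authors store these statistics or how they use them to modify simulations.

```python
from collections import defaultdict
from itertools import combinations


class TileStats:
    """Mean simulation reward per tile, where a tile is a pair of moves
    played by the same player (an assumption made for this sketch)."""

    def __init__(self):
        self.total_reward = defaultdict(float)
        self.visits = defaultdict(int)

    def update(self, player_moves, reward):
        # One simulation belongs to every tile formed by two of the
        # player's moves, so the tiles overlap.
        for pair in combinations(sorted(set(player_moves)), 2):
            self.visits[pair] += 1
            self.total_reward[pair] += reward

    def value(self, move_a, move_b, default=0.5):
        # Return the mean reward of the tile, or a neutral default
        # when the tile has never been visited.
        pair = tuple(sorted((move_a, move_b)))
        n = self.visits.get(pair, 0)
        return self.total_reward[pair] / n if n else default


stats = TileStats()
stats.update(["a", "b", "c"], 1.0)  # simulation won by the player
stats.update(["a", "b"], 0.0)       # simulation lost by the player
print(stats.value("a", "b"), stats.value("a", "c"))
```

A playout policy could then prefer moves whose tiles with already-played moves have a high mean reward; that step is left out here because the abstract does not specify it.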

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

HAL-Polytechnique

HAL-Rennes 1

A Survey of Monte Carlo Tree Search Methods

Author: Browne Cameron B
Colton Simon
Cowling Peter I
Lucas Simon M
Perez Diego
Powley Edward
Rohlfshagen Philipp
Samothrakis Spyridon
Tavener Stephen
Whitehouse Daniel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2012

Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.

University of Essex Research Repository

CiteSeerX

Maastricht University Research Portal

Crossref

Automatic Feature Engineering through Monte Carlo Tree Search

Author: Beigl Michael
Fang Likun
Hefenbrock Michael
Huang Yiran
Riedel Till
Zhou Yexu
Publication venue: arXiv
Publication date: 16/11/2022

The performance of machine learning models depends heavily on the feature space and feature engineering. Although neural networks have made significant progress in learning latent feature spaces from data, compositional feature engineering through nested feature transformations can reduce model complexity and can be particularly desirable for interpretability. To find suitable transformations automatically, state-of-the-art methods model the feature transformation space by graph structures and use heuristics such as $\epsilon$-greedy to search for them. Such search strategies tend to become less efficient over time because they do not consider the sequential information of the candidate sequences and cannot dynamically adjust the heuristic strategy. To address these shortcomings, we propose a reinforcement learning-based automatic feature engineering method, which we call Monte Carlo tree search Automatic Feature Engineering (mCAFE). We employ a surrogate model that can capture the sequential information contained in the transformation sequence and thus can dynamically adjust the exploration strategy. It balances exploration and exploitation by Thompson sampling and uses a Long Short Term Memory (LSTM) based surrogate model to estimate sequences of promising transformations. In our experiments, mCAFE outperformed state-of-the-art automatic feature engineering methods on most common benchmark datasets.
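To make the exploration strategy named in this abstract concrete, here is a minimal sketch of Thompson sampling on a Bernoulli bandit. It is not the mCAFE method: the paper pairs Thompson sampling with an LSTM surrogate, whereas this sketch assumes independent Beta posteriors per arm. The function names `thompson_select` and `run_bandit` and the reward rates are hypothetical.

```python
import random


def thompson_select(successes, failures, rng):
    """Sample each arm's Beta(s + 1, f + 1) posterior and pick the largest draw."""
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_rates, steps, seed=0):
    """Simulate a Bernoulli bandit and return the pull count of each arm."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(steps):
        arm = thompson_select(successes, failures, rng)
        # Draw a Bernoulli reward with the arm's (hidden) success rate.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


pulls = run_bandit([0.2, 0.5, 0.8], steps=2000)
print(pulls)  # most pulls should concentrate on the last arm
```

Unlike $\epsilon$-greedy, which explores at a fixed rate, the amount of exploration here shrinks on its own as the posteriors sharpen.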

KITopen

Learning a Move-Generator for Upper Confidence Trees

Author: Couetoux Adrien
Doghmen Hassen
Teytaud Olivier
Publication venue: HAL CCSD
Publication date: 12/12/2012

We experiment with the introduction of machine learning tools to improve Monte-Carlo Tree Search. More precisely, we propose the use of Direct Policy Search, a classical reinforcement learning paradigm, to learn the Monte-Carlo Move Generator. We test our algorithm on different forms of unit commitment problems, including experiments on a problem with both macrolevel and microlevel decisions.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Deep learning for semantic segmentation of airborne laser scanning point clouds

Author: Lin Yaping
Publication venue: University of Twente
Publication date: 01/01/2022

University of Twente Research Information

On Monte Carlo Tree Search and Reinforcement Learning

Author: Samothrakis Spyridon
Šter Branko
Vodopivec Tom
Publication venue: AI Access Foundation
Publication date: 20/12/2017

Fuelled by successes in Computer Go, Monte Carlo tree search (MCTS) has achieved widespread adoption within the games community. Its links to traditional reinforcement learning (RL) methods have been outlined in the past; however, the use of RL techniques within tree search has not been thoroughly studied yet. In this paper we re-examine in depth this close relation between the two fields; our goal is to improve the cross-awareness between the two communities. We show that a straightforward adaptation of RL semantics within tree search can lead to a wealth of new algorithms, for which the traditional MCTS is only one of the variants. We confirm that planning methods inspired by RL in conjunction with online search demonstrate encouraging results on several classic board games and in arcade video game competitions, where our algorithm recently ranked first. Our study promotes a unified view of learning, planning, and search.

University of Essex Research Repository

Crossref

Forest point processes for the automatic extraction of networks in raster data

Author: Brenner Claus
Heipke Christian
Lafarge Florent
Rottensteiner Franz
Schmidt Alena
Publication venue: Elsevier BV
Publication date: 01/01/2017

In this paper, we propose a new stochastic approach for the automatic detection of network structures in raster data. We represent a network as a set of trees with acyclic planar graphs. We embed this model in the probabilistic framework of spatial point processes and determine the most probable configuration of trees by stochastic sampling. That is, different configurations are constructed randomly by modifying the graph parameters and by adding or removing nodes and edges to or from the current trees. Each configuration is evaluated based on the probabilities for these changes and an energy function describing the conformity with a predefined model. By using the Reversible jump Markov chain Monte Carlo sampler, an approximation of the global optimum of the energy function is iteratively reached. Although our main target application is the extraction of rivers and tidal channels in digital terrain models, experiments with other types of networks in images show the transferability to further applications. Qualitative and quantitative evaluations demonstrate the competitiveness of our approach with respect to existing algorithms.

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Application of advanced tree search and proximal policy optimization on formula-E race strategy development

Author: Auger Daniel
Fotouhi Abbas
Liu Xuze
Publication venue: Elsevier BV
Publication date: 25/02/2022

Energy and thermal management is a crucial element in Formula-E race strategy development. Most published literature focuses on the optimal management strategy for a single lap and results in sub-optimal solutions to the larger multi-lap problem. In this study, two Monte Carlo Tree Search (MCTS) enhancement techniques are proposed for multi-lap Formula-E racing strategy development. It is shown that using the bivariate Gaussian distribution enhancement, race finishing time improves by at least 0.25% and its variance reduces by more than 26%. Compared to the published conventional MCTS technique used in multi-lap problems, the proposed technique is shown to bring a remarkable enhancement with no additional computational time cost. By further enhancing the MCTS using proximal policy optimization, the final product is capable of generating race time solutions more than 0.5% quicker and improving the consistency by over 90%, which makes it a very suitable method, particularly when enough training time is guaranteed.

Cranfield CERES

Probabilistic and Deep Learning Algorithms for the Analysis of Imagery Data

Author: Basu Saikat
Publication venue: LSU Digital Commons
Publication date: 01/01/2016

Accurate object classification is a challenging problem for various low to high resolution imagery data. This applies to both natural as well as synthetic image datasets. However, each object recognition dataset poses its own distinct set of domain-specific problems. In order to address these issues, we need to devise intelligent learning algorithms which require a deep understanding and careful analysis of the feature space. In this thesis, we introduce three new learning frameworks for the analysis of both airborne images (NAIP dataset) and handwritten digit datasets without and with noise (MNIST and n-MNIST respectively). First, we propose a probabilistic framework for the analysis of the NAIP dataset which includes (1) an unsupervised segmentation module based on the Statistical Region Merging algorithm, (2) a feature extraction module that extracts a set of standard hand-crafted texture features from the images, (3) a supervised classification algorithm based on Feedforward Backpropagation Neural Networks, and (4) a structured prediction framework using Conditional Random Fields that integrates the results of the segmentation and classification modules into a single composite model to generate the final class labels. Next, we introduce two new datasets SAT-4 and SAT-6 sampled from the NAIP imagery and use them to evaluate a multitude of Deep Learning algorithms including Deep Belief Networks (DBN), Convolutional Neural Networks (CNN) and Stacked Autoencoders (SAE) for generating class labels. Finally, we propose a learning framework by integrating hand-crafted texture features with a DBN. A DBN uses an unsupervised pre-training phase to perform initialization of the parameters of a Feedforward Backpropagation Neural Network to a global error basin which can then be improved using a round of supervised fine-tuning using Feedforward Backpropagation Neural Networks. These networks can subsequently be used for classification. 
In the following discussion, we show that the integration of hand-crafted features with DBN shows significant improvement in performance as compared to traditional DBN models which take raw image pixels as input. We also investigate why this integration proves to be particularly useful for aerial datasets using a statistical analysis based on Distribution Separability Criterion. Then we introduce a new dataset called noisy-MNIST (n-MNIST) by adding (1) additive white Gaussian noise (AWGN), (2) motion blur and (3) reduced contrast and AWGN to the MNIST dataset and present a learning algorithm by combining probabilistic quadtrees and Deep Belief Networks. This dynamic integration of the Deep Belief Network with the probabilistic quadtrees provides significant improvement over traditional DBN models on both the MNIST and the n-MNIST datasets. Finally, we extend our experiments on aerial imagery to the class of general texture images and present a theoretical analysis of Deep Neural Networks applied to texture classification. We derive the size of the feature space of textural features and also derive the Vapnik-Chervonenkis dimension of certain classes of Neural Networks. We also derive some useful results on intrinsic dimension and relative contrast of texture datasets and use these to highlight the differences between texture datasets and general object recognition datasets.

Louisiana State University