
5 research outputs found

Continuous Upper Confidence Trees

Author: Bonnard Nicolas
Couetoux Adrien
Hoock Jean-Baptiste
Sokolovska Nataliya
Teytaud Olivier
Publication venue: HAL CCSD
Publication date: 17/01/2011

International audience. Upper Confidence Trees are a very efficient tool for solving Markov Decision Processes; originating in difficult games like the game of Go, it is in particular surprisingly efficient in high dimensional problems. It is known that it can be adapted to continuous domains in some cases (in particular continuous action spaces). We here present an extension of Upper Confidence Trees to continuous stochastic problems. We (i) show a deceptive problem on which the classical Upper Confidence Tree approach does not work, even with arbitrarily large computational power and with progressive widening (ii) propose an improvement, termed double-progressive widening, which takes care of the compromise between variance (we want infinitely many simulations for each action/state) and bias (we want sufficiently many nodes to avoid a bias by the first nodes) and which extends the classical progressive widening (iii) discuss its consistency and show experimentally that it performs well on the deceptive problem and on experimental benchmarks. We guess that the double-progressive widening trick can be used for other algorithms as well, as a general tool for ensuring a good bias/variance compromise in search algorithms.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Hal-Diderot

A Survey of Monte Carlo Tree Search Methods

Author: Browne Cameron B
Colton Simon
Cowling Peter I
Lucas Simon M
Perez Diego
Powley Edward
Rohlfshagen Philipp
Samothrakis Spyridon
Tavener Stephen
Whitehouse Daniel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2012

Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.

University of Essex Research Repository

CiteSeerX

Maastricht University Research Portal

Crossref

Optimal robust expensive optimization is tractable

Author: Rolet Philippe
Sebag Michèle
Teytaud Olivier
Publication venue: HAL CCSD
Publication date: 01/01/2009

International audience. Following a number of recent papers investigating the possibility of optimal comparison-based optimization algorithms for a given distribution of probability on fitness functions, we (i) discuss the comparison-based constraints (ii) choose a setting in which theoretical tight bounds are known (iii) develop a careful implementation using billiard algorithms, Upper Confidence trees and (iv) experimentally test the tractability of the approach. The results, on still very simple cases, show that the approach, yet still preliminary, could be tested successfully until dimension 10 and horizon 50 iterations within a few hours on a standard computer, with convergence rate far better than the best algorithms.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1
