5 research outputs found

    Continuous Upper Con dence Trees

    Get PDF
    International audienceUpper Con dence Trees are a very e cient tool for solving Markov Decision Processes; originating in di cult games like the game of Go, it is in particular surprisingly e cient in high dimensional problems. It is known that it can be adapted to continuous domains in some cases (in particular continuous action spaces). We here present an extension of Upper Con dence Trees to continuous stochastic problems. We (i) show a deceptive problem on which the classical Upper Con dence Tree approach does not work, even with arbitrarily large computational power and with progressive widening (ii) propose an improvement, termed double-progressive widening, which takes care of the compromise between variance (we want in nitely many simulations for each action/state) and bias (we want su ciently many nodes to avoid a bias by the rst nodes) and which extends the classical progressive widening (iii) discuss its consistency and show experimentally that it performs well on the deceptive problem and on experimental benchmarks. We guess that the double-progressive widening trick can be used for other algorithms as well, as a general tool for ensuring a good bias/variance compromise in search algorithms

    A Survey of Monte Carlo Tree Search Methods

    Get PDF
    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work

    Optimal robust expensive optimization is tractable

    Get PDF
    International audienceFollowing a number of recent papers investigating the possibility of optimal comparison-based optimization algorithms for a given distribution of probability on fitness functions, we (i) discuss the comparison-based constraints (ii) choose a setting in which theoretical tight bounds are known (iii) develop a careful implementation using billiard algorithms, Upper Confidence trees and (iv) experimentally test the tractability of the approach. The results, on still very simple cases, show that the approach, yet still preliminary, could be tested successfully until dimension 10 and horizon 50 iterations within a few hours on a standard computer, with convergence rate far better than the best algorithms

    Optimal robust expensive optimization is tractable

    No full text
    International audienceFollowing a number of recent papers investigating the possibility of optimal comparison-based optimization algorithms for a given distribution of probability on fitness functions, we (i) discuss the comparison-based constraints (ii) choose a setting in which theoretical tight bounds are known (iii) develop a careful implementation using billiard algorithms, Upper Confidence trees and (iv) experimentally test the tractability of the approach. The results, on still very simple cases, show that the approach, yet still preliminary, could be tested successfully until dimension 10 and horizon 50 iterations within a few hours on a standard computer, with convergence rate far better than the best algorithms
    corecore