
5,640 research outputs found

A Survey of Monte Carlo Tree Search Methods

Author: Browne Cameron B
Colton Simon
Cowling Peter I
Lucas Simon M
Perez Diego
Powley Edward
Rohlfshagen Philipp
Samothrakis Spyridon
Tavener Stephen
Whitehouse Daniel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.

University of Essex Research Repository

CiteSeerX

Maastricht University Research Portal

Efficient likelihood evaluation of state-space representations

Author: DeJong David Neil
Dharmarajan Hariharan
Liesenfeld Roman
Moura Guilherme V.
Richard Jean-François

We develop a numerical procedure that facilitates efficient likelihood evaluation in applications involving non-linear and non-Gaussian state-space models. The procedure approximates necessary integrals using continuous approximations of target densities. Construction is achieved via efficient importance sampling, and approximating densities are adapted to fully incorporate current information. We illustrate our procedure in applications to dynamic stochastic general equilibrium models.

Keywords: particle filter, adaption, efficient importance sampling, kernel density approximation, dynamic stochastic general equilibrium model
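As a rough illustration of the idea behind approximating a likelihood integral by importance sampling, the sketch below estimates the integral of p(y | s) p(s) over s for a toy linear Gaussian model. This is plain importance sampling with a fixed, hand-picked Gaussian proposal, not the adapted efficient importance sampling procedure the abstract describes. The model, the observation `y_obs`, the proposal parameters, and all function names are assumptions made for this example only.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_sampling_estimate(integrand, proposal_mean, proposal_std,
                                 n_draws, rng):
    """Estimate the integral of `integrand` using draws from a normal proposal."""
    total = 0.0
    for _ in range(n_draws):
        x = rng.gauss(proposal_mean, proposal_std)
        # Weight each draw by integrand / proposal density.
        total += integrand(x) / normal_pdf(x, proposal_mean, proposal_std)
    return total / n_draws


# Toy model (assumed): state s ~ N(0, 1), observation y = s + e, e ~ N(0, 0.5^2).
y_obs = 1.2


def integrand(s):
    # p(y_obs | s) * p(s)
    return normal_pdf(y_obs, s, 0.5) * normal_pdf(s, 0.0, 1.0)


rng = random.Random(0)
# Proposal centred on the posterior mean (0.96), with a slightly wider spread
# than the posterior so that the importance weights stay bounded.
estimate = importance_sampling_estimate(integrand, 0.96, 0.6, 20000, rng)

# For this toy model the marginal likelihood is known exactly: y ~ N(0, 1.25).
exact = normal_pdf(y_obs, 0.0, math.sqrt(1.25))
print(estimate, exact)
```

The toy model is chosen because its likelihood has a closed form, so the estimate can be checked; in the non-linear, non-Gaussian setting of the abstract no such closed form is available, which is why the choice of proposal density matters.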

Research Papers in Economics

Particle Swarm Optimisation of Spoken Dialogue System Strategies

Author: Daubigney Lucie
Geist Matthieu
Pietquin Olivier
Publication venue: HAL CCSD
Publication date: 25/08/2013

Dialogue management optimisation has long been cast as a planning-under-uncertainty problem. Some methods, such as Reinforcement Learning (RL), are now part of the state of the art. Whatever the solving method, strong assumptions are made about the dialogue system properties. For instance, RL assumes that the dialogue state space is Markovian. Such constraints may involve important engineering work. This paper introduces a more general approach, based on fewer modelling assumptions. A Black Box Optimisation (BBO) method, more precisely Particle Swarm Optimisation (PSO), is used to solve the control problem. In addition, PSO allows taking advantage of the parallel aspect of the problem of optimising a system online with many users calling at the same time. Some preliminary results are presented.
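To make the black-box character of PSO concrete, here is a minimal sketch of a standard global-best particle swarm minimiser. It is a textbook variant, not the authors' implementation: the function name `particle_swarm_minimise`, the coefficient values, the search bounds, and the quadratic test objective are all assumptions for illustration. In the dialogue setting the objective would instead be some measured performance of a parameterised strategy, which this sketch does not model.

```python
import random


def particle_swarm_minimise(objective, dim, n_particles=20, n_iters=100,
                            inertia=0.7, cognitive=1.5, social=1.5,
                            lower=-5.0, upper=5.0, seed=0):
    """Minimise `objective` over `dim` real parameters with global-best PSO."""
    rng = random.Random(seed)
    positions = [[rng.uniform(lower, upper) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best position; the swarm shares one best.
    best_positions = [p[:] for p in positions]
    best_scores = [objective(p) for p in positions]
    g = min(range(n_particles), key=lambda i: best_scores[i])
    global_best, global_score = best_positions[g][:], best_scores[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (best_positions[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                positions[i][d] += velocities[i][d]
            # The objective is only ever evaluated, never differentiated.
            score = objective(positions[i])
            if score < best_scores[i]:
                best_scores[i] = score
                best_positions[i] = positions[i][:]
                if score < global_score:
                    global_score = score
                    global_best = positions[i][:]
    return global_best, global_score


def shifted_sphere(x):
    """Hypothetical test objective with its minimum at (1, 1, ...)."""
    return sum((xi - 1.0) ** 2 for xi in x)


best, score = particle_swarm_minimise(shifted_sphere, dim=2)
print(best, score)
```

Within one iteration, every objective evaluation uses only that particle's own position. In principle this allows several candidates to be evaluated at once, which is the parallel aspect the abstract refers to. The sketch itself evaluates them sequentially.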

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Optimisation de contrôleurs par essaim particulaire (Controller Optimisation by Particle Swarm)

Author: Fix Jérémy
Geist Matthieu
Publication venue: HAL CCSD
Publication date: 23/05/2012

http://cap2012.loria.fr/pub/Papers/10.pdf

Finding optimal controllers for stochastic systems is a particularly difficult problem addressed in the reinforcement learning and optimal control communities. The classical paradigm used to solve these problems is that of Markov decision processes. Nevertheless, the resulting optimisation problem can be difficult to solve. In this paper, we explore the use of particle swarm optimisation to learn optimal controllers. We apply it in particular to three classical problems: the inverted pendulum, the mountain car, and the double pendulum.

HAL-CentraleSupelec

HAL-Rennes 1