
    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
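
As a rough illustration of how tree search and random sampling can be combined, the sketch below runs the four usual MCTS phases (selection, expansion, simulation, backpropagation) on a toy Nim game in which players alternately take 1 to 3 stones and whoever takes the last stone wins. The abstract does not name a selection rule; UCB1 is assumed here as a common choice, and the game and every name in the code (`Node`, `ucb1`, `rollout`, `mcts`) are hypothetical and not taken from the survey.

```python
import math
import random


class Node:
    """One game state in the search tree (toy Nim; an assumption of this sketch)."""

    def __init__(self, stones, parent=None, move=None):
        self.stones = stones          # stones left; the player to move takes 1-3
        self.parent = parent
        self.move = move              # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0               # wins for the player who moved INTO this node


def ucb1(child, parent_visits, c=math.sqrt(2)):
    """Average win rate plus an exploration bonus for rarely visited children."""
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def rollout(stones, rng):
    """Play uniformly random moves; True if the player to move at the start wins."""
    player = 0   # 0 = the player to move when the playout starts
    winner = 1   # with no stones left, the previous mover has already won
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        winner = player
        player ^= 1
    return winner == 0


def mcts(root_stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded and non-terminal.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one child for a move not yet tried.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        mover_wins = not rollout(node.stones, rng)
        # 4. Backpropagation: flip the winner's perspective at each level.
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if mover_wins else 0.0
            mover_wins = not mover_wins
            node = node.parent
    best = max(root.children, key=lambda ch: ch.visits)
    return best.move, root
```

Calling `mcts(5)` returns the most visited first move together with the root node, whose children carry the visit and win statistics. Picking the most visited child rather than the one with the highest win rate is a design choice of this sketch.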

    Efficient likelihood evaluation of state-space representations

    We develop a numerical procedure that facilitates efficient likelihood evaluation in applications involving non-linear and non-Gaussian state-space models. The procedure approximates necessary integrals using continuous approximations of target densities. Construction is achieved via efficient importance sampling, and approximating densities are adapted to fully incorporate current information. We illustrate our procedure in applications to dynamic stochastic general equilibrium models.
    Keywords: particle filter, adaption, efficient importance sampling, kernel density approximation, dynamic stochastic general equilibrium model
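
For context on what likelihood evaluation in a non-linear state-space model involves, here is a minimal sketch of a plain bootstrap particle filter, the baseline that the keywords allude to. It is not the efficient importance sampling procedure developed in the paper. The toy model (an AR(1) state observed through a squared measurement), its parameters, and the names `normal_logpdf` and `bootstrap_loglik` are all assumptions made for illustration.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of N(mean, sd^2) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def bootstrap_loglik(ys, n_particles=500, phi=0.9, sigma_x=1.0,
                     sigma_y=0.5, seed=0):
    """Estimate log p(y_1..y_T) for the hypothetical toy model
         x_t = phi * x_{t-1} + sigma_x * eps_t
         y_t = x_t**2 / 20  + sigma_y * eta_t
    with a bootstrap particle filter."""
    rng = random.Random(seed)
    # Draw the initial particles from the stationary distribution of the AR(1) state.
    sd0 = sigma_x / math.sqrt(1.0 - phi * phi)
    particles = [rng.gauss(0.0, sd0) for _ in range(n_particles)]
    loglik = 0.0
    for t, y in enumerate(ys):
        if t > 0:
            # Propagate each particle through the state transition.
            particles = [phi * x + rng.gauss(0.0, sigma_x) for x in particles]
        # Weight each particle by the measurement density of the new observation.
        logw = [normal_logpdf(y, x * x / 20.0, sigma_y) for x in particles]
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]   # rescaled for numerical stability
        # The mean weight estimates p(y_t | y_1..y_{t-1}).
        loglik += m + math.log(sum(w) / n_particles)
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik
```

For example, `bootstrap_loglik([0.1, 0.4, 0.2, 1.0])` returns a Monte Carlo estimate of the log-likelihood of that short series. The estimate is noisy and depends on the seed and the number of particles.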

    Particle Swarm Optimisation of Spoken Dialogue System Strategies

    Dialogue management optimisation has long been cast as a problem of planning under uncertainty. Some methods, such as Reinforcement Learning (RL), are now part of the state of the art. Whatever the solving method, strong assumptions are made about the properties of the dialogue system. For instance, RL assumes that the dialogue state space is Markovian. Such constraints may require substantial engineering work. This paper introduces a more general approach, based on fewer modelling assumptions. A Black Box Optimisation (BBO) method, more precisely Particle Swarm Optimisation (PSO), is used to solve the control problem. In addition, PSO can exploit the parallel nature of optimising a system online while many users call at the same time. Some preliminary results are presented.
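
To make the black-box idea concrete, the following is a minimal sketch of a textbook global-best PSO that only ever calls an objective function and makes no Markov assumption about it. It is not the authors' implementation. The function names (`pso_minimise`, `sphere`), the coefficient values, and the bounds are assumptions; in a dialogue setting the objective might be, for instance, the negative average reward over a batch of calls, which is likewise an assumption.

```python
import random


def pso_minimise(objective, dim, n_particles=20, n_iters=100,
                 lower=-5.0, upper=5.0, inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best particle swarm optimisation over the box [lower, upper]^dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers the best position it has visited so far.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move, then clamp to the search box.
                pos[i][d] = min(upper, max(lower, pos[i][d] + vel[i][d]))
            # Evaluations within one sweep are independent of each other
            # given the current bests, so they could in principle run in parallel.
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)
```

A call such as `pso_minimise(sphere, dim=2)` returns the best parameter vector found and its objective value. Replacing `sphere` with any function that scores a parameter vector leaves the rest of the code unchanged.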

    Optimisation de contrôleurs par essaim particulaire (Particle Swarm Optimisation of Controllers)

    http://cap2012.loria.fr/pub/Papers/10.pdf
    Finding optimal controllers for stochastic systems is a particularly difficult problem, addressed in the reinforcement learning and optimal control communities. The classical paradigm used to solve such problems is that of Markov decision processes. Nevertheless, the resulting optimisation problem can be hard to solve. In this paper, we explore the use of particle swarm optimisation to learn optimal controllers. We apply it in particular to three classic problems: the inverted pendulum, the mountain car, and the double pendulum.