Monte Carlo Tree Search Guided by Symbolic Advice for MDPs

Busatto-Gaston, Damien; Chakraborty, Debraj; Raskin, Jean-Francois

Publication date: 1 January 2020
Publisher: LIPIcs - Leibniz International Proceedings in Informatics. 31st International Conference on Concurrency Theory (CONCUR 2020)

Abstract

In this paper, we consider the online computation of a strategy that aims at optimizing the expected average reward in a Markov decision process. The strategy is computed with a receding horizon and using Monte Carlo tree search (MCTS). We augment the MCTS algorithm with the notion of symbolic advice, and show that its classical theoretical guarantees are maintained. Symbolic advice is used to bias the selection and simulation strategies of MCTS. We describe how to use QBF and SAT solvers to implement symbolic advice efficiently. We illustrate our new algorithm using the popular game Pac-Man and show that its performance exceeds that of plain MCTS as well as that of human players.
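
To make the idea of advice-biased selection and simulation concrete, here is a minimal sketch in Python. It is not the authors' implementation. The toy MDP, the `advice` predicate, and all function names are assumptions made for illustration. In particular, the advice here is a plain Python predicate on state-action pairs, whereas the paper describes symbolic advice implemented with QBF and SAT solvers. The sketch maximizes the total reward over a fixed horizon, which for a fixed horizon length ranks actions the same way as the average reward.

```python
import math
import random

# Hypothetical toy MDP (not from the paper): states 0..4 on a line.
# An action moves one step and succeeds with probability 0.9.
# Entering state 0 costs -10; entering state 4 yields +1.
ACTIONS = ["left", "right"]

def step(state, action, rng):
    """Sample (next_state, reward) for the toy MDP."""
    move = 1 if action == "right" else -1
    if rng.random() > 0.9:  # slip: move in the opposite direction
        move = -move
    nxt = max(0, min(4, state + move))
    reward = -10.0 if nxt == 0 else (1.0 if nxt == 4 else 0.0)
    return nxt, reward

def advice(state, action):
    """Hypothetical advice: never step left when next to the trap."""
    return not (state == 1 and action == "left")

def allowed(state):
    """Actions permitted by the advice (all actions if none are permitted)."""
    acts = [a for a in ACTIONS if advice(state, a)]
    return acts or ACTIONS

class Node:
    def __init__(self):
        self.visits = 0
        # action -> [visit count, total return, {next_state: Node}]
        self.children = {}

def rollout(state, depth, rng):
    """Simulation phase: random playout restricted to advised actions."""
    total = 0.0
    for _ in range(depth):
        state, r = step(state, rng.choice(allowed(state)), rng)
        total += r
    return total

def simulate(node, state, depth, rng, c=2.0):
    """One MCTS iteration: selection, expansion, simulation, backup."""
    if depth == 0:
        return 0.0
    acts = allowed(state)  # selection only considers advised actions
    untried = [a for a in acts if a not in node.children]
    if untried:
        a = rng.choice(untried)
        node.children[a] = [0, 0.0, {}]
        nxt, r = step(state, a, rng)
        node.children[a][2].setdefault(nxt, Node())
        ret = r + rollout(nxt, depth - 1, rng)
    else:
        # UCB1 score: mean return plus exploration bonus.
        a = max(acts, key=lambda x: node.children[x][1] / node.children[x][0]
                + c * math.sqrt(math.log(node.visits) / node.children[x][0]))
        nxt, r = step(state, a, rng)
        child = node.children[a][2].setdefault(nxt, Node())
        ret = r + simulate(child, nxt, depth - 1, rng, c)
    node.visits += 1
    node.children[a][0] += 1
    node.children[a][1] += ret
    return ret

def choose_action(state, horizon=6, iterations=500, seed=0):
    """Receding-horizon step: search from `state`, return the most visited action."""
    rng = random.Random(seed)
    root = Node()
    for _ in range(iterations):
        simulate(root, state, horizon, rng)
    return max(root.children, key=lambda a: root.children[a][0])
```

In an online setting, `choose_action` would be called again from each new state, so the search horizon recedes as the run progresses. Because the advice filters actions in both `simulate` and `rollout`, a forbidden action such as stepping left from state 1 is never explored in the tree or sampled in playouts.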

Available Versions

DI-fusion

oai:dipot.ulb.ac.be:2013/31487...

Last time updated on 09/12/2020

Dagstuhl Research Online Publication Server

oai:drops-oai.dagstuhl.de:1285...

Last time updated on 21/11/2020