Efficient Reinforcement Learning in Factored MDPs

Author: Michael Kearns
Publication date: 01/01/1999

We present a provably efficient and near-optimal algorithm for reinforcement learning in Markov decision processes (MDPs) whose transition model can be factored as a dynamic Bayesian network (DBN). Our algorithm generalizes the recent E³ algorithm of Kearns and Singh, and assumes that we are given both an algorithm for approximate planning and the graphical structure (but not the parameters) of the DBN. Unlike the original E³ algorithm, our new algorithm exploits the DBN structure to achieve a running time that scales polynomially in the number of parameters of the DBN, which may be exponentially smaller than the number of global states.
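The gap between the number of DBN parameters and the number of global states can be illustrated with a small count. The sketch below is not taken from the paper: it assumes binary state variables, one conditional probability table per variable and action, and the hypothetical helper names `flat_parameter_count` and `dbn_parameter_count`.

```python
from typing import Sequence


def flat_parameter_count(num_vars: int, num_actions: int) -> int:
    """Free parameters of an unstructured transition model.

    Assumes binary variables, so there are 2**num_vars global states and
    each (state, action) pair needs a distribution over all next states.
    """
    num_states = 2 ** num_vars
    return num_actions * num_states * (num_states - 1)


def dbn_parameter_count(parents_per_var: Sequence[int], num_actions: int) -> int:
    """Free parameters of a factored (DBN) transition model.

    Assumes binary variables: a variable with k parents has 2**k table
    rows, each holding one free probability.
    """
    return num_actions * sum(2 ** k for k in parents_per_var)


if __name__ == "__main__":
    # Illustrative setting: 20 binary variables, 3 parents each, 2 actions.
    print(dbn_parameter_count([3] * 20, num_actions=2))
    print(flat_parameter_count(20, num_actions=2))
```

With a bounded number of parents per variable, the factored count grows linearly in the number of variables, while the flat count grows exponentially.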

CiteSeerX

Feature Dynamic Bayesian Networks

Author: Marcus Hutter
Publication date: 24/12/2008

Feature Markov Decision Processes (PhiMDPs) are well-suited for learning agents in general environments. Nevertheless, unstructured (Phi)MDPs are limited to relatively simple environments. Structured MDPs like Dynamic Bayesian Networks (DBNs) are used for large-scale real-world problems. In this article I extend PhiMDP to PhiDBN. The primary contribution is to derive a cost criterion that allows the most relevant features to be extracted automatically from the environment, leading to the "best" DBN representation. I discuss all building blocks required for a complete general learning algorithm. Comment: 7 pages

arXiv.org e-Print Archive

CiteSeerX

Crossref

The Australian National University