Multi-Grid Methods for Reinforcement Learning in Controlled Diffusion Processes

Author: Stephan Pareigis
Publication venue: The MIT Press
Publication date: 01/01/1996

Reinforcement learning methods for discrete and semi-Markov decision problems, such as Real-Time Dynamic Programming, can be generalized to Controlled Diffusion Processes. The optimal control problem reduces to a boundary value problem for a fully nonlinear second-order elliptic differential equation of Hamilton-Jacobi-Bellman (HJB) type. Numerical analysis provides multi-grid methods for this kind of equation. In the case of Learning Control, however, the systems of equations on the various grid levels are obtained using observed information (transitions and local cost). To ensure consistency, special attention needs to be directed toward the type of time and space discretization during the observation. An algorithm for multi-grid observation is proposed. The multi-grid algorithm is demonstrated on a simple queuing problem.
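
For orientation, the boundary value problem mentioned in the abstract can be sketched in a generic textbook form. The notation here is an assumption and is not taken from the paper: state x in a domain Ω, control u in a set U, drift b, diffusion coefficient σ, running cost c, boundary cost g, exit time τ from Ω, and value function V. The paper's own formulation may differ, for example by including a discount term.

```latex
% Assumed controlled diffusion (illustrative notation, not the paper's):
%   dx_t = b(x_t, u_t)\,dt + \sigma(x_t, u_t)\,dW_t
% Assumed value function: expected cost accumulated until exit from \Omega.
\[
V(x) = \inf_{u(\cdot)} \mathbb{E}\left[ \int_0^{\tau} c(x_t, u_t)\,dt + g(x_{\tau}) \,\middle|\, x_0 = x \right]
\]
% Corresponding HJB boundary value problem (fully nonlinear, second-order, elliptic):
\[
\min_{u \in U} \left\{ c(x,u) + b(x,u) \cdot \nabla V(x)
  + \tfrac{1}{2} \operatorname{tr}\!\left( \sigma(x,u)\,\sigma(x,u)^{\top} D^2 V(x) \right) \right\} = 0,
  \qquad x \in \Omega,
\]
\[
V(x) = g(x), \qquad x \in \partial\Omega .
\]
```

The minimization over u makes the equation nonlinear, and the second-derivative term D²V comes from the diffusion. Discretizing such an equation on grids of several resolutions yields the systems of equations on the various grid levels that the abstract refers to.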

CiteSeerX