Solving Partially Observable Markov Decision Processes by Optimization Neural Networks

Abstract

Partially Observable Markov Decision Processes (POMDPs) model sequential decision problems in which an agent tries to maximize some reward without complete knowledge of the underlying process. These models are of interest for quality control, machine maintenance, reinforcement learning, and related areas. More generally, Monahan [9] has shown that many tasks in partially observable environments can be viewed as POMDPs. A solution to a POMDP prescribes the agent's best behavior with respect to the environment over the entire belief space, which is continuous and contained in an integral polytope (the probability simplex over the hidden states). The approaches proposed so far use linear programming (LP) to solve the optimization problem arising in these processes. On the other hand, Neural Networks (NNs) have shown promise for solving optimization problems; in particular, they have been used to solve quadratic 0-1 programming problems [4, 6]. In this paper, we use optimization neural networks as an alternative way to solve the optimization problem in POMDPs, an approach that admits a parallel hardware implementation.
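The quadratic 0-1 programming connection mentioned above can be illustrated with a minimal sketch of a discrete Hopfield-style optimization network. Everything below (the random weight matrix, bias vector, and asynchronous update schedule) is a hypothetical illustration of the general technique, not the formulation used in this paper.

```python
import numpy as np

# Hypothetical quadratic 0-1 problem: minimize E(x) = -1/2 x^T W x - b^T x
# over x in {0,1}^n, with W symmetric and zero-diagonal (random demo data).
rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))
W = (A + A.T) / 2.0
np.fill_diagonal(W, 0.0)
b = rng.standard_normal(n)

def energy(x):
    """Quadratic 0-1 energy the network descends."""
    return -0.5 * x @ W @ x - b @ x

def hopfield_descend(x, sweeps=50):
    """Asynchronous threshold updates: x_i <- 1 if local field > 0, else 0.
    With W symmetric and zero-diagonal, each flip never increases E."""
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            new = 1.0 if W[i] @ x + b[i] > 0 else 0.0
            if new != x[i]:
                x[i] = new
                changed = True
        if not changed:          # fixed point reached: a local minimum of E
            break
    return x

x0 = rng.integers(0, 2, n).astype(float)  # random initial 0-1 state
e0 = energy(x0)
x_star = hopfield_descend(x0.copy())
```

Because each asynchronous flip can only lower the energy, the dynamics settle at a local minimum of the 0-1 objective; this monotone-descent property, realizable in parallel hardware, is what optimization-NN approaches exploit.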
