The Complexity of POMDPs with Long-run Average Objectives

Chatterjee, Krishnendu; Saona, Raimundo; Ziliotto, Bruno

Authors: Krishnendu Chatterjee, Raimundo Saona, Bruno Ziliotto
Publication date: 30 April 2019

Abstract

We study the problem of approximating optimal values in partially observable Markov decision processes (POMDPs) with long-run average objectives. POMDPs are a standard model for dynamic systems with probabilistic and nondeterministic behavior in uncertain environments. In long-run average objectives, rewards are associated with every transition of the POMDP, and the payoff is the long-run average of the rewards along the executions of the POMDP. We establish strategy complexity and computational complexity results. Our main result shows that finite-memory strategies suffice for approximating optimal values, and that the related decision problem is recursively enumerable complete.
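
To make the two notions in the abstract concrete, the following is a minimal sketch of a finite-memory strategy and of a finite-horizon estimate of its long-run average reward. It is not taken from the paper: the toy POMDP (`STEP`, `OBS_ACCURACY`), the strategy tables (`ACTION`, `UPDATE`), and all function names are hypothetical and chosen only for illustration. The simulated average over finitely many steps only approximates the long-run average.

```python
import random

# Hypothetical toy POMDP, invented for illustration (not from the paper).
# STEP[state][action] lists (probability, next_state, reward) triples,
# so a reward is attached to every transition.
STEP = {
    "s0": {"stay": [(0.9, "s0", 1.0), (0.1, "s1", 0.0)],
           "reset": [(1.0, "s0", 0.0)]},
    "s1": {"stay": [(1.0, "s1", 0.0)],
           "reset": [(1.0, "s0", 0.0)]},
}
# Assumed probability that the observation matches the hidden state.
OBS_ACCURACY = 0.8

# Finite-memory strategy: two memory values, one action per memory value,
# and a memory update that depends only on the latest observation.
ACTION = {"m_good": "stay", "m_bad": "reset"}
UPDATE = {"o0": "m_good", "o1": "m_bad"}


def sample(rng, outcomes):
    """Draw (next_state, reward) from (probability, next_state, reward) triples."""
    x = rng.random()
    acc = 0.0
    for p, nxt, r in outcomes:
        acc += p
        if x < acc:
            return nxt, r
    # Guard against floating-point round-off in the cumulative sum.
    return outcomes[-1][1], outcomes[-1][2]


def observe(rng, state):
    """Emit a noisy observation of the hidden state."""
    correct = "o0" if state == "s0" else "o1"
    wrong = "o1" if state == "s0" else "o0"
    return correct if rng.random() < OBS_ACCURACY else wrong


def average_reward(n_steps, seed=0):
    """Average reward over n_steps of one simulated execution."""
    rng = random.Random(seed)
    state, memory, total = "s0", "m_good", 0.0
    for _ in range(n_steps):
        action = ACTION[memory]            # the strategy sees only its memory
        state, reward = sample(rng, STEP[state][action])
        total += reward
        memory = UPDATE[observe(rng, state)]
    return total / n_steps


if __name__ == "__main__":
    print(average_reward(100_000))
```

The strategy never reads `state` directly; it acts on its memory value, which is updated from observations alone. That is what distinguishes a finite-memory strategy for a POMDP from a state-based policy for a fully observable MDP.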

Available Versions

IST Austria: PubRep (Institute of Science and Technology)

oai:pub.research-explorer.app....

Last time updated on 15/04/2021

arXiv.org e-Print Archive

oai:arXiv.org:1904.13360

Last time updated on 02/06/2019