    Partial-Information State-Based Optimization of Partially Observable Markov Decision Processes and the Separation Principle

    We propose a partial-information-state-based approach to optimizing the long-run average performance of a partially observable Markov decision process (POMDP). In this approach, the information history is summarized, at least partially, by one or a few statistics, not necessarily sufficient, called a partial-information state, and actions depend on this partial-information state rather than on the system state. We first propose the "single-policy-based comparison principle," under which we derive an HJB-type optimality equation and a policy iteration algorithm for the optimal policy in the partial-information-state-based policy space. We then introduce Q-sufficient statistics and show that if the partial-information state is Q-sufficient, then the optimal policy in the partial-information-state-based policy space is also optimal in the space of all feasible information-state-based policies. We further show that, under some additional conditions, the well-known separation principle holds. The results are obtained by applying the direct-comparison-based approach originally developed for discrete event dynamic systems.
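
    To make the idea concrete, the sketch below is a minimal illustration, not the paper's algorithm: a toy two-state POMDP whose information history is summarized by a coarsely discretized belief (a partial-information state that is, by construction, not necessarily sufficient), followed by average-reward policy iteration over policies defined on that grid. All dynamics (T, Z, R) and the grid resolution K are assumptions chosen for illustration.

```python
"""Sketch: partial-information-state-based policy iteration for a toy POMDP.

Minimal illustration under assumed dynamics: policies depend only on a
statistic of the history (a discretized belief), not on the hidden state
or the full information history.
"""
import numpy as np

# --- Assumed toy POMDP: 2 hidden states, 2 actions, 2 observations ---
T = np.array([[[0.9, 0.1], [0.2, 0.8]],   # T[a, s, s'] transition probs
              [[0.5, 0.5], [0.5, 0.5]]])
Z = np.array([[0.8, 0.2], [0.3, 0.7]])    # Z[s', o] observation probs
R = np.array([[1.0, 0.0], [0.2, 0.8]])    # R[s, a] immediate reward

K = 11                                     # belief grid resolution (assumed)
grid = np.linspace(0.0, 1.0, K)            # partial-information states

def snap(p):
    """Map an exact belief to the nearest grid point. This discards
    information, so the statistic is not necessarily sufficient."""
    return int(np.argmin(np.abs(grid - p)))

# Build the finite MDP induced over grid indices.
nA = 2
P = np.zeros((nA, K, K))                   # P[a, x, x']
r = np.zeros((K, nA))                      # r[x, a]
for x, p in enumerate(grid):
    b = np.array([p, 1.0 - p])             # belief over hidden states
    for a in range(nA):
        r[x, a] = b @ R[:, a]
        pred = b @ T[a]                    # predicted next-state distribution
        for o in range(2):
            po = pred @ Z[:, o]            # P(o | b, a)
            if po < 1e-12:
                continue
            b_next = pred * Z[:, o] / po   # Bayes update, then snap to grid
            P[a, x, snap(b_next[0])] += po

def evaluate(pi):
    """Average-reward evaluation: solve h = r_pi - g*1 + P_pi h with
    h[0] = 0 as normalization. Assumes the induced chain is unichain."""
    Ppi = P[pi, np.arange(K)]
    rpi = r[np.arange(K), pi]
    A = np.zeros((K + 1, K + 1))
    A[:K, :K] = np.eye(K) - Ppi
    A[:K, K] = 1.0                         # column for the gain g
    A[K, 0] = 1.0                          # normalization row: h[0] = 0
    sol = np.linalg.solve(A, np.append(rpi, 0.0))
    return sol[:K], sol[K]                 # bias h, gain g

pi = np.zeros(K, dtype=int)
for _ in range(50):                        # policy iteration loop
    h, g = evaluate(pi)
    q = r + np.einsum('axy,y->xa', P, h)   # compare actions via bias values
    pi_new = np.argmax(q, axis=1)
    if np.array_equal(pi_new, pi):
        break
    pi = pi_new
print(f"long-run average reward: {g:.4f}")
print("policy over belief grid:", pi)
```

    The improvement step compares candidate actions through the bias (relative value) function of a single evaluated policy, which is the spirit of single-policy-based comparison; the grid-snapped belief stands in for a general, not-necessarily-sufficient partial-information state.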