Improved successive approximation methods for discounted Markov decision processes

Abstract

Successive Approximation (S.A.) methods, for solving discounted Markov decision problems, have been developed to avoid the extensive computations that are connected with linear programming and policy iteration techniques for solving large scaled problems. Several authors give such an S.A. algorithm. In this paper we introduce some new algorithms while furthermore it will be shown how the several S.A. algorithms may be combined. For each algorithm converging sequences of upper and lower bounds for the optimal value will be given

    Similar works