Markov decision with unknown transition law : the discounted case
In this paper we consider problems and results in the field of Markov decision processes with an incompletely known transition law. We consider the discounted total return under the Bayes criterion. We discuss easy-to-handle strategies that are optimal under some conditions in the average return case and also for some special models in the discounted total return case. Further, we provide approximation methods to compute the optimal value.
Markov strategies in dynamic programming
We prove that the supremum of the expected total return over the Markov strategies equals the supremum over all strategies. The model assumptions are: the state space is countable, the action space is measurable, and the supremum over the Markov strategies of the expected total of the positive rewards is finite.