Institute of Information Theory and Automation AS CR
Abstract
The paper deals with a class of discrete-time stochastic control processes under a discounted optimality criterion with a random discount rate and possibly unbounded costs. The state process $\{x_t\}$ and the discount process $\{\alpha_t\}$ evolve according to the coupled difference equations $x_{t+1} = F(x_t, \alpha_t, a_t, \xi_t)$, $\alpha_{t+1} = G(\alpha_t, \eta_t)$, where the state and discount disturbance processes $\{\xi_t\}$ and $\{\eta_t\}$ are sequences of i.i.d. random variables with densities $\rho_\xi$ and $\rho_\eta$, respectively. The main objective is to introduce approximation algorithms for the optimal cost function that lead to the construction of optimal or nearly optimal policies when the densities $\rho_\xi$ and $\rho_\eta$ are either known or unknown. In the latter case, we combine suitable estimation methods with control procedures to construct an asymptotically discounted optimal policy.
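To make the coupled dynamics concrete, the following is a minimal simulation sketch, not the paper's algorithm. It assumes, as an illustration only, that the discounted cost along one trajectory is accumulated as $\sum_t e^{-S_t} c(x_t, a_t)$ with $S_0 = 0$ and $S_{t+1} = S_t + \alpha_t$; the abstract does not state the exact form of the criterion. The function name `simulate_discounted_cost`, the linear dynamics, the quadratic cost, and the feedback policy are all hypothetical choices.

```python
import math
import random


def simulate_discounted_cost(F, G, cost, policy, x0, alpha0,
                             xi_sampler, eta_sampler, horizon):
    """Simulate x_{t+1} = F(x_t, alpha_t, a_t, xi_t), alpha_{t+1} = G(alpha_t, eta_t)
    and return the truncated discounted cost of one trajectory.

    Assumed criterion (illustrative): sum_t exp(-S_t) * c(x_t, a_t),
    where S_0 = 0 and S_{t+1} = S_t + alpha_t.
    """
    x, alpha = x0, alpha0
    cumulative_rate = 0.0  # S_t, the accumulated random discount rate
    total = 0.0
    for _ in range(horizon):
        a = policy(x, alpha)                      # control chosen from the current pair
        total += math.exp(-cumulative_rate) * cost(x, a)
        cumulative_rate += alpha                  # S_{t+1} = S_t + alpha_t
        x = F(x, alpha, a, xi_sampler())          # state transition
        alpha = G(alpha, eta_sampler())           # discount-rate transition
    return total


# Hypothetical example: scalar linear state, bounded positive discount rate.
rng = random.Random(0)
estimate = simulate_discounted_cost(
    F=lambda x, alpha, a, xi: 0.5 * x + a + xi,
    G=lambda alpha, eta: min(1.0, max(0.05, 0.9 * alpha + eta)),
    cost=lambda x, a: x * x + a * a,
    policy=lambda x, alpha: -0.25 * x,
    x0=1.0,
    alpha0=0.1,
    xi_sampler=lambda: rng.gauss(0.0, 0.1),
    eta_sampler=lambda: rng.uniform(0.0, 0.02),
    horizon=200,
)
print(estimate)
```

Averaging this quantity over many independent trajectories would give a Monte Carlo estimate of the expected cost of a fixed policy. When $\rho_\xi$ and $\rho_\eta$ are unknown, the samplers would have to be replaced by draws from estimated densities, which is the setting the abstract addresses with its estimation and control procedure.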