5 research outputs found
Persistently optimal policies in stochastic dynamic programming with generalized discounting
In this paper we study a Markov decision process with a non-linear discount function. Our approach is in the spirit of the von Neumann-Morgenstern concept and is based on the notion of expectation. First, we define a utility on the space of trajectories of the process in the finite and infinite time horizon and then take its expected value. It turns out that the associated optimization problem leads to non-stationary dynamic programming and an infinite system of Bellman equations, which yield persistently optimal policies. Our theory is enriched by examples.
Keywords: Stochastic dynamic programming; Persistently optimal policies; Variable discounting; Bellman equation; Resource extraction; Growth theory
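The trajectory-based construction described above can be sketched in a few lines: aggregate a finite reward sequence backward through a non-linear discount function, then take the expectation over random trajectories. The particular aggregator `delta` and the Monte Carlo wrapper below are illustrative assumptions, not the paper's exact construction.

```python
import math
import random

def delta(t, beta=0.9):
    # Hypothetical non-linear discount function: increasing, delta(0) = 0,
    # and Lipschitz with modulus beta < 1 (here sign(t) * beta * log(1 + |t|)).
    return math.copysign(beta * math.log1p(abs(t)), t)

def trajectory_utility(rewards, disc):
    # Backward aggregation U = r_0 + disc(r_1 + disc(r_2 + ...)).
    # A linear disc(t) = beta * t recovers ordinary geometric discounting.
    u = 0.0
    for r in reversed(rewards):
        u = r + disc(u)
    return u

def expected_utility(sample_trajectory, disc, n=10_000, seed=0):
    # Monte Carlo estimate of the expected utility over random trajectories;
    # sample_trajectory(rng) should return a finite list of rewards.
    rng = random.Random(seed)
    total = sum(trajectory_utility(sample_trajectory(rng), disc) for _ in range(n))
    return total / n
```

For example, `trajectory_utility([1.0, 1.0, 1.0], lambda t: 0.5 * t)` evaluates the three-period utility under the linear half-rate discount.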
Stochastic dynamic programming with non-linear discounting
In this paper, we study a Markov decision process with a non-linear discount function and a Borel state space. We define a recursive discounted utility, which resembles the non-additive utility functions considered in a number of models in economics. Non-additivity here follows from the non-linearity of the discount function. Our study is complementary to the work of Jaśkiewicz, Matkowski and Nowak (Math. Oper. Res. 38 (2013), 108-121), where non-linear discounting is also used in the stochastic setting, but the expectation of utilities aggregated on the space of all histories of the process is applied, leading to a non-stationary dynamic programming model. Our aim is to prove that in the recursive discounted utility case the Bellman equation has a solution and there exists an optimal stationary policy for the problem in the infinite time horizon. Our approach covers two cases: when the one-stage utility is bounded on both sides by a weight function multiplied by some positive and negative constants, and when the one-stage utility is unbounded from below.
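The recursive Bellman equation of this second paper can be illustrated on a toy finite MDP: the operator (TV)(s) = max_a [r(s, a) + delta(sum_y p(y|s, a) V(y))] is a contraction when delta is Lipschitz with modulus below one, so value iteration converges and the greedy rule gives a stationary policy. The two-state MDP and the concrete delta below are made-up illustrations, not from the paper.

```python
import math

def delta(t, beta=0.9):
    # Hypothetical non-linear discount: delta(0) = 0, increasing, Lipschitz
    # with modulus beta < 1, which makes the Bellman operator a contraction.
    return math.copysign(beta * math.log1p(abs(t)), t)

# Toy 2-state, 2-action MDP (assumed data): transition probabilities and rewards.
P = {0: {0: [1.0, 0.0], 1: [0.2, 0.8]},
     1: {0: [0.5, 0.5], 1: [0.0, 1.0]}}
R = {0: {0: 0.0, 1: 1.0},
     1: {0: 2.0, 1: 0.5}}

def bellman(V):
    # (TV)(s) = max over a of  r(s, a) + delta( E[V(next state)] ).
    return [max(R[s][a] + delta(sum(p * v for p, v in zip(P[s][a], V)))
                for a in P[s])
            for s in P]

def solve(tol=1e-10):
    # Value iteration: iterate T until the sup-norm change is below tol.
    V = [0.0, 0.0]
    while True:
        W = bellman(V)
        if max(abs(w - v) for w, v in zip(W, V)) < tol:
            return W
        V = W

def greedy(V):
    # A stationary policy: in each state, pick an action attaining the max.
    return [max(P[s], key=lambda a: R[s][a]
                + delta(sum(p * v for p, v in zip(P[s][a], V))))
            for s in P]
```

Running `V = solve()` and then `greedy(V)` returns a fixed point of the Bellman operator together with a stationary optimal policy for this toy instance.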