Markov Decision Processes with Risk-Sensitive Criteria: An Overview
The paper provides an overview of the theory and applications of
risk-sensitive Markov decision processes. The term 'risk-sensitive' refers here
to the use of the Optimized Certainty Equivalent as a means to measure
expectation and risk. This comprises the well-known entropic risk measure and
Conditional Value-at-Risk. We restrict our considerations to stationary
problems with an infinite time horizon. Conditions are given under which
optimal policies exist and solution procedures are explained. We present both
the theory when the Optimized Certainty Equivalent is applied recursively as
well as the case where it is applied to the cumulated reward. Discounted as
well as non-discounted models are reviewed.
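To make the two special cases named in this abstract concrete, here is a minimal sketch of the Optimized Certainty Equivalent on an empirical sample. It assumes a reward convention (larger outcomes are better) and the standard form OCE_u(X) = sup_eta { eta + E[u(X - eta)] }; the overview itself may use a different sign convention. The function names `cvar_lower_tail` and `entropic_certainty_equivalent` are hypothetical and chosen only for illustration.

```python
import math


def cvar_lower_tail(rewards, alpha):
    """Empirical OCE with utility u(t) = min(t, 0) / alpha, 0 < alpha <= 1.

    Under the reward convention assumed here, this equals the average of
    the worst alpha-fraction of outcomes (Conditional Value-at-Risk).
    """
    n = len(rewards)

    def objective(eta):
        # eta + E[u(X - eta)] = eta - E[(eta - X)_+] / alpha
        shortfall = sum(max(eta - x, 0.0) for x in rewards) / n
        return eta - shortfall / alpha

    # The objective is concave and piecewise linear in eta with kinks at
    # the sample points, so the supremum is attained at a sample point.
    return max(objective(eta) for eta in rewards)


def entropic_certainty_equivalent(rewards, gamma):
    """Empirical OCE with utility u(t) = (1 - exp(-gamma * t)) / gamma, gamma > 0.

    For this utility the supremum has the closed form
    -(1 / gamma) * log E[exp(-gamma * X)].
    """
    n = len(rewards)
    mean_exp = sum(math.exp(-gamma * x) for x in rewards) / n
    return -math.log(mean_exp) / gamma
```

For example, `cvar_lower_tail([1.0, 2.0, 3.0, 4.0], 0.5)` averages the two worst outcomes, and the entropic value of a non-constant sample lies below its mean for any `gamma > 0`, which is the sense in which both criteria penalize risk.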
Model and Reinforcement Learning for Markov Games with Risk Preferences
We motivate and propose a new model for non-cooperative Markov games that
considers the interactions of risk-aware players. This model characterizes the
time-consistent dynamic "risk" from both stochastic state transitions (inherent
to the game) and randomized mixed strategies (due to all other players). An
appropriate risk-aware equilibrium concept is proposed and the existence of
such equilibria is demonstrated in stationary strategies by an application of
Kakutani's fixed point theorem. We further propose a simulation-based
Q-learning type algorithm for risk-aware equilibrium computation. This
algorithm works with a special form of minimax risk measures which can
naturally be written as saddle-point stochastic optimization problems, and
covers many widely investigated risk measures. Finally, the almost sure
convergence of this simulation-based algorithm to an equilibrium is
demonstrated under some mild conditions. Our numerical experiments on a two
player queuing game validate the properties of our model and algorithm, and
demonstrate their worth and applicability in real life competitive
decision-making.
Comment: 38 pages, 6 tables, 5 figures
Minimizing Spectral Risk Measures Applied to Markov Decision Processes
We study the minimization of a spectral risk measure of the total discounted
cost generated by a Markov Decision Process (MDP) over a finite or infinite
planning horizon. The MDP is assumed to have Borel state and action spaces and
the cost function may be unbounded above. The optimization problem is split
into two minimization problems using an infimum representation for spectral
risk measures. We show that the inner minimization problem can be solved as an
ordinary MDP on an extended state space and give sufficient conditions under
which an optimal policy exists. Regarding the infinite dimensional outer
minimization problem, we prove the existence of a solution and derive an
algorithm for its numerical approximation. Our results include the findings in
Bäuerle and Ott (2011) in the special case that the risk measure is Expected
Shortfall. As an application, we present a dynamic extension of the classical
static optimal reinsurance problem, where an insurance company minimizes its
cost of capital
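As a rough illustration of what a spectral risk measure computes, and of why Expected Shortfall is a special case, the sketch below evaluates one on an empirical cost sample. It uses the quantile-weighting definition (the integral of spectrum times quantile function), not the infimum representation or the extended-state-space algorithm described in the abstract. It assumes a cost convention (larger is worse) and a spectrum that is nonnegative, increasing, and integrates to 1; the names `spectral_risk` and `expected_shortfall_cdf` are hypothetical.

```python
def spectral_risk(costs, spectrum_cdf):
    """Empirical spectral risk of a cost sample.

    `spectrum_cdf(u)` is the integral of the spectrum phi over [0, u].
    The i-th smallest cost receives weight
    spectrum_cdf(i / n) - spectrum_cdf((i - 1) / n),
    so larger costs get at least as much weight as smaller ones.
    """
    xs = sorted(costs)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        weight = spectrum_cdf(i / n) - spectrum_cdf((i - 1) / n)
        total += weight * x
    return total


def expected_shortfall_cdf(alpha):
    """Integrated spectrum for Expected Shortfall at level alpha in [0, 1).

    The spectrum is phi(u) = 1 / (1 - alpha) for u >= alpha and 0 otherwise.
    """
    def cdf(u):
        return max(u - alpha, 0.0) / (1.0 - alpha)

    return cdf
```

With `expected_shortfall_cdf(0.5)` the measure averages the worst half of the costs, while the uniform spectrum (`lambda u: u`) recovers the plain sample mean.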