Risk Aversion in Finite Markov Decision Processes Using Total Cost Criteria and Average Value at Risk
In this paper we present an algorithm to compute risk averse policies in
Markov Decision Processes (MDP) when the total cost criterion is used together
with the average value at risk (AVaR) metric. Risk averse policies are needed
when large deviations from the expected behavior may have detrimental effects,
and conventional MDP algorithms usually ignore this aspect. We provide
conditions for the structure of the underlying MDP ensuring that approximations
for the exact problem can be derived and solved efficiently. Our findings are
novel inasmuch as average value at risk has not previously been considered in
association with the total cost criterion. Our method is demonstrated in a
rapid deployment scenario, whereby a robot is tasked with the objective of
reaching a target location within a temporal deadline where increased speed is
associated with increased probability of failure. We demonstrate that the
proposed algorithm not only produces a risk averse policy reducing the
probability of exceeding the expected temporal deadline, but also provides the
statistical distribution of costs, thus offering a valuable analysis tool.
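As a rough illustration of why the mean and the AVaR of a total cost can rank policies differently, the sketch below estimates both from sampled completion times. It is not the algorithm of the paper: the toy robot model, the function names `empirical_avar` and `simulate_total_time`, all numeric parameters, and the convention that `alpha` denotes the fraction of worst outcomes are assumptions made for this example only.

```python
import math
import random


def empirical_avar(costs, alpha):
    """Average of the worst alpha-fraction of sampled costs (larger = worse).

    Assumed convention: alpha in (0, 1] is the tail fraction.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(costs, reverse=True)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def simulate_total_time(fast, rng, distance=10, fail_prob=0.1, delay=5):
    """Hypothetical robot: the total cost is the time needed to cover `distance`.

    The fast action covers 2 cells per step but incurs a `delay` with
    probability `fail_prob`; the slow action covers 1 cell and never fails.
    """
    position, elapsed = 0, 0
    while position < distance:
        elapsed += 1
        if fast:
            position += 2
            if rng.random() < fail_prob:
                elapsed += delay
        else:
            position += 1
    return elapsed


rng = random.Random(0)
slow = [simulate_total_time(False, rng) for _ in range(20000)]
fast = [simulate_total_time(True, rng) for _ in range(20000)]

for name, costs in (("slow", slow), ("fast", fast)):
    mean = sum(costs) / len(costs)
    print(name, "mean:", mean, "AVaR(0.1):", empirical_avar(costs, 0.1))
```

In this toy model the fast policy has the lower expected completion time, while its AVaR over the worst 10% of runs is higher than that of the slow policy, which is the kind of trade-off a risk-averse criterion is meant to expose.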
On gradual-impulse control of continuous-time Markov decision processes with multiplicative cost
In this paper, we consider the gradual-impulse control problem of
continuous-time Markov decision processes, where the system performance is
measured by the expectation of the exponential utility of the total cost. We
prove, under very general conditions on the system primitives, the existence of
a deterministic stationary optimal policy out of a more general class of
policies. Policies that we consider allow multiple simultaneous impulses,
randomized selection of impulses with random effects, relaxed gradual controls,
and accumulation of jumps. After characterizing the value function using the
optimality equation, we reduce the continuous-time gradual-impulse control
problem to an equivalent simple discrete-time Markov decision process, whose
action space is the union of the sets of gradual and impulsive actions.
On Optimizing the Conditional Value-at-Risk of a Maximum Cost for Risk-Averse Safety Analysis
The popularity of Conditional Value-at-Risk (CVaR), a risk functional from
finance, has been growing in the control systems community due to its intuitive
interpretation and axiomatic foundation. We consider a non-standard optimal
control problem in which the goal is to minimize the CVaR of a maximum random
cost subject to a Borel-space Markov decision process. The objective takes the
form $\mathrm{CVaR}_{\alpha}(\max_{t=0,1,\dots,N} C_t)$, where $\alpha$ is a
risk-aversion parameter representing a fraction of worst cases, $C_t$ is a
stage or terminal cost, and $N \in \mathbb{N}$ is the length of a finite
discrete-time horizon. The objective represents the maximum departure from a
desired operating region averaged over a given fraction $\alpha$ of worst
cases. This problem provides a safety criterion for a stochastic system that is
informed by both the probability and severity of the potential consequences of
the system's trajectory. In contrast, existing safety analysis frameworks apply
stage-wise risk constraints (i.e., $\rho(C_t)$ must be small for all $t$, where
$\rho$ is a risk functional) or assess the probability of constraint violation
without quantifying its possible severity. To the best of our knowledge, the
problem of interest has not been solved. To solve the problem, we propose and
study a family of stochastic dynamic programs on an augmented state space. We
prove that the optimal CVaR of a maximum cost enjoys an equivalent
representation in terms of the solutions to this family of dynamic programs
under appropriate assumptions. We show the existence of an optimal policy that
depends on the dynamics of an augmented state under a measurable selection
condition. Moreover, we demonstrate how our safety analysis framework is useful
for assessing the severity of combined sewer overflows under precipitation
uncertainty.
Comment: A shorter version is under review for IEEE Transactions on Automatic
Control, submitted December 202
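For orientation, a standard variational representation of CVaR suggests why a family of dynamic programs indexed by an extra scalar can appear in problems of this kind. The formula below is the well-known Rockafellar-Uryasev form, written here for an integrable random cost $Y$ and a tail fraction $\alpha \in (0, 1]$. The symbols $Y$, $s$, and $C_t$ are notation chosen for this note, and whether the paper proceeds through exactly this representation is an assumption.

```latex
\[
  \mathrm{CVaR}_{\alpha}(Y)
  \;=\; \min_{s \in \mathbb{R}}
  \Big\{ s + \tfrac{1}{\alpha}\, \mathbb{E}\big[ \max(Y - s,\, 0) \big] \Big\},
  \qquad
  Y \;=\; \max_{t = 0, 1, \dots, N} C_t .
\]
```

For each fixed $s$, the inner expectation depends on the trajectory only through the running maximum of the costs, so tracking that running maximum as an additional state component is one natural way to obtain a dynamic program on an augmented state space.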