Continuous-time Markov decision processes under the risk-sensitive average cost criterion
This paper studies continuous-time Markov decision processes under the
risk-sensitive average cost criterion. The state space is a finite set, the
action space is a Borel space, the cost and transition rates are bounded, and
the risk-sensitivity coefficient can be any positive real number.
Under mild conditions, we develop a new approach to establish the existence
of a solution to the risk-sensitive average cost optimality equation and obtain
the existence of an optimal deterministic stationary policy.
Comment: 14 pages
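For orientation, the criterion named in this abstract is usually written as below. This is a sketch in assumed notation, not taken from the paper itself: \lambda > 0 is the risk-sensitivity coefficient, c(i,a) the cost rate, q(j|i,a) the transition rates, (\xi_t, a_t) the state-action process, and (g, h) a candidate solution with h > 0.

```latex
% Risk-sensitive average cost of a policy \pi from initial state i (assumed notation)
J(i,\pi) = \limsup_{T\to\infty} \frac{1}{\lambda T}
  \ln \mathbb{E}_i^{\pi}\!\left[ \exp\!\left( \lambda \int_0^T c(\xi_t, a_t)\,dt \right) \right]

% One common form of the associated optimality equation on a finite state space S
\lambda\, g\, h(i) = \inf_{a \in A(i)} \Big\{ \lambda\, c(i,a)\, h(i)
  + \sum_{j \in S} q(j \mid i,a)\, h(j) \Big\}, \qquad i \in S
```

Under this form, g plays the role of the optimal risk-sensitive average cost, and a minimizing action in each state would define a deterministic stationary policy.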
Risk Aversion in Finite Markov Decision Processes Using Total Cost Criteria and Average Value at Risk
In this paper we present an algorithm to compute risk averse policies in
Markov Decision Processes (MDP) when the total cost criterion is used together
with the average value at risk (AVaR) metric. Risk averse policies are needed
when large deviations from the expected behavior may have detrimental effects,
and conventional MDP algorithms usually ignore this aspect. We provide
conditions for the structure of the underlying MDP ensuring that approximations
for the exact problem can be derived and solved efficiently. Our findings are
novel inasmuch as average value at risk has not previously been considered in
association with the total cost criterion. Our method is demonstrated in a
rapid deployment scenario, whereby a robot is tasked with the objective of
reaching a target location within a temporal deadline where increased speed is
associated with increased probability of failure. We demonstrate that the
proposed algorithm not only produces a risk averse policy reducing the
probability of exceeding the expected temporal deadline, but also provides the
statistical distribution of costs, thus offering a valuable analysis tool.
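As a rough illustration of the risk metric mentioned in this abstract, the following sketch estimates the average value at risk of a cost from a list of samples, using the Rockafellar-Uryasev representation AVaR_alpha(X) = VaR + E[(X - VaR)^+] / (1 - alpha). It is not the paper's algorithm; the function name `avar` and the sample data are hypothetical.

```python
import math


def avar(costs, alpha):
    """Empirical average value at risk of a cost sample at level alpha.

    Hypothetical helper for illustration only: roughly, the mean of the
    worst (1 - alpha) fraction of the sampled costs.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    xs = sorted(costs)
    n = len(xs)
    if n == 0:
        raise ValueError("costs must be non-empty")
    # Empirical value at risk: smallest sample whose empirical CDF is >= alpha.
    k = max(math.ceil(alpha * n) - 1, 0)
    var = xs[k]
    # Average excess of the costs over the value at risk.
    excess = sum(max(x - var, 0.0) for x in xs) / n
    return var + excess / (1.0 - alpha)


# Example: total costs 1..10, each equally likely (made-up data).
sample_costs = list(range(1, 11))
worst_fifth = avar(sample_costs, 0.8)   # average of the two largest costs
plain_mean = avar(sample_costs, 0.0)    # alpha = 0 reduces to the mean
```

At alpha = 0 the measure coincides with the expected cost, and larger alpha puts more weight on the tail, which is the sense in which a policy minimizing it is risk averse.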
On gradual-impulse control of continuous-time Markov decision processes with multiplicative cost
In this paper, we consider the gradual-impulse control problem of
continuous-time Markov decision processes, where the system performance is
measured by the expectation of the exponential utility of the total cost. We
prove, under very general conditions on the system primitives, the existence of
a deterministic stationary optimal policy out of a more general class of
policies. Policies that we consider allow multiple simultaneous impulses,
randomized selection of impulses with random effects, relaxed gradual controls,
and accumulation of jumps. After characterizing the value function using the
optimality equation, we reduce the continuous-time gradual-impulse control
problem to an equivalent simple discrete-time Markov decision process, whose
action space is the union of the sets of gradual and impulsive actions.