Discounted continuous-time constrained Markov decision processes in Polish spaces
This paper is devoted to studying constrained continuous-time Markov decision
processes (MDPs) in the class of randomized policies depending on state
histories. The transition rates may be unbounded, the reward and cost rates
may be unbounded both from above and from below, and the state and action
spaces are Polish spaces. The optimality criterion to be maximized is the
expected discounted rewards, and the constraints can be imposed on the expected
discounted costs. First, we give conditions for the nonexplosion of underlying
processes and the finiteness of the expected discounted rewards/costs. Second,
using a technique of occupation measures, we prove that the constrained
optimality of continuous-time MDPs can be transformed to an equivalent
(optimality) problem over a class of probability measures. Based on the
equivalent problem and a notion of weak convergence of probability
measures developed in this paper, we show the existence of a constrained
optimal policy. Third, by providing a linear programming formulation of the
equivalent problem, we show the solvability of constrained optimal policies.
Finally, we use two computable examples to illustrate our main results.
Comment: Published in the Annals of Applied Probability
(http://www.imstat.org/aap/) by the Institute of Mathematical Statistics
(http://www.imstat.org) at http://dx.doi.org/10.1214/10-AAP749
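The occupation-measure reduction described in this abstract has a standard finite-state analogue that can be written as a linear program. The sketch below is a minimal illustration of that idea only, not the paper's Polish-space construction; the states, transition matrix, rewards, costs, and budget C are all made-up numbers:

```python
import numpy as np
from scipy.optimize import linprog

# Occupation-measure LP for a small discounted constrained MDP (hypothetical data).
gamma, S, A = 0.9, 2, 2
P = np.array([[[0.8, 0.2], [0.2, 0.8]],
              [[0.5, 0.5], [0.9, 0.1]]])   # P[s, a, s']: transition kernel
r = np.array([[1.0, 2.0], [0.5, 3.0]])     # reward rates
cost = np.array([[0.0, 1.0], [0.0, 2.0]])  # cost rates
mu0 = np.array([1.0, 0.0])                 # initial distribution
C = 5.0                                    # budget on expected discounted cost

# Variables x[s, a] >= 0: the discounted occupation measure, flattened.
# Flow constraints: sum_a x(s',a) - gamma * sum_{s,a} P(s'|s,a) x(s,a) = mu0(s')
A_eq = np.zeros((S, S * A))
for sp in range(S):
    for s in range(S):
        for a in range(A):
            A_eq[sp, s * A + a] = float(s == sp) - gamma * P[s, a, sp]

res = linprog(c=-r.ravel(),                       # maximize expected discounted reward
              A_ub=[cost.ravel()], b_ub=[C],      # expected discounted cost <= C
              A_eq=A_eq, b_eq=mu0, bounds=(0, None))
x = res.x.reshape(S, A)
policy = x / x.sum(axis=1, keepdims=True)         # randomized stationary policy
```

Normalizing each state's row of the optimal occupation measure recovers a randomized policy, mirroring the equivalence between the constrained MDP and the optimization problem over measures.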
Percentile Queries in Multi-Dimensional Markov Decision Processes
Markov decision processes (MDPs) with multi-dimensional weights are useful to
analyze systems with multiple objectives that may be conflicting and require
the analysis of trade-offs. We study the complexity of percentile queries in
such MDPs and give algorithms to synthesize strategies that enforce such
constraints. Given a multi-dimensional weighted MDP, a quantitative payoff
function f, value thresholds v_i (one per dimension), and probability
thresholds alpha_i, we show how to compute a single strategy enforcing that,
for each dimension i, the probability of outcomes whose payoff in dimension i
meets the threshold v_i is at least alpha_i. We consider classical
quantitative payoffs from
the literature (sup, inf, lim sup, lim inf, mean-payoff, truncated sum,
discounted sum). Our work extends to the quantitative case the
multi-objective model checking problem studied by Etessami et al. in
unweighted MDPs.
Comment: Extended version of a CAV 2015 paper
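A percentile constraint of the kind studied here can at least be *checked* for a fixed strategy by Monte Carlo simulation. This sketch is not the paper's synthesis algorithm; the 2-state MDP, the memoryless randomized policy, and the thresholds v and alpha are all hypothetical:

```python
import random

random.seed(0)

# Hypothetical MDP: P[s][a] is the next-state distribution, r[s][a] the reward,
# policy[s] the action distribution of a fixed memoryless randomized strategy.
P = [[[0.8, 0.2], [0.3, 0.7]],
     [[0.5, 0.5], [0.1, 0.9]]]
r = [[1.0, 0.0], [2.0, 0.5]]
policy = [[0.5, 0.5], [1.0, 0.0]]

def discounted_run(s=0, gamma=0.9, horizon=200):
    """Sample one run and return its (truncated) discounted sum of rewards."""
    total, disc = 0.0, 1.0
    for _ in range(horizon):
        a = random.choices([0, 1], weights=policy[s])[0]
        total += disc * r[s][a]
        disc *= gamma
        s = random.choices([0, 1], weights=P[s][a])[0]
    return total

# Estimate P(discounted sum >= v) and compare it to the probability threshold.
v, alpha, n = 5.0, 0.4, 2000
hits = sum(discounted_run() >= v for _ in range(n))
satisfied = hits / n >= alpha
```

For multi-dimensional weights one would track a vector of payoffs per run and require each dimension's empirical probability to clear its own threshold.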
Conditional Value-at-Risk for Reachability and Mean Payoff in Markov Decision Processes
We present the conditional value-at-risk (CVaR) in the context of Markov
chains and Markov decision processes with reachability and mean-payoff
objectives. CVaR quantifies risk by means of the expectation of the worst
p-quantile. As such it can be used to design risk-averse systems. We consider
not only CVaR constraints, but also introduce their conjunction with
expectation constraints and quantile constraints (value-at-risk, VaR). We
derive lower and upper bounds on the computational complexity of the respective
decision problems and characterize the structure of the strategies in terms of
memory and randomization.
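The definition of CVaR as "the expectation of the worst p-quantile" is easy to make concrete on a sample of outcomes. A minimal sketch (the function name and the lowest-is-worst convention for payoffs are choices made here, not notation from the paper):

```python
import math

def cvar(samples, p):
    """CVaR at level p: the mean of the worst p-fraction of outcomes,
    where 'worst' means lowest payoff (a risk-averse view of rewards)."""
    xs = sorted(samples)                 # ascending: worst outcomes first
    k = max(1, math.ceil(p * len(xs)))   # size of the worst p-fraction
    var = xs[k - 1]                      # value-at-risk: the p-quantile
    return var, sum(xs[:k]) / k          # CVaR: mean of the tail at or below VaR

var, cv = cvar(list(range(1, 11)), 0.2)  # worst 20% of {1,...,10} is {1, 2}
```

A CVaR constraint as considered in the paper would then require cv to stay above a given bound, possibly in conjunction with expectation and VaR constraints.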