Discounted continuous-time constrained Markov decision processes in Polish spaces
This paper is devoted to studying constrained continuous-time Markov decision
processes (MDPs) in the class of randomized policies depending on state
histories. The transition rates may be unbounded, the reward and cost rates
may be unbounded both from above and from below, and the state and action
spaces are Polish spaces. The optimality criterion to be maximized is the
expected discounted rewards, and the constraints can be imposed on the expected
discounted costs. First, we give conditions for the nonexplosion of underlying
processes and the finiteness of the expected discounted rewards/costs. Second,
using a technique of occupation measures, we prove that the constrained
optimality of continuous-time MDPs can be transformed to an equivalent
(optimality) problem over a class of probability measures. Based on the
equivalent problem and a notion of weak convergence of probability
measures developed in this paper, we show the existence of a constrained
optimal policy. Third, by providing a linear programming formulation of the
equivalent problem, we show the solvability of constrained optimal policies.
Finally, we use two computable examples to illustrate our main results.
Comment: Published in the Annals of Applied Probability
(http://www.imstat.org/aap/) by the Institute of Mathematical Statistics
(http://www.imstat.org) at http://dx.doi.org/10.1214/10-AAP749
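The occupation-measure reduction described in this abstract has a standard finite-state analogue that can be written as a linear program. The sketch below is a minimal illustration of that idea only, not the paper's Polish-space construction; the states, transition matrix, rewards, costs, and budget C are all made-up numbers:

```python
import numpy as np
from scipy.optimize import linprog

# Occupation-measure LP for a small discounted constrained MDP (hypothetical data).
gamma, S, A = 0.9, 2, 2
P = np.array([[[0.8, 0.2], [0.2, 0.8]],
              [[0.5, 0.5], [0.9, 0.1]]])   # P[s, a, s']: transition kernel
r = np.array([[1.0, 2.0], [0.5, 3.0]])     # reward rates
cost = np.array([[0.0, 1.0], [0.0, 2.0]])  # cost rates
mu0 = np.array([1.0, 0.0])                 # initial distribution
C = 5.0                                    # budget on expected discounted cost

# Variables x[s, a] >= 0: the discounted occupation measure, flattened.
# Flow constraints: sum_a x(s',a) - gamma * sum_{s,a} P(s'|s,a) x(s,a) = mu0(s')
A_eq = np.zeros((S, S * A))
for sp in range(S):
    for s in range(S):
        for a in range(A):
            A_eq[sp, s * A + a] = float(s == sp) - gamma * P[s, a, sp]

res = linprog(c=-r.ravel(),                       # maximize expected discounted reward
              A_ub=[cost.ravel()], b_ub=[C],      # expected discounted cost <= C
              A_eq=A_eq, b_eq=mu0, bounds=(0, None))
x = res.x.reshape(S, A)
policy = x / x.sum(axis=1, keepdims=True)         # randomized stationary policy
```

Normalizing each state's row of the optimal occupation measure recovers a randomized policy, mirroring the equivalence between the constrained MDP and the optimization problem over measures.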
Percentile Queries in Multi-Dimensional Markov Decision Processes
Markov decision processes (MDPs) with multi-dimensional weights are useful to
analyze systems with multiple objectives that may be conflicting and require
the analysis of trade-offs. We study the complexity of percentile queries in
such MDPs and give algorithms to synthesize strategies that enforce such
constraints. Given a multi-dimensional weighted MDP, a quantitative payoff
function f, value thresholds v_i (one per dimension), and probability
thresholds alpha_i, we show how to compute a single strategy enforcing that,
for each dimension i, the probability of outcomes whose payoff in dimension i
meets the threshold v_i is at least alpha_i. We consider classical
quantitative payoffs from
the literature (sup, inf, lim sup, lim inf, mean-payoff, truncated sum,
discounted sum). Our work extends to the quantitative case the
multi-objective model checking problem studied by Etessami et al. in
unweighted MDPs.
Comment: Extended version of a CAV 2015 paper
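A percentile constraint of the kind studied here can at least be *checked* for a fixed strategy by Monte Carlo simulation. This sketch is not the paper's synthesis algorithm; the 2-state MDP, the memoryless randomized policy, and the thresholds v and alpha are all hypothetical:

```python
import random

random.seed(0)

# Hypothetical MDP: P[s][a] is the next-state distribution, r[s][a] the reward,
# policy[s] the action distribution of a fixed memoryless randomized strategy.
P = [[[0.8, 0.2], [0.3, 0.7]],
     [[0.5, 0.5], [0.1, 0.9]]]
r = [[1.0, 0.0], [2.0, 0.5]]
policy = [[0.5, 0.5], [1.0, 0.0]]

def discounted_run(s=0, gamma=0.9, horizon=200):
    """Sample one run and return its (truncated) discounted sum of rewards."""
    total, disc = 0.0, 1.0
    for _ in range(horizon):
        a = random.choices([0, 1], weights=policy[s])[0]
        total += disc * r[s][a]
        disc *= gamma
        s = random.choices([0, 1], weights=P[s][a])[0]
    return total

# Estimate P(discounted sum >= v) and compare it to the probability threshold.
v, alpha, n = 5.0, 0.4, 2000
hits = sum(discounted_run() >= v for _ in range(n))
satisfied = hits / n >= alpha
```

For multi-dimensional weights one would track a vector of payoffs per run and require each dimension's empirical probability to clear its own threshold.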
Conditional Value-at-Risk for Reachability and Mean Payoff in Markov Decision Processes
We present the conditional value-at-risk (CVaR) in the context of Markov
chains and Markov decision processes with reachability and mean-payoff
objectives. CVaR quantifies risk by means of the expectation of the worst
p-quantile. As such it can be used to design risk-averse systems. We consider
not only CVaR constraints, but also introduce their conjunction with
expectation constraints and quantile constraints (value-at-risk, VaR). We
derive lower and upper bounds on the computational complexity of the respective
decision problems and characterize the structure of the strategies in terms of
memory and randomization.
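The definition of CVaR as "the expectation of the worst p-quantile" is easy to make concrete on a sample of outcomes. A minimal sketch (the function name and the lowest-is-worst convention for payoffs are choices made here, not notation from the paper):

```python
import math

def cvar(samples, p):
    """CVaR at level p: the mean of the worst p-fraction of outcomes,
    where 'worst' means lowest payoff (a risk-averse view of rewards)."""
    xs = sorted(samples)                 # ascending: worst outcomes first
    k = max(1, math.ceil(p * len(xs)))   # size of the worst p-fraction
    var = xs[k - 1]                      # value-at-risk: the p-quantile
    return var, sum(xs[:k]) / k          # CVaR: mean of the tail at or below VaR

var, cv = cvar(list(range(1, 11)), 0.2)  # worst 20% of {1,...,10} is {1, 2}
```

A CVaR constraint as considered in the paper would then require cv to stay above a given bound, possibly in conjunction with expectation and VaR constraints.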