
    Discounted continuous-time constrained Markov decision processes in Polish spaces

    This paper is devoted to studying constrained continuous-time Markov decision processes (MDPs) in the class of randomized policies depending on state histories. The transition rates may be unbounded, the reward and cost rates are allowed to be unbounded from above and from below, and the state and action spaces are Polish spaces. The optimality criterion to be maximized is the expected discounted reward, and the constraints are imposed on the expected discounted costs. First, we give conditions for the nonexplosion of the underlying processes and the finiteness of the expected discounted rewards/costs. Second, using a technique of occupation measures, we prove that the constrained optimality problem for continuous-time MDPs can be transformed into an equivalent optimality problem over a class of probability measures. Based on the equivalent problem and a so-called $\bar{w}$-weak convergence of probability measures developed in this paper, we show the existence of a constrained optimal policy. Third, by providing a linear programming formulation of the equivalent problem, we show the solvability of constrained optimal policies. Finally, we use two computable examples to illustrate our main results.
    Comment: Published at http://dx.doi.org/10.1214/10-AAP749 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org).
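
    As a concrete illustration of the occupation-measure reduction, the following is a minimal sketch for a finite, discrete-time analogue of the constrained problem (not the paper's Polish-space, continuous-time setting): the occupation measure y(s, a) becomes a finite vector, the equivalent problem becomes a linear program, and a randomized optimal policy is read off from the optimizer. All model data below (P, r, cst, budget) are hypothetical.

        import numpy as np
        from scipy.optimize import linprog

        S, A = 3, 2                                  # hypothetical small model
        rng = np.random.default_rng(0)
        P = rng.dirichlet(np.ones(S), size=(A, S))   # P[a, s] = next-state distribution
        r = rng.uniform(size=(S, A))                 # reward r(s, a)
        cst = rng.uniform(size=(S, A))               # cost c(s, a)
        gamma = 0.9                                  # discount factor
        mu0 = np.full(S, 1.0 / S)                    # initial distribution
        budget = cst.max() / (1 - gamma)             # trivially feasible bound; shrink it
                                                     # to make the cost constraint bind

        # Occupation-measure variable y(s, a) >= 0, flattened in C order (s major).
        # Balance: sum_a y(s, a) - gamma * sum_{s', a} P[a, s', s] y(s', a) = mu0(s).
        A_eq = np.zeros((S, S * A))
        for s in range(S):
            for s2 in range(S):
                for a in range(A):
                    A_eq[s, s2 * A + a] = float(s2 == s) - gamma * P[a, s2, s]

        # Maximize <r, y> subject to <c, y> <= budget (linprog minimizes, so negate r).
        res = linprog(-r.reshape(-1),
                      A_ub=cst.reshape(1, -1), b_ub=[budget],
                      A_eq=A_eq, b_eq=mu0,
                      bounds=[(0, None)] * (S * A))
        y = res.x.reshape(S, A)
        policy = y / y.sum(axis=1, keepdims=True)    # randomized stationary policy

    In this finite analogue, the feasible occupation measures of the LP are exactly those induced by randomized stationary policies, which is the discrete shadow of the paper's equivalence between constrained optimality and an optimization problem over probability measures.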

    Average optimality for continuous-time Markov decision processes in Polish spaces

    This paper is devoted to studying the average optimality of continuous-time Markov decision processes with fairly general state and action spaces. The criterion to be maximized is the expected average reward. The transition rates of the underlying continuous-time jump Markov processes are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. We first provide two optimality inequalities with opposite directions, and give suitable conditions under which the existence of solutions to the two optimality inequalities is ensured. Then, from the two optimality inequalities we prove the existence of optimal (deterministic) stationary policies by using the Dynkin formula. Moreover, we present a "semimartingale characterization" of an optimal stationary policy. Finally, we use a generalized Potlatch process with control to illustrate the difference between our conditions and those in the previous literature, and then further apply our results to average optimal control problems of generalized birth-death systems, upwardly skip-free processes and two queueing systems. The approach developed in this paper is slightly different from the "optimality inequality approach" widely used in the previous literature.
    Comment: Published at http://dx.doi.org/10.1214/105051606000000105 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org).
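
    Schematically, and with notation assumed here rather than taken from the paper (reward rates r(x, a); transition rates q(dy | x, a) with q(S | x, a) = 0), a pair of opposed average-optimality inequalities has the following shape: one seeks a constant g (the candidate optimal average reward) and functions u, v on the state space S such that

        g \;\ge\; \sup_{a \in A(x)} \Big\{ r(x,a) + \int_S u(y)\, q(dy \mid x, a) \Big\}
        \quad\text{and}\quad
        g \;\le\; \sup_{a \in A(x)} \Big\{ r(x,a) + \int_S v(y)\, q(dy \mid x, a) \Big\}
        \qquad \text{for all } x \in S.

    The first inequality bounds every policy's average reward above by g, while a stationary policy attaining the supremum in the second achieves at least g via the Dynkin formula; together they yield an average optimal stationary policy. This is a sketch of the standard shape of such inequalities, not the paper's exact conditions.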

    Maximal reliability of controlled Markov systems

    This paper concentrates on the reliability of a discrete-time controlled Markov system with finite states and actions, and aims to give an efficient algorithm for obtaining an optimal (control) policy that achieves the maximal reliability for every initial state. After establishing the existence of an optimal policy, we introduce, for the computation of optimal policies, the concept of an absorbing set of a stationary policy, and establish a characterization and a computational method for absorbing sets. Using the largest absorbing set, we build a novel optimality equation (OE) and prove the uniqueness of its solution. Furthermore, we provide a policy iteration algorithm for computing optimal policies, and prove that an optimal policy and the maximal reliability can be obtained in a finite number of iterations. Finally, an example from reliability and maintenance problems is given to illustrate our results.
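
    To fix ideas, here is a minimal, hypothetical finite-model sketch in the spirit of such a policy iteration, where the reliability of a policy is taken to be the probability of remaining forever in a set W of working states. This is not the paper's algorithm: the naive greedy improvement below can stall on ties, and the paper's absorbing-set construction and optimality equation are precisely what make policy iteration provably finite.

        import numpy as np

        S, A = 4, 2                                   # hypothetical small system
        rng = np.random.default_rng(1)
        P = rng.dirichlet(np.ones(S), size=(S, A))    # P[s, a] = next-state distribution
        W = np.array([True, True, True, False])       # working states; state 3 is failure

        def reliability(pi, tol=1e-12, max_iter=100_000):
            """Probability of staying in W forever under stationary policy pi:
            the maximal solution of v = 1_W * (P_pi v), iterated down from v = 1_W."""
            v = W.astype(float)
            for _ in range(max_iter):
                v_new = np.where(W, P[np.arange(S), pi] @ v, 0.0)
                if np.max(np.abs(v_new - v)) < tol:
                    break
                v = v_new
            return v

        pi = np.zeros(S, dtype=int)                   # arbitrary initial stationary policy
        for _ in range(100):                          # bounded greedy improvement loop
            v = reliability(pi)
            q = np.einsum('saj,j->sa', P, v)          # one-step lookahead values Q(s, a)
            pi_new = q.argmax(axis=1)
            if np.array_equal(pi_new, pi):            # no improvement: stop
                break
            pi = pi_new

        print("maximal reliability per initial state:", reliability(pi))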