133 research outputs found

    Discounted continuous-time constrained Markov decision processes in Polish spaces

    Full text link
    This paper is devoted to studying constrained continuous-time Markov decision processes (MDPs) in the class of randomized policies depending on state histories. The transition rates may be unbounded, the reward and costs are admitted to be unbounded from above and from below, and the state and action spaces are Polish spaces. The optimality criterion to be maximized is the expected discounted rewards, and the constraints can be imposed on the expected discounted costs. First, we give conditions for the nonexplosion of underlying processes and the finiteness of the expected discounted rewards/costs. Second, using a technique of occupation measures, we prove that the constrained optimality of continuous-time MDPs can be transformed to an equivalent (optimality) problem over a class of probability measures. Based on the equivalent problem and a so-called wˉ\bar{w}-weak convergence of probability measures developed in this paper, we show the existence of a constrained optimal policy. Third, by providing a linear programming formulation of the equivalent problem, we show the solvability of constrained optimal policies. Finally, we use two computable examples to illustrate our main results.Comment: Published in at http://dx.doi.org/10.1214/10-AAP749 the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Average optimality for continuous-time Markov decision processes in polish spaces

    Full text link
    This paper is devoted to studying the average optimality in continuous-time Markov decision processes with fairly general state and action spaces. The criterion to be maximized is expected average rewards. The transition rates of underlying continuous-time jump Markov processes are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. We first provide two optimality inequalities with opposed directions, and also give suitable conditions under which the existence of solutions to the two optimality inequalities is ensured. Then, from the two optimality inequalities we prove the existence of optimal (deterministic) stationary policies by using the Dynkin formula. Moreover, we present a ``semimartingale characterization'' of an optimal stationary policy. Finally, we use a generalized Potlach process with control to illustrate the difference between our conditions and those in the previous literature, and then further apply our results to average optimal control problems of generalized birth--death systems, upwardly skip-free processes and two queueing systems. The approach developed in this paper is slightly different from the ``optimality inequality approach'' widely used in the previous literature.Comment: Published at http://dx.doi.org/10.1214/105051606000000105 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Maximal reliability of controlled Markov systems

    Full text link
    This paper concentrates on the reliability of a discrete-time controlled Markov system with finite states and actions, and aims to give an efficient algorithm for obtaining an optimal (control) policy that makes the system have the maximal reliability for every initial state. After establishing the existence of an optimal policy, for the computation of optimal policies, we introduce the concept of an absorbing set of a stationary policy, and find some characterization and a computational method of the absorbing sets. Using the largest absorbing set, we build a novel optimality equation (OE), and prove the uniqueness of a solution of the OE. Furthermore, we provide a policy iteration algorithm of optimal policies, and prove that an optimal policy and the maximal reliability can be obtained in a finite number of iterations. Finally, an example in reliability and maintenance problems is given to illustrate our results
    corecore