51 research outputs found

    Average optimality for continuous-time Markov decision processes in polish spaces

    Full text link
    This paper is devoted to studying the average optimality in continuous-time Markov decision processes with fairly general state and action spaces. The criterion to be maximized is expected average rewards. The transition rates of underlying continuous-time jump Markov processes are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. We first provide two optimality inequalities with opposed directions, and also give suitable conditions under which the existence of solutions to the two optimality inequalities is ensured. Then, from the two optimality inequalities we prove the existence of optimal (deterministic) stationary policies by using the Dynkin formula. Moreover, we present a ``semimartingale characterization'' of an optimal stationary policy. Finally, we use a generalized Potlach process with control to illustrate the difference between our conditions and those in the previous literature, and then further apply our results to average optimal control problems of generalized birth--death systems, upwardly skip-free processes and two queueing systems. The approach developed in this paper is slightly different from the ``optimality inequality approach'' widely used in the previous literature.Comment: Published at http://dx.doi.org/10.1214/105051606000000105 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Discrete-time controlled markov processes with average cost criterion: a survey

    Get PDF
    This work is a survey of the average cost control problem for discrete-time Markov processes. The authors have attempted to put together a comprehensive account of the considerable research on this problem over the past three decades. The exposition ranges from finite to Borel state and action spaces and includes a variety of methodologies to find and characterize optimal policies. The authors have included a brief historical perspective of the research efforts in this area and have compiled a substantial yet not exhaustive bibliography. The authors have also identified several important questions that are still open to investigation

    Optimality of mixed policies for average continuous-time Markov decision processes with constraints

    Get PDF
    This article concerns the average criteria for continuous-time Markov decision processes with N constraints. We show the following; (a) every extreme point of the space of performance vectors corresponding to the set of stable measures is generated by a deterministic stationary policy; and (b) there exists a mixed optimal policy, where the mixture is over no more than N + 1 deterministic stationary policies

    A discounted model for a repairable system with continuous state space

    Get PDF
    We examine repairable systems with a continous state space and partial repair options, carried out at fixed times n=1,2,...n=1,2,.... Every time interval [n,n+1)[n,n+1) there is a manufacturing cost and a repair cost. These cost functions are not restricted to the class of bounded functions in this study. Conditions are found under which a control-limit replacement policy minimizes the discounted cost. Hence these conditions guarantee that there is an optimal policy under the discounted cost criterion which does not use partial repairs. We explicitly explain how to derive this optimal policy

    Continuous-Time Markov Decision Processes with Exponential Utility

    Get PDF
    In this paper, we consider a continuous-time Markov decision process (CTMDP) in Borel spaces, where the certainty equivalent with respect to the exponential utility of the total undiscounted cost is to be minimized. The cost rate is nonnegative. We establish the optimality equation. Under the compactness-continuity condition, we show the existence of a deterministic stationary optimal policy. We reduce the risk-sensitive CTMDP problem to an equivalent risk-sensitive discrete-time Markov decision process, which is with the same state and action spaces as the original CTMDP. In particular, the value iteration algorithm for the CTMDP problem follows from this reduction. We essentially do not need to impose a condition on the growth of the transition and cost rate in the state, and the controlled process could be explosive
    • …
    corecore