Average optimality for continuous-time Markov decision processes in Polish spaces
This paper is devoted to studying the average optimality in continuous-time
Markov decision processes with fairly general state and action spaces. The
criterion to be maximized is expected average rewards. The transition rates of
underlying continuous-time jump Markov processes are allowed to be unbounded,
and the reward rates may have neither upper nor lower bounds. We first provide
two optimality inequalities with opposed directions, and also give suitable
conditions under which the existence of solutions to the two optimality
inequalities is ensured. Then, from the two optimality inequalities we prove
the existence of optimal (deterministic) stationary policies by using the
Dynkin formula. Moreover, we present a ``semimartingale characterization'' of
an optimal stationary policy. Finally, we use a generalized potlatch process
with control to illustrate the difference between our conditions and those in
the previous literature, and then further apply our results to average optimal
control problems of generalized birth--death systems, upwardly skip-free
processes and two queueing systems. The approach developed in this paper is
slightly different from the ``optimality inequality approach'' widely used in
the previous literature. Comment: Published at http://dx.doi.org/10.1214/105051606000000105 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org).
Discrete-time controlled Markov processes with average cost criterion: a survey
This work is a survey of the average cost control problem for discrete-time Markov processes. The authors have attempted to put together a comprehensive account of the considerable research on this problem over the past three decades. The exposition ranges from finite to Borel state and action spaces and includes a variety of methodologies to find and characterize optimal policies. The authors have included a brief historical perspective of the research efforts in this area and have compiled a substantial yet not exhaustive bibliography. The authors have also identified several important questions that are still open to investigation.
Optimality of mixed policies for average continuous-time Markov decision processes with constraints
This article concerns the average criteria for continuous-time Markov decision processes with N constraints. We show the following: (a) every extreme point of the space of performance vectors corresponding to the set of stable measures is generated by a deterministic stationary policy; and (b) there exists a mixed optimal policy, where the mixture is over no more than N + 1 deterministic stationary policies.
A discounted model for a repairable system with continuous state space
We examine repairable systems with a continuous state space and partial repair options, carried out at fixed times. In every time interval there is a manufacturing cost and a repair cost. These cost functions are not restricted to the class of bounded functions in this study. Conditions are found under which a control-limit replacement policy minimizes the discounted cost. Hence these conditions guarantee that there is an optimal policy under the discounted cost criterion which does not use partial repairs. We explicitly explain how to derive this optimal policy.
Continuous-Time Markov Decision Processes with Exponential Utility
In this paper, we consider a continuous-time Markov decision process (CTMDP) in Borel spaces, where the certainty equivalent with respect to the exponential utility of the total undiscounted cost is to be minimized. The cost rate is nonnegative. We establish the optimality equation. Under the compactness-continuity condition, we show the existence of a deterministic stationary optimal policy. We reduce the risk-sensitive CTMDP problem to an equivalent risk-sensitive discrete-time Markov decision process, which has the same state and action spaces as the original CTMDP. In particular, the value iteration algorithm for the CTMDP problem follows from this reduction. We essentially do not need to impose a condition on the growth of the transition and cost rates in the state, and the controlled process could be explosive.
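To make the value iteration mentioned in the abstract above concrete, here is a minimal sketch for a finite, discrete-time, risk-sensitive model with exponential utility of the total cost. It is not the paper's reduction: the finite state space, the risk coefficient fixed at 1, the cost-free absorbing state, the function name `risk_sensitive_value_iteration`, and the toy numbers are all assumptions made for illustration. The sketch iterates the multiplicative recursion V(x) = min over a of exp(c(x, a)) * sum over y of p(y | x, a) V(y), with V = 1 on the absorbing state.

```python
import math


def risk_sensitive_value_iteration(costs, trans, absorbing, tol=1e-12, max_iter=10_000):
    """Value iteration for a toy finite risk-sensitive MDP (hypothetical helper).

    costs[x][a]  : nonnegative one-step cost of action a in state x
    trans[x][a]  : list of transition probabilities over all states
    absorbing    : set of cost-free absorbing states, where V is fixed at 1
    Returns V, where V[x] approximates the minimal expected value of
    exp(total cost) when starting from state x.
    """
    n = len(costs)
    v = [1.0] * n  # start from exp(0) = 1 in every state
    for _ in range(max_iter):
        new_v = []
        for x in range(n):
            if x in absorbing:
                new_v.append(1.0)
                continue
            # Multiplicative Bellman update: exp(cost) times expected continuation value.
            best = min(
                math.exp(costs[x][a]) * sum(p * v[y] for y, p in enumerate(trans[x][a]))
                for a in range(len(costs[x]))
            )
            new_v.append(best)
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Toy model (assumed numbers): state 0 is controlled, state 1 is absorbing and cost-free.
# Action 0 in state 0: cost 1.0, moves to state 1 with probability 1.
# Action 1 in state 0: cost 0.5, moves to state 1 with probability 0.5, else stays.
costs = [[1.0, 0.5], [0.0]]
trans = [[[0.0, 1.0], [0.5, 0.5]], [[0.0, 1.0]]]
values = risk_sensitive_value_iteration(costs, trans, absorbing={1})

# The certainty equivalent of the total cost is the logarithm of the value.
certainty_equivalent = math.log(values[0])
```

In this toy model the cheaper-per-step action 1 is not preferred: repeating it risks accumulating cost, which the exponential utility penalizes, so the iteration settles on action 0 in state 0.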