Average optimality for continuous-time Markov decision processes in Polish spaces
This paper is devoted to studying the average optimality in continuous-time
Markov decision processes with fairly general state and action spaces. The
criterion to be maximized is expected average rewards. The transition rates of
underlying continuous-time jump Markov processes are allowed to be unbounded,
and the reward rates may have neither upper nor lower bounds. We first provide
two optimality inequalities with opposed directions, and also give suitable
conditions under which the existence of solutions to the two optimality
inequalities is ensured. Then, from the two optimality inequalities we prove
the existence of optimal (deterministic) stationary policies by using the
Dynkin formula. Moreover, we present a ``semimartingale characterization'' of
an optimal stationary policy. Finally, we use a generalized Potlach process
with control to illustrate the difference between our conditions and those in
the previous literature, and then further apply our results to average optimal
control problems of generalized birth--death systems, upwardly skip-free
processes and two queueing systems. The approach developed in this paper is
slightly different from the ``optimality inequality approach'' widely used in
the previous literature.
Comment: Published at http://dx.doi.org/10.1214/105051606000000105 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org)
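As a rough orientation for the abstract above, the expected average reward criterion and the associated average-reward optimality equation are commonly written as sketched below. All symbols here are assumed notation and are not taken from the abstract; the paper's own definitions and conditions may differ.

```latex
% Assumed notation (not from the abstract):
%   S        state space,          A(x)       admissible actions in state x
%   r(x,a)   reward rate,          q(dy|x,a)  transition rates
%   g^*      optimal average reward, u        a real-valued function on S
V(x,\pi) = \liminf_{T\to\infty} \frac{1}{T}\,
  \mathbb{E}_x^{\pi}\!\left[\int_0^T r(x_t,a_t)\,dt\right],
\qquad
g^* = \sup_{a\in A(x)} \Bigl\{ r(x,a) + \int_S u(y)\, q(dy\mid x,a) \Bigr\}.
```

Under this reading, the two optimality inequalities mentioned in the abstract would correspond to replacing the equality by "≥" and by "≤", possibly with different functions u; this is an interpretation, and the precise statements are those given in the paper.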
Average Continuous Control of Piecewise Deterministic Markov Processes
This paper deals with the long run average continuous control problem of
piecewise deterministic Markov processes (PDMPs) taking values in a general
Borel space and with compact action space depending on the state variable. The
control variable acts on the jump rate and transition measure of the PDMP, and
the running and boundary costs are assumed to be positive but not necessarily
bounded. Our first main result is to obtain an optimality equation for the long
run average cost in terms of a discrete-time optimality equation related to the
embedded Markov chain given by the post-jump location of the PDMP. Our second
main result guarantees the existence of a feedback measurable selector for the
discrete-time optimality equation by establishing a connection between this
equation and an integro-differential equation. Our final main result is to
obtain some sufficient conditions for the existence of a solution for a
discrete-time optimality inequality and an ordinary optimal feedback control
for the long run average cost using the so-called vanishing discount approach.
Comment: 34 pages
Average optimality for continuous-time Markov decision processes under weak continuity conditions
This article considers the average optimality for a continuous-time Markov
decision process with Borel state and action spaces and an arbitrarily
unbounded nonnegative cost rate. The existence of a deterministic stationary
optimal policy is proved under a different and general set of conditions as
compared to the previous literature; the controlled process can be explosive,
the transition rates can be arbitrarily unbounded and are weakly continuous,
the multifunction defining the admissible action spaces can be neither
compact-valued nor upper semi-continuous, and the cost rate is not necessarily
inf-compact.