From Infinite to Finite Programs: Explicit Error Bounds with Applications to Approximate Dynamic Programming
We consider linear programming (LP) problems in infinite dimensional spaces
that are in general computationally intractable. Under suitable assumptions, we
develop an approximation bridge from the infinite-dimensional LP to tractable
finite convex programs in which the performance of the approximation is
quantified explicitly. To this end, we adopt recent developments in two
areas, randomized optimization and first-order methods, leading to a priori
as well as a posteriori performance guarantees. We illustrate the generality and
implications of our theoretical results in the special case of the long-run
average cost and discounted cost optimal control problems for Markov decision
processes on Borel spaces. The applicability of the theoretical results is
demonstrated through a constrained linear quadratic optimal control problem and
a fisheries management problem.
Comment: 30 pages, 5 figures
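As a finite-dimensional illustration of the LP viewpoint on discounted cost problems, the sketch below checks that the optimal value function of a small MDP satisfies the LP constraints v(x) <= c(x,a) + gamma * sum_y P(y|x,a) v(y). The two-state MDP, its numbers, and all function names are hypothetical and are not taken from the paper; the paper itself treats Borel spaces, which this toy example does not.

```python
# Hypothetical 2-state, 2-action discounted cost MDP (illustrative numbers only).
GAMMA = 0.9
COST = [[1.0, 2.0], [0.5, 3.0]]          # COST[x][a]
P = [                                     # P[x][a][y] = transition probability
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.9, 0.1]],
]
STATES = range(2)
ACTIONS = range(2)


def q_value(v, x, a):
    """One-step cost plus discounted expected continuation value."""
    return COST[x][a] + GAMMA * sum(P[x][a][y] * v[y] for y in STATES)


def value_iteration(tol=1e-12):
    """Iterate the Bellman operator until successive iterates agree within tol."""
    v = [0.0, 0.0]
    while True:
        new = [min(q_value(v, x, a) for a in ACTIONS) for x in STATES]
        if max(abs(n - o) for n, o in zip(new, v)) < tol:
            return new
        v = new


def lp_feasible(v, slack=1e-9):
    """Check the LP constraints v(x) <= q_value(v, x, a) for every pair (x, a)."""
    return all(v[x] <= q_value(v, x, a) + slack for x in STATES for a in ACTIONS)


v_star = value_iteration()
print(v_star, lp_feasible(v_star))
```

In this finite setting the optimal value function is the largest feasible point of the LP, so raising it by any positive constant breaks feasibility.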
Average optimality for continuous-time Markov decision processes under weak continuity conditions
This article considers the average optimality for a continuous-time Markov
decision process with Borel state and action spaces and an arbitrarily
unbounded nonnegative cost rate. The existence of a deterministic stationary
optimal policy is proved under a general set of conditions that differs from
those in the previous literature: the controlled process can be explosive,
the transition rates can be arbitrarily unbounded and are weakly continuous,
the multifunction defining the admissible action spaces can be neither
compact-valued nor upper semi-continuous, and the cost rate is not necessarily
inf-compact.
Discounted continuous-time constrained Markov decision processes in Polish spaces
This paper is devoted to studying constrained continuous-time Markov decision
processes (MDPs) in the class of randomized policies depending on state
histories. The transition rates may be unbounded, the reward and costs
may be unbounded from above and from below, and the state and action
spaces are Polish spaces. The optimality criterion to be maximized is the
expected discounted rewards, and the constraints can be imposed on the expected
discounted costs. First, we give conditions for the nonexplosion of underlying
processes and the finiteness of the expected discounted rewards/costs. Second,
using a technique of occupation measures, we prove that the constrained
optimality of continuous-time MDPs can be transformed into an equivalent
(optimality) problem over a class of probability measures. Based on the
equivalent problem and a so-called $\bar{w}$-weak convergence of probability
measures developed in this paper, we show the existence of a constrained
optimal policy. Third, by providing a linear programming formulation of the
equivalent problem, we show the solvability of constrained optimal policies.
Finally, we use two computable examples to illustrate our main results.
Comment: Published at http://dx.doi.org/10.1214/10-AAP749 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
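The occupation-measure idea mentioned in the abstract can be illustrated in a much simpler setting than the paper's. The sketch below assumes a finite, discrete-time discounted MDP with a fixed stationary randomized policy (the paper works in continuous time on Polish spaces); all numbers and names are hypothetical. It computes the normalized discounted occupation measure and checks that the expected discounted reward equals the reward integrated against that measure, divided by (1 - gamma).

```python
# Hypothetical 2-state, 2-action discrete-time MDP (illustrative numbers only).
GAMMA = 0.9
REWARD = [[1.0, 0.0], [2.0, -1.0]]       # REWARD[x][a]
P = [                                     # P[x][a][y] = transition probability
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.9, 0.1]],
]
POLICY = [[0.5, 0.5], [0.2, 0.8]]        # POLICY[x][a] = prob. of action a in state x
INIT = [1.0, 0.0]                         # initial state distribution
STATES = range(2)
ACTIONS = range(2)


def occupation_measure(horizon=2000):
    """eta[x][a] = (1 - GAMMA) * sum_t GAMMA**t * Prob(x_t = x, a_t = a), truncated."""
    eta = [[0.0, 0.0], [0.0, 0.0]]
    dist = INIT[:]
    weight = 1.0
    for _ in range(horizon):
        for x in STATES:
            for a in ACTIONS:
                eta[x][a] += (1 - GAMMA) * weight * dist[x] * POLICY[x][a]
        # Push the state distribution one step forward under the policy.
        dist = [
            sum(dist[x] * POLICY[x][a] * P[x][a][y] for x in STATES for a in ACTIONS)
            for y in STATES
        ]
        weight *= GAMMA
    return eta


def policy_value(iters=2000):
    """Expected discounted reward of POLICY from INIT via repeated policy evaluation."""
    v = [0.0, 0.0]
    for _ in range(iters):
        v = [
            sum(
                POLICY[x][a]
                * (REWARD[x][a] + GAMMA * sum(P[x][a][y] * v[y] for y in STATES))
                for a in ACTIONS
            )
            for x in STATES
        ]
    return sum(INIT[x] * v[x] for x in STATES)


eta = occupation_measure()
reward_via_measure = sum(eta[x][a] * REWARD[x][a] for x in STATES for a in ACTIONS) / (1 - GAMMA)
print(reward_via_measure, policy_value())
```

Because the objective is linear in the occupation measure, optimizing over policies can be recast as optimizing over a set of probability measures, which is what makes a linear programming formulation possible.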