545 research outputs found

Shape-constrained Estimation of Value Functions

Author: Glynn Peter W.
Mousavi Mohammad
Publication date: 01/01/2013

We present a fully nonparametric method to estimate the value function, via simulation, in the context of expected infinite-horizon discounted rewards for Markov chains. Estimating such value functions plays an important role in approximate dynamic programming and applied probability in general. We incorporate "soft information" into the estimation algorithm, such as knowledge of convexity, monotonicity, or Lipschitz constants. In the presence of such information, a nonparametric estimator for the value function can be computed that is provably consistent as the simulated time horizon tends to infinity. As an application, we implement our method on price tolling agreement contracts in energy markets.

arXiv.org e-Print Archive

CiteSeerX

A central limit theorem for temporally non-homogenous Markov chains with applications to dynamic programming

Author: Arlotto Alessandro
Steele J. Michael
Publication venue: 'Institute for Operations Research and the Management Sciences (INFORMS)'
Publication date: 06/12/2015

We prove a central limit theorem for a class of additive processes that arise naturally in the theory of finite horizon Markov decision problems. The main theorem generalizes a classic result of Dobrushin (1956) for temporally non-homogeneous Markov chains, and the principal innovation is that here the summands are permitted to depend on both the current state and a bounded number of future states of the chain. We show through several examples that this added flexibility gives one a direct path to asymptotic normality of the optimal total reward of finite horizon Markov decision problems. The same examples also explain why such results are not easily obtained by alternative Markovian techniques such as enlargement of the state space.

Comment: 27 pages, 1 figure

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Discrete-time controlled Markov processes with average cost criterion: a survey

Author: Arapostathis Aristotle
Borkar Vivek S.
Fernandez-Gaucherand Emmanuel
Ghosh Mrinal K.
Marcus Steven I.
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/03/1993

This work is a survey of the average cost control problem for discrete-time Markov processes. The authors have attempted to put together a comprehensive account of the considerable research on this problem over the past three decades. The exposition ranges from finite to Borel state and action spaces and includes a variety of methodologies to find and characterize optimal policies. The authors have included a brief historical perspective of the research efforts in this area and have compiled a substantial yet not exhaustive bibliography. The authors have also identified several important questions that are still open to investigation.

Multi-Automata Learning

Author: Nowe Ann
Peeters Maarten
Verbeeck Katja
Vrancx Peter
Publication venue: 'IntechOpen'
Publication date: 01/01/2008

IntechOpen

An approximation approach for the deviation matrix of continuous-time Markov processes with application to Markov decision theory

Author: Arie Hordijk
Asmussen S.
Bernd Heidergott
Bertsekas D.
Coolen-Schrijner P.
Guo X.
Heidergott B.
Kijima M.
Neuts M.
Nicole Leder
Olsson M.
Riska A.
Ross S.
Tijms H.
Publication date: 01/01/2010

We present an update formula that allows the expression of the deviation matrix of a continuous-time Markov process with denumerable state space having generator matrix Q* through a continuous-time Markov process with generator matrix Q. We show that under suitable stability conditions the algorithm converges at a geometric rate. By applying the concept to three different examples, namely, the M/M/1 queue with vacations, the M/G/1 queue, and a tandem network, we illustrate the broad applicability of our approach. For a problem in admission control, we apply our approximation algorithm to Markov decision theory for computing the optimal control policy. Numerical examples are presented to highlight the efficiency of the proposed algorithm. © 2010 INFORMS

Crossref

VU Research Portal