
    A central limit theorem for temporally non-homogeneous Markov chains with applications to dynamic programming

    We prove a central limit theorem for a class of additive processes that arise naturally in the theory of finite-horizon Markov decision problems. The main theorem generalizes a classic result of Dobrushin (1956) for temporally non-homogeneous Markov chains, and the principal innovation is that here the summands are permitted to depend on both the current state and a bounded number of future states of the chain. We show through several examples that this added flexibility gives one a direct path to asymptotic normality of the optimal total reward of finite-horizon Markov decision problems. The same examples also explain why such results are not easily obtained by alternative Markovian techniques such as enlargement of the state space. (27 pages, 1 figure.)
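
    As a rough illustration of the flexibility described above, the following Python sketch (all chain data hypothetical, not taken from the paper) simulates a temporally non-homogeneous two-state chain, accumulates a summand f_n(X_n, X_{n+1}) that depends on both the current and the next state, and checks the standardized sums for approximate normality.

```python
import numpy as np

rng = np.random.default_rng(0)

def transition_matrix(n, N):
    # Hypothetical time-varying kernel: mixing strength drifts with n/N.
    p = 0.3 + 0.4 * (n / N)
    return np.array([[p, 1 - p], [1 - p, p]])

def simulate_sum(N):
    # Additive functional whose summands depend on the current state
    # AND the next state, as the theorem permits.
    x = rng.integers(0, 2)
    total = 0.0
    for n in range(N):
        P = transition_matrix(n, N)
        x_next = rng.choice(2, p=P[x])
        total += (x == x_next) - 0.5   # f_n(X_n, X_{n+1})
        x = x_next
    return total

N, reps = 2000, 5000
sums = np.array([simulate_sum(N) for _ in range(reps)])
z = (sums - sums.mean()) / sums.std()
# Rough normality check: skewness and excess kurtosis near 0.
print("skew:", (z**3).mean(), "excess kurtosis:", (z**4).mean() - 3)
```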

    Nonzero-sum Stochastic Games

    This paper deals with stochastic games, focusing on nonzero-sum games, and provides a detailed survey of selected recent results. In Section 1, we consider stochastic Markov games. A correlation of the players' strategies, involving ``public signals'', is described, and a correlated equilibrium theorem recently proved by Nowak and Raghavan for discounted stochastic games with general state space is presented. We also report an extension of this result to a class of undiscounted stochastic games satisfying a uniform ergodicity condition. Stopping games are related to stochastic Markov games. In Section 2, we describe a version of Dynkin's game related to observation of a Markov process with a random mechanism for assigning states to the players. Some recent contributions of the second author in this area are reported. The paper also contains a brief overview of the theory of nonzero-sum stochastic games and stopping games, a theory that is still very far from complete.
    Keywords: average payoff stochastic games, correlated stationary equilibria, nonzero-sum games, stopping time, stopping games
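
    To make the stopping-game structure concrete, here is a minimal Python sketch of a finite-horizon zero-sum Dynkin stopping game solved by backward induction; the nonzero-sum and random-assignment versions surveyed above are substantially harder, and all model data below are hypothetical.

```python
import numpy as np

# Zero-sum Dynkin stopping game on a finite chain, horizon T.
# If player 1 (maximizer) stops first the payoff is a(x); if player 2
# (minimizer) stops first it is b(x), with a <= b. Classic backward
# induction for the value:
#   V_t(x) = min(b(x), max(a(x), E[V_{t+1}(X_{t+1}) | X_t = x]))

P = np.array([[0.7, 0.3],
              [0.4, 0.6]])          # transition matrix (hypothetical)
a = np.array([1.0, 0.0])            # payoff if player 1 stops
b = np.array([2.0, 1.5])            # payoff if player 2 stops
T = 50

V = a.copy()                        # terminal value: forced stop at T
for _ in range(T):
    cont = P @ V                    # continuation value E[V(X_{t+1}) | x]
    V = np.minimum(b, np.maximum(a, cont))

print("game value at time 0:", V)
```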

    Discrete-Time Control with Non-Constant Discount Factor

    This paper deals with discrete-time Markov decision processes (MDPs) with Borel state and action spaces, and the total expected discounted cost optimality criterion. We assume that the discount factor is not constant: it may depend on the state and action; moreover, it can even take the extreme values zero or one. We propose sufficient conditions on the data of the model ensuring the existence of optimal control policies and allowing the characterization of the optimal value function as a solution to the dynamic programming equation. As a particular case of these MDPs with varying discount factor, we study MDPs with stopping, as well as the corresponding optimal stopping times and contact set. We show applications to switching MDP models and, in particular, we study a pollution accumulation problem.
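
    A minimal Python sketch of the dynamic programming equation with a state-action-dependent discount factor, on a hypothetical finite model (the paper works with Borel spaces and allows the factor to reach zero or one; here it is kept strictly below one so that plain value iteration contracts):

```python
import numpy as np

# Value iteration for V(x) = min_a [ c(x,a) + alpha(x,a) * sum_y P(y|x,a) V(y) ]
# with a discount factor alpha that varies with the state-action pair.
nS, nA = 3, 2
rng = np.random.default_rng(1)
c = rng.uniform(0, 1, (nS, nA))               # one-stage costs (hypothetical)
alpha = rng.uniform(0.2, 0.95, (nS, nA))      # varying discount factor
P = rng.dirichlet(np.ones(nS), (nS, nA))      # P[x, a] = distribution of next state

V = np.zeros(nS)
for _ in range(1000):
    Q = c + alpha * (P @ V)                   # Q[x, a]
    V_new = Q.min(axis=1)
    done = np.max(np.abs(V_new - V)) < 1e-10
    V = V_new
    if done:
        break

print("optimal value:", V, "policy:", Q.argmin(axis=1))
```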

    A Relative Value Iteration Algorithm for Non-degenerate Controlled Diffusions

    The ergodic control problem for a non-degenerate diffusion controlled through its drift is considered under a uniform stability condition that ensures the well-posedness of the associated Hamilton-Jacobi-Bellman (HJB) equation. A nonlinear parabolic evolution equation is then proposed as a continuous-time, continuous-state-space analog of White's `relative value iteration' algorithm for solving the ergodic dynamic programming equation in the finite state, finite action case. Its convergence to the solution of the HJB equation is established using the theory of monotone dynamical systems and, alternatively, by the theory of reverse martingales. (17 pages.)
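
    For reference, a short Python sketch of White's relative value iteration in the finite state, finite action case that the evolution equation above mimics; the model data are hypothetical and the recentering state x0 is arbitrary.

```python
import numpy as np

# White's relative value iteration for the average-cost problem:
#   (T h)(x) = min_a [ c(x,a) + sum_y P(y|x,a) h(y) ]
#   h_{n+1} = T h_n - (T h_n)(x0)      (recenter at a fixed state x0)
# The offset (T h_n)(x0) converges to the optimal average cost rho.
nS, nA, x0 = 4, 3, 0
rng = np.random.default_rng(2)
c = rng.uniform(0, 1, (nS, nA))               # one-stage costs (hypothetical)
P = rng.dirichlet(np.ones(nS), (nS, nA))      # dense kernel => unichain, aperiodic

h = np.zeros(nS)
for _ in range(5000):
    Th = (c + P @ h).min(axis=1)
    rho = Th[x0]
    h_new = Th - rho
    done = np.max(np.abs(h_new - h)) < 1e-12
    h = h_new
    if done:
        break

print("average cost rho ~", rho)
print("relative value h:", h)
```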

    Discrete-time controlled Markov processes with average cost criterion: a survey

    This work is a survey of the average cost control problem for discrete-time Markov processes. The authors have attempted to put together a comprehensive account of the considerable research on this problem over the past three decades. The exposition ranges from finite to Borel state and action spaces and includes a variety of methodologies to find and characterize optimal policies. The authors have included a brief historical perspective of the research efforts in this area and have compiled a substantial yet not exhaustive bibliography. The authors have also identified several important questions that are still open to investigation.
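
    As a small worked instance of one of the methodologies such surveys cover, the following Python sketch runs Howard's policy iteration on the average-cost optimality equation for a hypothetical finite unichain model (rho is the optimal average cost, h the relative value function normalized by h(x0) = 0):

```python
import numpy as np

# Policy iteration for  rho + h(x) = min_a [ c(x,a) + sum_y P(y|x,a) h(y) ].
nS, nA, x0 = 4, 2, 0
rng = np.random.default_rng(3)
c = rng.uniform(0, 1, (nS, nA))               # one-stage costs (hypothetical)
P = rng.dirichlet(np.ones(nS), (nS, nA))      # dense kernel => unichain

pi = np.zeros(nS, dtype=int)
while True:
    # Policy evaluation: solve rho + h = c_pi + P_pi h with h(x0) = 0.
    P_pi = P[np.arange(nS), pi]               # (nS, nS) under policy pi
    c_pi = c[np.arange(nS), pi]
    A = np.eye(nS) - P_pi
    A[:, x0] = 1.0                            # column for rho replaces h(x0) = 0
    sol = np.linalg.solve(A, c_pi)
    rho, h = sol[x0], sol.copy()
    h[x0] = 0.0
    # Policy improvement step.
    pi_new = (c + P @ h).argmin(axis=1)
    if np.array_equal(pi_new, pi):
        break
    pi = pi_new

print("optimal average cost:", rho, "policy:", pi)
```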

    A note on an LQG regulator with Markovian switching and pathwise average cost
