355 research outputs found
Decomposition and parallel processing techniques for two-time scale controlled Markov chains
This paper deals with a class of ergodic control problems for systems described by Markov chains with strong and weak interactions. These systems are composed of a set of m subchains that are weakly coupled. Using results recently established by Abbad et al., one formulates a limit control problem whose solution can be obtained via an associated non-differentiable convex programming (NDCP) problem. The technique used to solve the NDCP problem is the Analytic Center Cutting Plane Method (ACCPM), which implements a dialogue between, on one hand, a master program computing the analytic center of a localization set containing the solution and, on the other hand, an oracle proposing cutting planes that reduce the size of the localization set at each main iteration. The interest of this implementation comes from two characteristics: (i) the oracle proposes cutting planes by solving reduced-size Markov decision problems (MDPs) via a linear program (LP) or a policy iteration method; (ii) several cutting planes can be proposed simultaneously through a parallel implementation on m processors. The paper concentrates on these two aspects and demonstrates, on a large-scale MDP obtained from the numerical approximation "à la Kushner-Dupuis" of a singularly perturbed hybrid stochastic control problem, the substantial computational speed-up obtained.
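The oracle in characteristic (i) solves reduced-size MDPs via an LP or policy iteration. As a self-contained illustration of the policy-iteration subroutine, here is a minimal discounted-reward sketch in NumPy on a hypothetical two-state, two-action MDP (the paper's oracle works with average-reward subchains; the discounted case is shown only because it is the simplest complete example):

```python
import numpy as np

def policy_iteration(P, r, gamma=0.9, max_iter=100):
    """Policy iteration for a small discounted MDP.

    P: (A, S, S) array, P[a, s, t] = transition probability s -> t under action a.
    r: (S, A) array of one-step rewards.
    Returns the optimal policy (S,) and its value function (S,).
    """
    A, S, _ = P.shape
    policy = np.zeros(S, dtype=int)
    for _ in range(max_iter):
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = P[policy, np.arange(S), :]
        r_pi = r[np.arange(S), policy]
        v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
        # Policy improvement: act greedily with respect to the Q-values.
        Q = r + gamma * np.einsum('ast,t->sa', P, v)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, v

# Toy MDP: action 1 always pays 1, action 0 pays 0, so the optimal
# policy takes action 1 everywhere and earns 1 / (1 - gamma) = 10.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.5, 0.5], [0.5, 0.5]]])
r = np.array([[0.0, 1.0], [0.0, 1.0]])
policy, v = policy_iteration(P, r)
```

Each improvement step strictly increases the value of the policy, so on a finite MDP the loop terminates at an optimal policy after finitely many iterations.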
A central limit theorem for temporally non-homogenous Markov chains with applications to dynamic programming
We prove a central limit theorem for a class of additive processes that arise naturally in the theory of finite-horizon Markov decision problems. The main theorem generalizes a classic result of Dobrushin (1956) for temporally non-homogeneous Markov chains, and the principal innovation is that here the summands are permitted to depend on both the current state and a bounded number of future states of the chain. We show through several examples that this added flexibility gives one a direct path to asymptotic normality of the optimal total reward of finite-horizon Markov decision problems. The same examples also explain why such results are not easily obtained by alternative Markovian techniques such as enlargement of the state space. (Comment: 27 pages, 1 figure)
Ergodic Control and Polyhedral approaches to PageRank Optimization
We study a general class of PageRank optimization problems which consist in
finding an optimal outlink strategy for a web site subject to design
constraints. We consider both a continuous problem, in which one can choose the
intensity of a link, and a discrete one, in which, on each page, there are
obligatory links, facultative links, and forbidden links. We show that the
continuous problem, as well as its discrete variant when there are no
constraints coupling different pages, can both be modeled by constrained Markov
decision processes with ergodic reward, in which the webmaster determines the
transition probabilities of websurfers. Although the number of actions turns
out to be exponential, we show that an associated polytope of transition
measures has a concise representation, from which we deduce that the continuous
problem is solvable in polynomial time, and that the same is true for the
discrete problem when there are no coupling constraints. We also provide
efficient algorithms, adapted to very large networks. Then, we investigate the
qualitative features of optimal outlink strategies, and identify in particular
assumptions under which there exists a "master" page to which all controlled
pages should point. We report numerical results on fragments of the real web
graph. (Comment: 39 pages)
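In this formulation the webmaster's choice of outlinks shapes the websurfers' transition probabilities and hence the ergodic reward. As a minimal illustration (a sketch assuming the standard power-iteration computation of PageRank and a hypothetical three-page graph, not the paper's polytope-based algorithm), redirecting a page's outlink toward a controlled page raises that page's score:

```python
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-12):
    """Power iteration for PageRank scores of a 0/1 adjacency matrix."""
    n = adj.shape[0]
    P = adj / adj.sum(axis=1, keepdims=True)  # row-stochastic link matrix
    G = damping * P + (1.0 - damping) / n     # Google matrix with teleportation
    pi = np.full(n, 1.0 / n)
    while True:
        new = pi @ G
        if np.abs(new - pi).sum() < tol:
            return new
        pi = new

# Page 2 is the "controlled" page; compare two outlink choices for page 0.
adj_a = np.array([[0, 1, 0],   # strategy A: page 0 -> page 1
                  [1, 0, 0],   # page 1 -> page 0
                  [1, 0, 0]])  # page 2 -> page 0
adj_b = np.array([[0, 0, 1],   # strategy B: page 0 -> page 2 instead
                  [1, 0, 0],
                  [1, 0, 0]])
score_a, score_b = pagerank(adj_a), pagerank(adj_b)
```

Under strategy A the controlled page receives only teleportation mass, while under strategy B it also receives page 0's link mass, so `score_b[2] > score_a[2]`.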
An approximation approach for the deviation matrix of continuous-time Markov processes with application to Markov decision theory
We present an update formula that allows the deviation matrix of a continuous-time Markov process with denumerable state space and generator matrix Q* to be expressed through that of a continuous-time Markov process with generator matrix Q. We show that under suitable stability conditions the algorithm converges at a geometric rate. By applying the concept to three different examples, namely the M/M/1 queue with vacations, the M/G/1 queue, and a tandem network, we illustrate the broad applicability of our approach. For a problem in admission control, we apply our approximation algorithm to Markov decision theory for computing the optimal control policy. Numerical examples are presented to highlight the efficiency of the proposed algorithm. © 2010 INFORMS
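The paper's update formula itself is not reproduced here, but for a finite irreducible chain the quantity being approximated has a closed form that serves as a reference point: with stationary projector Π (every row equal to the stationary distribution π), the deviation matrix is D = (Π − Q)⁻¹ − Π, characterized by QD = Π − I and ΠD = DΠ = 0. A sketch on a hypothetical truncated birth-death chain (illustrative rates, not the paper's examples):

```python
import numpy as np

# Generator of a small birth-death chain (truncated M/M/1, states 0..3),
# with arrival rate lam and service rate mu -- illustrative values.
lam, mu = 1.0, 2.0
n = 4
Q = np.zeros((n, n))
for i in range(n):
    if i + 1 < n:
        Q[i, i + 1] = lam  # birth
    if i - 1 >= 0:
        Q[i, i - 1] = mu   # death
    Q[i, i] = -Q[i].sum()  # diagonal makes rows sum to zero

# Stationary distribution: solve pi Q = 0 with sum(pi) = 1.
A = np.vstack([Q.T, np.ones(n)])
b = np.concatenate([np.zeros(n), [1.0]])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

Pi = np.outer(np.ones(n), pi)   # projector onto the stationary distribution
D = np.linalg.inv(Pi - Q) - Pi  # deviation matrix (closed form, finite case)
```

The identities QD = Π − I, ΠD = 0, and D·1 = 0 can be checked numerically and pin down D uniquely; an approximation scheme such as the paper's can be validated against them on truncated state spaces.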
Perturbation and stability theory for Markov control problems
A unified approach to the asymptotic analysis of a Markov decision process disturbed by an ε-additive perturbation is proposed. Irrespective of whether the perturbation is regular or singular, the underlying control problem that needs to be understood is the limit Markov control problem. The properties of this problem are the subject of this study.