
    Decomposition and parallel processing techniques for two-time scale controlled Markov chains

    This paper deals with a class of ergodic control problems for systems described by Markov chains with strong and weak interactions. These systems are composed of a set of m subchains that are weakly coupled. Using results recently established by Abbad et al., one formulates a limit control problem whose solution can be obtained via an associated non-differentiable convex programming (NDCP) problem. The technique used to solve the NDCP problem is the Analytic Center Cutting Plane Method (ACCPM), which implements a dialogue between, on the one hand, a master program computing the analytic center of a localization set containing the solution and, on the other hand, an oracle proposing cutting planes that reduce the size of the localization set at each main iteration. The interesting aspect of this implementation comes from two characteristics: (i) the oracle proposes cutting planes by solving reduced-size Markov Decision Problems (MDP) via a linear program (LP) or a policy iteration method; (ii) several cutting planes can be proposed simultaneously through a parallel implementation on m processors. The paper concentrates on these two aspects and shows, on a large-scale MDP obtained from the numerical approximation "à la Kushner-Dupuis" of a singularly perturbed hybrid stochastic control problem, the important computational speed-up obtained.

    A central limit theorem for temporally non-homogenous Markov chains with applications to dynamic programming

    We prove a central limit theorem for a class of additive processes that arise naturally in the theory of finite horizon Markov decision problems. The main theorem generalizes a classic result of Dobrushin (1956) for temporally non-homogeneous Markov chains, and the principal innovation is that here the summands are permitted to depend on both the current state and a bounded number of future states of the chain. We show through several examples that this added flexibility gives one a direct path to asymptotic normality of the optimal total reward of finite horizon Markov decision problems. The same examples also explain why such results are not easily obtained by alternative Markovian techniques such as enlargement of the state space. Comment: 27 pages, 1 figure

    Ergodic Control and Polyhedral approaches to PageRank Optimization

    We study a general class of PageRank optimization problems which consist in finding an optimal outlink strategy for a web site subject to design constraints. We consider both a continuous problem, in which one can choose the intensity of a link, and a discrete one, in which each page has obligatory links, facultative links and forbidden links. We show that the continuous problem, as well as its discrete variant when there are no constraints coupling different pages, can both be modeled by constrained Markov decision processes with ergodic reward, in which the webmaster determines the transition probabilities of websurfers. Although the number of actions turns out to be exponential, we show that an associated polytope of transition measures has a concise representation, from which we deduce that the continuous problem is solvable in polynomial time, and that the same is true for the discrete problem when there are no coupling constraints. We also provide efficient algorithms, adapted to very large networks. Then, we investigate the qualitative features of optimal outlink strategies, and identify in particular assumptions under which there exists a "master" page to which all controlled pages should point. We report numerical results on fragments of the real web graph. Comment: 39 pages
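
    To make the discrete outlink problem concrete, here is a minimal sketch that compares two outlink strategies for one controlled page by brute force. It is not the constrained-MDP or polyhedral method of the paper; it uses plain power iteration, and the four-page graph, the damping factor 0.85, and the names `pagerank`, `site_rank`, `strategy_a` and `strategy_b` are all assumptions made for illustration.

```python
def pagerank(out_links, damping=0.85, tol=1e-12, max_iter=10_000):
    """PageRank by power iteration.

    out_links[i] is the list of pages that page i links to.
    A page with no outlinks spreads its rank uniformly over all pages.
    """
    n = len(out_links)
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Teleportation term: every page receives (1 - damping) / n.
        new = [(1.0 - damping) / n] * n
        for page, targets in enumerate(out_links):
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                share = damping * rank[page] / n
                for t in range(n):
                    new[t] += share
        # Stop when the L1 change between iterates is below tol.
        if sum(abs(a - b) for a, b in zip(new, rank)) < tol:
            return new
        rank = new
    return rank


def site_rank(out_links, site_pages):
    """Total PageRank of the pages belonging to the web site."""
    rank = pagerank(out_links)
    return sum(rank[p] for p in site_pages)


# Hypothetical graph: pages 0 and 1 belong to the site, pages 2 and 3 are external.
# Page 1 is the controlled page: the link 1 -> 0 is obligatory,
# the link 1 -> 2 is facultative.
SITE = [0, 1]
strategy_a = [[1], [0], [0, 3], [2]]     # page 1 keeps only the obligatory link
strategy_b = [[1], [0, 2], [0, 3], [2]]  # page 1 also adds the facultative link

rank_a = pagerank(strategy_a)
score_a = site_rank(strategy_a, SITE)
score_b = site_rank(strategy_b, SITE)
print(f"site PageRank, obligatory link only: {score_a:.4f}")
print(f"site PageRank, with facultative link: {score_b:.4f}")
```

    In this toy graph the extra external link lowers the site's total PageRank, because part of the controlled page's rank leaks out of the site. Enumerating strategies this way grows exponentially with the number of facultative links, which is the difficulty the concise polytope representation in the abstract is meant to avoid.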

    An approximation approach for the deviation matrix of continuous-time Markov processes with application to Markov decision theory

    We present an update formula that allows the expression of the deviation matrix of a continuous-time Markov process with denumerable state space having generator matrix Q* through a continuous-time Markov process with generator matrix Q. We show that under suitable stability conditions the algorithm converges at a geometric rate. By applying the concept to three different examples, namely, the M/M/1 queue with vacations, the M/G/1 queue, and a tandem network, we illustrate the broad applicability of our approach. For a problem in admission control, we apply our approximation algorithm to Markov decision theory for computing the optimal control policy. Numerical examples are presented to highlight the efficiency of the proposed algorithm. © 2010 INFORMS

    Perturbation and stability theory for Markov control problems

    A unified approach to the asymptotic analysis of a Markov decision process disturbed by an ε-additive perturbation is proposed. Irrespective of whether the perturbation is regular or singular, the underlying control problem that needs to be understood is the limit Markov control problem. The properties of this problem are the subject of this study.

    Multi-Automata Learning
