Policy iteration algorithm for zero-sum stochastic games with mean payoff
We give a policy iteration algorithm to solve zero-sum stochastic games with finite state and action spaces and perfect information, when the value is defined in terms of the mean payoff per turn. This algorithm does not require any irreducibility assumption on the Markov chains determined by the strategies of the players. It is based on a discrete nonlinear analogue of the notion of reduction of a super-harmonic function.
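The paper treats two-player stochastic games without irreducibility assumptions. As a minimal illustration of gain/bias policy iteration, here is a sketch of the classical one-player deterministic special case (Howard's algorithm): a policy fixes one outgoing edge per node, is evaluated exactly on the induced functional graph (cycle means give the gain, potentials give the bias), and is then improved lexicographically. All names are illustrative, not taken from the paper.

```python
def evaluate(sigma, w, nodes):
    """Gain (mean payoff) and bias of the functional graph induced by policy sigma."""
    gain, bias, done = {}, {}, set()
    for s in nodes:
        if s in done:
            continue
        path, seen, u = [], {}, s
        while u not in done and u not in seen:
            seen[u] = len(path)
            path.append(u)
            u = sigma[u]
        if u not in done:                         # closed a new cycle at u
            cyc = path[seen[u]:]
            g = sum(w[(x, sigma[x])] for x in cyc) / len(cyc)
            gain[u], bias[u] = g, 0.0
            done.add(u)
            for x in reversed(cyc[1:]):           # bias relative to the entry node u
                gain[x] = g
                bias[x] = w[(x, sigma[x])] - g + bias[sigma[x]]
                done.add(x)
            path = path[:seen[u]]
        for x in reversed(path):                  # transient part feeding the cycle
            gain[x] = gain[sigma[x]]
            bias[x] = w[(x, sigma[x])] - gain[x] + bias[sigma[x]]
            done.add(x)
    return gain, bias

def howard(succs, w):
    """Max mean payoff per turn from every node of a weighted digraph."""
    nodes = list(succs)
    sigma = {u: succs[u][0] for u in nodes}       # arbitrary initial policy
    while True:
        gain, bias = evaluate(sigma, w, nodes)
        improved = False
        for u in nodes:
            best = sigma[u]
            key = lambda v: (gain[v], w[(u, v)] - gain[v] + bias[v])
            for v in succs[u]:                    # improve gain first, then bias
                if key(v) > key(best):
                    best = v
            if best != sigma[u]:
                sigma[u], improved = best, True
        if not improved:
            return gain, sigma
```

For instance, on the two-node graph `succs = {'a': ['b'], 'b': ['b', 'a']}` with weights 0 on (a,b), 4 on (b,a) and 1 on the self-loop (b,b), the algorithm switches b away from its self-loop and returns the optimal mean payoff 2 at both nodes.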
An Inverse Method for Policy-Iteration Based Algorithms
We present an extension of two policy-iteration-based algorithms on weighted
graphs (viz., for Markov Decision Problems and Max-Plus Algebras). This extension
allows us to solve the following inverse problem: considering the weights of
the graph to be unknown constants or parameters, we suppose that a reference
instantiation of those weights is given, and we aim at computing a constraint
on the parameters under which an optimal policy for the reference instantiation
is still optimal. The original algorithm is thus guaranteed to behave well
around the reference instantiation, which provides us with some criteria of
robustness. We illustrate both methods on simple examples. A prototype has
been implemented.
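The simplest instance of this inverse problem is deterministic shortest paths: a policy stays optimal exactly while every non-policy edge keeps a nonnegative Bellman slack w(u,v) + V(v) - V(u), and these slacks describe the admissible perturbations around the reference weights. The sketch below (illustrative names, not the paper's implementation, and numeric slacks rather than the paper's symbolic constraints) computes them for a min-cost policy directed toward a target:

```python
def policy_slacks(succs, w, sigma, target):
    """Values V under policy sigma (edges toward `target`) and, for each
    non-policy edge (u, v), the Bellman slack w(u, v) + V[v] - V[u] >= 0
    that must keep holding for sigma to remain optimal."""
    V = {target: 0.0}
    def val(u):                       # value = cost of the policy path to target
        if u not in V:
            V[u] = w[(u, sigma[u])] + val(sigma[u])
        return V[u]
    for u in succs:
        val(u)
    return {(u, v): w[(u, v)] + V[v] - V[u]
            for u in succs for v in succs[u] if v != sigma[u]}
```

On a triangle a→b→c (costs 1, 1) with a shortcut a→c of cost 5, the policy through b has slack 3 on the unused edge, so it remains optimal as long as the perturbed weights keep that slack nonnegative.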
Multigrid methods for two-player zero-sum stochastic games
We present a fast numerical algorithm for large scale zero-sum stochastic
games with perfect information, which combines policy iteration and algebraic
multigrid methods. This algorithm can be applied either to a true finite state
space zero-sum two player game or to the discretization of an Isaacs equation.
We present numerical tests on discretizations of Isaacs equations or
variational inequalities. We also present a full multi-level policy iteration,
similar to FMG, which substantially improves the computation time for solving
some variational inequalities.
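The paper applies algebraic multigrid to the linear systems arising in policy evaluation. As a minimal illustration of the multigrid ingredient only (a geometric V-cycle for the 1-D Poisson equation, not the paper's algebraic solver), the smooth/restrict/correct structure looks like this:

```python
def apply_A(u, h2):
    """Tridiagonal 1-D Laplacian: (A u)_i = (2 u_i - u_{i-1} - u_{i+1}) / h^2."""
    n = len(u)
    return [(2*u[i] - (u[i-1] if i > 0 else 0.0)
                    - (u[i+1] if i < n - 1 else 0.0)) / h2 for i in range(n)]

def smooth(u, f, h2, sweeps=3, omega=2/3):
    """Weighted Jacobi: damps the high-frequency error components."""
    for _ in range(sweeps):
        Au = apply_A(u, h2)
        u = [u[i] + omega * (h2 / 2) * (f[i] - Au[i]) for i in range(len(u))]
    return u

def vcycle(u, f, h2):
    """One V-cycle for A u = f on n = 2^k - 1 interior points."""
    n = len(u)
    if n == 1:
        return [f[0] * h2 / 2]                   # exact solve of (2/h^2) u = f
    u = smooth(u, f, h2)                         # pre-smoothing
    r = [fi - Aui for fi, Aui in zip(f, apply_A(u, h2))]
    m = n // 2
    rc = [(r[2*j] + 2*r[2*j+1] + r[2*j+2]) / 4 for j in range(m)]  # restrict
    ec = vcycle([0.0] * m, rc, 4 * h2)           # coarse-grid correction
    e = [0.0] * n                                # linear interpolation back
    for j in range(m):
        e[2*j + 1] = ec[j]
    e[0] = ec[0] / 2
    for j in range(1, m):
        e[2*j] = (ec[j-1] + ec[j]) / 2
    e[n - 1] = ec[m - 1] / 2
    u = [ui + ei for ui, ei in zip(u, e)]
    return smooth(u, f, h2)                      # post-smoothing
```

Repeating `u = vcycle(u, f, h2)` drives the residual down by a roughly constant factor per cycle, independently of the grid size; in the paper's setting the system matrix comes from the current policy rather than from a fixed PDE discretization.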
Using Strategy Improvement to Stay Alive
We design a novel algorithm for solving Mean-Payoff Games (MPGs). Besides
solving an MPG in the usual sense, our algorithm computes more information
about the game, information that is important with respect to applications. The
weights of the edges of an MPG can be thought of as a gained/consumed energy --
depending on the sign. For each vertex, our algorithm computes the minimum
amount of initial energy that is sufficient for player Max to ensure that in a
play starting from the vertex, the energy level never goes below zero. Our
algorithm is not the first to compute the minimum sufficient initial energies,
but according to our experimental study it is the fastest. The reason is that
it uses the strategy improvement technique, which is very efficient in
practice.
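The quantity in question can also be computed by a standard value-iteration (lifting) baseline, which is the kind of algorithm the paper's strategy-improvement method is compared against; the sketch below (illustrative, not the paper's algorithm) assumes integer weights and iterates the minimum-credit equations to their least fixed point:

```python
def min_energy(succs, w, owner):
    """Minimum initial energy at each vertex of an energy game.
    owner[v] is 'max' (wants the energy level to stay >= 0 forever) or 'min'.
    Taking edge (v, u) with weight w needs credit max(0, f[u] - w)."""
    # Finite values are at most the total magnitude of negative weights,
    # so `top` acts as +infinity (Max cannot win from there).
    top = sum(-wt for wt in w.values() if wt < 0) + 1
    f = {v: 0 for v in succs}
    changed = True
    while changed:                    # monotone ascent, bounded by top
        changed = False
        for v in succs:
            lift = [min(top, max(0, f[u] - w[(v, u)])) for u in succs[v]]
            new = min(lift) if owner[v] == 'max' else max(lift)
            if new > f[v]:
                f[v], changed = new, True
    return f                          # f[v] == top: no finite energy suffices
```

On a two-vertex loop with weights -3 and +5, the vertex before the negative edge needs initial energy 3 and the other needs 0, matching the intuition that the cycle itself gains energy.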
Improving Strategies via SMT Solving
We consider the problem of computing numerical invariants of programs by
abstract interpretation. Our method eschews two traditional sources of
imprecision: (i) the use of widening operators for enforcing convergence within
a finite number of iterations, and (ii) the use of merge operations (often, convex
hulls) at the merge points of the control flow graph. It instead computes the
least inductive invariant expressible in the domain at a restricted set of
program points, and analyzes the rest of the code en bloc. We emphasize that we
compute this inductive invariant precisely. For that we extend the strategy
improvement algorithm of [Gawlitza and Seidl, 2007]. If we applied their method
directly, we would have to solve an exponentially sized system of abstract
semantic equations, resulting in memory exhaustion. Instead, we keep the system
implicit and discover strategy improvements using SAT modulo real linear
arithmetic (SMT). For evaluating strategies we use linear programming. Our
algorithm has low polynomial space complexity and, in the worst case, performs
exponentially many strategy improvement steps on contrived examples; this
is unsurprising, since we show that the associated abstract reachability
problem is Π_2^p-complete.
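A toy instance (hypothetical loop, not from the paper) shows what "least inductive invariant, computed precisely" means. For `i = 0; while i < 100: i += 2`, the upper bound u of i at the loop head over the interval domain satisfies u = max(0, min(u, 99) + 2): the 0 comes from the initialization, and min(u, 99) + 2 from the guarded body (i < 100 means i <= 99 over the integers). Naive Kleene iteration finds the least solution by stepping through the whole ascending chain; strategy iteration in the style of Gawlitza and Seidl reaches the same least solution without enumerating that chain, which is what makes it viable when chains are long:

```python
# Interval-bound equation for the loop head of: i = 0; while i < 100: i += 2
def f(u):
    return max(0, min(u, 99) + 2)

u = 0                      # ascend from the initial contribution i = 0
while f(u) > u:            # exact Kleene iteration -- no widening
    u = f(u)
# least solution: u = 101, i.e. the least interval invariant is [0, 101]
```

Note that [0, 101], not [0, 100], is the least inductive interval: the interval domain cannot express the parity of i, and [0, 100] is not preserved by the abstract body.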
The level set method for the two-sided eigenproblem
We consider the max-plus analogue of the eigenproblem for matrix pencils
Ax=lambda Bx. We show that the spectrum of (A,B) (i.e., the set of possible
values of lambda), which is a finite union of intervals, can be computed in a
pseudo-polynomial number of operations, namely by a pseudo-polynomial number of
calls to an oracle that computes the value of a mean payoff game. The proof
relies on the introduction of a spectral function, which we interpret in terms
of the least Chebyshev distance between Ax and lambda Bx. The spectrum is
obtained as the zero level set of this function.
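The basic ingredient of the spectral function is easy to state: for a given x and lambda, evaluate the Chebyshev distance between A ⊗ x and lambda ⊗ B ⊗ x in the max-plus sense. The sketch below shows only this residual evaluation (illustrative code; minimizing it over x, which yields the spectral function and hence the spectrum as its zero level set, is where the paper's mean-payoff-game oracle comes in):

```python
NEG_INF = float('-inf')    # max-plus "zero" element

def mp_matvec(A, x):
    """Max-plus matrix-vector product: (A ⊗ x)_i = max_j (A[i][j] + x[j])."""
    return [max(aij + xj for aij, xj in zip(row, x)) for row in A]

def residual(A, B, x, lam):
    """Chebyshev distance between A ⊗ x and lam ⊗ (B ⊗ x)."""
    Ax = mp_matvec(A, x)
    lBx = [lam + y for y in mp_matvec(B, x)]   # lam ⊗ y = lam + y
    return max(abs(a - b) for a, b in zip(Ax, lBx))
```

With B the max-plus identity (0 on the diagonal, -inf elsewhere), the residual vanishes exactly when (lambda, x) solves the ordinary one-sided eigenproblem A ⊗ x = lambda ⊗ x.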
Policy iteration algorithms for monotone contracting maps
MINES ParisTech (Sudoc, France)