Policy iteration algorithm for zero-sum stochastic games with mean payoff
We give a policy iteration algorithm to solve zero-sum stochastic games with finite state and action spaces and perfect information, when the value is defined in terms of the mean payoff per turn. This algorithm does not require any irreducibility assumption on the Markov chains determined by the strategies of the players. It is based on a discrete nonlinear analogue of the notion of reduction of a super-harmonic function.
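The paper treats two-player stochastic games without irreducibility assumptions. As a minimal illustration of gain/bias policy iteration, here is a sketch of the classical one-player deterministic special case (Howard's algorithm): a policy fixes one outgoing edge per node, is evaluated exactly on the induced functional graph (cycle means give the gain, potentials give the bias), and is then improved lexicographically. All names are illustrative, not taken from the paper.

```python
def evaluate(sigma, w, nodes):
    """Gain (mean payoff) and bias of the functional graph induced by policy sigma."""
    gain, bias, done = {}, {}, set()
    for s in nodes:
        if s in done:
            continue
        path, seen, u = [], {}, s
        while u not in done and u not in seen:
            seen[u] = len(path)
            path.append(u)
            u = sigma[u]
        if u not in done:                         # closed a new cycle at u
            cyc = path[seen[u]:]
            g = sum(w[(x, sigma[x])] for x in cyc) / len(cyc)
            gain[u], bias[u] = g, 0.0
            done.add(u)
            for x in reversed(cyc[1:]):           # bias relative to the entry node u
                gain[x] = g
                bias[x] = w[(x, sigma[x])] - g + bias[sigma[x]]
                done.add(x)
            path = path[:seen[u]]
        for x in reversed(path):                  # transient part feeding the cycle
            gain[x] = gain[sigma[x]]
            bias[x] = w[(x, sigma[x])] - gain[x] + bias[sigma[x]]
            done.add(x)
    return gain, bias

def howard(succs, w):
    """Max mean payoff per turn from every node of a weighted digraph."""
    nodes = list(succs)
    sigma = {u: succs[u][0] for u in nodes}       # arbitrary initial policy
    while True:
        gain, bias = evaluate(sigma, w, nodes)
        improved = False
        for u in nodes:
            best = sigma[u]
            key = lambda v: (gain[v], w[(u, v)] - gain[v] + bias[v])
            for v in succs[u]:                    # improve gain first, then bias
                if key(v) > key(best):
                    best = v
            if best != sigma[u]:
                sigma[u], improved = best, True
        if not improved:
            return gain, sigma
```

For instance, on the two-node graph `succs = {'a': ['b'], 'b': ['b', 'a']}` with weights 0 on (a,b), 4 on (b,a) and 1 on the self-loop (b,b), the algorithm switches b away from its self-loop and returns the optimal mean payoff 2 at both nodes.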
An Inverse Method for Policy-Iteration Based Algorithms
We present an extension of two policy-iteration-based algorithms on weighted
graphs (viz., for Markov Decision Problems and Max-Plus Algebras). This extension
allows us to solve the following inverse problem: considering the weights of
the graph to be unknown constants or parameters, we suppose that a reference
instantiation of those weights is given, and we aim at computing a constraint
on the parameters under which an optimal policy for the reference instantiation
is still optimal. The original algorithm is thus guaranteed to behave well
around the reference instantiation, which provides us with some criteria of
robustness. We illustrate both methods on simple examples. A prototype has
been implemented.
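The simplest instance of this inverse problem is deterministic shortest paths: a policy stays optimal exactly while every non-policy edge keeps a nonnegative Bellman slack w(u,v) + V(v) - V(u), and these slacks describe the admissible perturbations around the reference weights. The sketch below (illustrative names, not the paper's implementation, and numeric slacks rather than the paper's symbolic constraints) computes them for a min-cost policy directed toward a target:

```python
def policy_slacks(succs, w, sigma, target):
    """Values V under policy sigma (edges toward `target`) and, for each
    non-policy edge (u, v), the Bellman slack w(u, v) + V[v] - V[u] >= 0
    that must keep holding for sigma to remain optimal."""
    V = {target: 0.0}
    def val(u):                       # value = cost of the policy path to target
        if u not in V:
            V[u] = w[(u, sigma[u])] + val(sigma[u])
        return V[u]
    for u in succs:
        val(u)
    return {(u, v): w[(u, v)] + V[v] - V[u]
            for u in succs for v in succs[u] if v != sigma[u]}
```

On a triangle a→b→c (costs 1, 1) with a shortcut a→c of cost 5, the policy through b has slack 3 on the unused edge, so it remains optimal as long as the perturbed weights keep that slack nonnegative.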
Multigrid methods for two-player zero-sum stochastic games
We present a fast numerical algorithm for large scale zero-sum stochastic
games with perfect information, which combines policy iteration and algebraic
multigrid methods. This algorithm can be applied either to a true finite state
space zero-sum two player game or to the discretization of an Isaacs equation.
We present numerical tests on discretizations of Isaacs equations or
variational inequalities. We also present a full multi-level policy iteration,
similar to FMG, which substantially improves the computation time for solving
some variational inequalities.
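The paper applies algebraic multigrid to the linear systems arising in policy evaluation. As a minimal illustration of the multigrid ingredient only (a geometric V-cycle for the 1-D Poisson equation, not the paper's algebraic solver), the smooth/restrict/correct structure looks like this:

```python
def apply_A(u, h2):
    """Tridiagonal 1-D Laplacian: (A u)_i = (2 u_i - u_{i-1} - u_{i+1}) / h^2."""
    n = len(u)
    return [(2*u[i] - (u[i-1] if i > 0 else 0.0)
                    - (u[i+1] if i < n - 1 else 0.0)) / h2 for i in range(n)]

def smooth(u, f, h2, sweeps=3, omega=2/3):
    """Weighted Jacobi: damps the high-frequency error components."""
    for _ in range(sweeps):
        Au = apply_A(u, h2)
        u = [u[i] + omega * (h2 / 2) * (f[i] - Au[i]) for i in range(len(u))]
    return u

def vcycle(u, f, h2):
    """One V-cycle for A u = f on n = 2^k - 1 interior points."""
    n = len(u)
    if n == 1:
        return [f[0] * h2 / 2]                   # exact solve of (2/h^2) u = f
    u = smooth(u, f, h2)                         # pre-smoothing
    r = [fi - Aui for fi, Aui in zip(f, apply_A(u, h2))]
    m = n // 2
    rc = [(r[2*j] + 2*r[2*j+1] + r[2*j+2]) / 4 for j in range(m)]  # restrict
    ec = vcycle([0.0] * m, rc, 4 * h2)           # coarse-grid correction
    e = [0.0] * n                                # linear interpolation back
    for j in range(m):
        e[2*j + 1] = ec[j]
    e[0] = ec[0] / 2
    for j in range(1, m):
        e[2*j] = (ec[j-1] + ec[j]) / 2
    e[n - 1] = ec[m - 1] / 2
    u = [ui + ei for ui, ei in zip(u, e)]
    return smooth(u, f, h2)                      # post-smoothing
```

Repeating `u = vcycle(u, f, h2)` drives the residual down by a roughly constant factor per cycle, independently of the grid size; in the paper's setting the system matrix comes from the current policy rather than from a fixed PDE discretization.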
Using Strategy Improvement to Stay Alive
We design a novel algorithm for solving Mean-Payoff Games (MPGs). Besides
solving an MPG in the usual sense, our algorithm computes more information
about the game, information that is important with respect to applications. The
weights of the edges of an MPG can be thought of as a gained/consumed energy --
depending on the sign. For each vertex, our algorithm computes the minimum
amount of initial energy that is sufficient for player Max to ensure that in a
play starting from the vertex, the energy level never goes below zero. Our
algorithm is not the first to compute the minimum sufficient initial energies,
but according to our experimental study it is the fastest. The reason is that
it uses the strategy improvement technique, which is very efficient in
practice.
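The quantity in question can also be computed by a standard value-iteration (lifting) baseline, which is the kind of algorithm the paper's strategy-improvement method is compared against; the sketch below (illustrative, not the paper's algorithm) assumes integer weights and iterates the minimum-credit equations to their least fixed point:

```python
def min_energy(succs, w, owner):
    """Minimum initial energy at each vertex of an energy game.
    owner[v] is 'max' (wants the energy level to stay >= 0 forever) or 'min'.
    Taking edge (v, u) with weight w needs credit max(0, f[u] - w)."""
    # Finite values are at most the total magnitude of negative weights,
    # so `top` acts as +infinity (Max cannot win from there).
    top = sum(-wt for wt in w.values() if wt < 0) + 1
    f = {v: 0 for v in succs}
    changed = True
    while changed:                    # monotone ascent, bounded by top
        changed = False
        for v in succs:
            lift = [min(top, max(0, f[u] - w[(v, u)])) for u in succs[v]]
            new = min(lift) if owner[v] == 'max' else max(lift)
            if new > f[v]:
                f[v], changed = new, True
    return f                          # f[v] == top: no finite energy suffices
```

On a two-vertex loop with weights -3 and +5, the vertex before the negative edge needs initial energy 3 and the other needs 0, matching the intuition that the cycle itself gains energy.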
Improving Strategies via SMT Solving
We consider the problem of computing numerical invariants of programs by
abstract interpretation. Our method eschews two traditional sources of
imprecision: (i) the use of widening operators for enforcing convergence within
a finite number of iterations, and (ii) the use of merge operations (often, convex
hulls) at the merge points of the control flow graph. It instead computes the
least inductive invariant expressible in the domain at a restricted set of
program points, and analyzes the rest of the code en bloc. We emphasize that we
compute this inductive invariant precisely. For that we extend the strategy
improvement algorithm of [Gawlitza and Seidl, 2007]. If we applied their method
directly, we would have to solve an exponentially sized system of abstract
semantic equations, resulting in memory exhaustion. Instead, we keep the system
implicit and discover strategy improvements using SAT modulo real linear
arithmetic (SMT). For evaluating strategies we use linear programming. Our
algorithm has low polynomial space complexity and, in the worst case, performs
exponentially many strategy improvement steps on contrived examples; this
is unsurprising, since we show that the associated abstract reachability
problem is Π_2^p-complete.
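A toy instance (hypothetical loop, not from the paper) shows what "least inductive invariant, computed precisely" means. For `i = 0; while i < 100: i += 2`, the upper bound u of i at the loop head over the interval domain satisfies u = max(0, min(u, 99) + 2): the 0 comes from the initialization, and min(u, 99) + 2 from the guarded body (i < 100 means i <= 99 over the integers). Naive Kleene iteration finds the least solution by stepping through the whole ascending chain; strategy iteration in the style of Gawlitza and Seidl reaches the same least solution without enumerating that chain, which is what makes it viable when chains are long:

```python
# Interval-bound equation for the loop head of: i = 0; while i < 100: i += 2
def f(u):
    return max(0, min(u, 99) + 2)

u = 0                      # ascend from the initial contribution i = 0
while f(u) > u:            # exact Kleene iteration -- no widening
    u = f(u)
# least solution: u = 101, i.e. the least interval invariant is [0, 101]
```

Note that [0, 101], not [0, 100], is the least inductive interval: the interval domain cannot express the parity of i, and [0, 100] is not preserved by the abstract body.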
The level set method for the two-sided eigenproblem
We consider the max-plus analogue of the eigenproblem for matrix pencils
Ax=lambda Bx. We show that the spectrum of (A,B) (i.e., the set of possible
values of lambda), which is a finite union of intervals, can be computed in a
pseudo-polynomial number of operations, namely by a pseudo-polynomial number of
calls to an oracle that computes the value of a mean payoff game. The proof
relies on the introduction of a spectral function, which we interpret in terms
of the least Chebyshev distance between Ax and lambda Bx. The spectrum is
obtained as the zero level set of this function.
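The basic ingredient of the spectral function is easy to state: for a given x and lambda, evaluate the Chebyshev distance between A ⊗ x and lambda ⊗ B ⊗ x in the max-plus sense. The sketch below shows only this residual evaluation (illustrative code; minimizing it over x, which yields the spectral function and hence the spectrum as its zero level set, is where the paper's mean-payoff-game oracle comes in):

```python
NEG_INF = float('-inf')    # max-plus "zero" element

def mp_matvec(A, x):
    """Max-plus matrix-vector product: (A ⊗ x)_i = max_j (A[i][j] + x[j])."""
    return [max(aij + xj for aij, xj in zip(row, x)) for row in A]

def residual(A, B, x, lam):
    """Chebyshev distance between A ⊗ x and lam ⊗ (B ⊗ x)."""
    Ax = mp_matvec(A, x)
    lBx = [lam + y for y in mp_matvec(B, x)]   # lam ⊗ y = lam + y
    return max(abs(a - b) for a, b in zip(Ax, lBx))
```

With B the max-plus identity (0 on the diagonal, -inf elsewhere), the residual vanishes exactly when (lambda, x) solves the ordinary one-sided eigenproblem A ⊗ x = lambda ⊗ x.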
Policy iteration algorithms for monotone contracting maps
MINES ParisTech (Sudoc, France)