
    Policy iteration algorithm for zero-sum stochastic games with mean payoff

    We give a policy iteration algorithm to solve zero-sum stochastic games with finite state and action spaces and perfect information, when the value is defined in terms of the mean payoff per turn. This algorithm does not require any irreducibility assumption on the Markov chains determined by the strategies of the players. It is based on a discrete nonlinear analogue of the notion of reduction of a super-harmonic function.
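
    The abstract gives no pseudocode, so here is a minimal sketch of classical Howard policy iteration for the one-player, unichain special case (a mean-payoff MDP); the paper's contribution is precisely to handle two players and to drop such irreducibility assumptions. The function name and the toy example are illustrative.

```python
import numpy as np

def policy_iteration_mean_payoff(P, r, tol=1e-9):
    """Howard policy iteration for the mean payoff of an MDP.

    P[a][i][j]: transition probability, r[a][i]: immediate reward.
    Assumes every policy induces a unichain Markov chain (the very
    assumption the paper removes). Returns (gain, bias, policy).
    """
    P, r = np.asarray(P, float), np.asarray(r, float)
    n_actions, n_states = r.shape
    policy = np.zeros(n_states, dtype=int)
    while True:
        # Policy evaluation: solve h + g*1 = r_pi + P_pi h with h[0] = 0.
        P_pi = np.array([P[policy[i], i] for i in range(n_states)])
        r_pi = np.array([r[policy[i], i] for i in range(n_states)])
        A = np.zeros((n_states + 1, n_states + 1))
        A[:n_states, :n_states] = np.eye(n_states) - P_pi
        A[:n_states, n_states] = 1.0            # coefficient of the gain g
        A[n_states, 0] = 1.0                    # normalization h[0] = 0
        sol = np.linalg.solve(A, np.append(r_pi, 0.0))
        h, g = sol[:n_states], sol[n_states]
        # Policy improvement: greedy in r(i, a) + sum_j P[a][i][j] * h[j].
        q = np.array([[r[a, i] + P[a, i] @ h for a in range(n_actions)]
                      for i in range(n_states)])
        new_policy = q.argmax(axis=1)
        keep = q[np.arange(n_states), policy] >= q.max(axis=1) - tol
        new_policy[keep] = policy[keep]         # keep current action on ties
        if np.array_equal(new_policy, policy):
            return g, h, policy
        policy = new_policy

# Tiny example: action 1 cycles between the two states for rewards 0 and 2,
# so the optimal mean payoff per turn is 1 at every state.
P = [[[1, 0], [1, 0]], [[0, 1], [1, 0]]]
r = [[0.0, 0.0], [0.0, 2.0]]
g, h, pol = policy_iteration_mean_payoff(P, r)
print(g, pol)                                   # gain ~1.0, policy [1 1]
```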

    An Inverse Method for Policy-Iteration Based Algorithms

    We present an extension of two policy-iteration based algorithms on weighted graphs (viz., Markov Decision Problems and Max-Plus Algebras). This extension allows us to solve the following inverse problem: considering the weights of the graph to be unknown constants or parameters, we suppose that a reference instantiation of those weights is given, and we aim to compute a constraint on the parameters under which an optimal policy for the reference instantiation remains optimal. The original algorithm is thus guaranteed to behave well around the reference instantiation, which provides a criterion of robustness. We present an application of both methods to simple examples; a prototype implementation has been developed.
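
    As a rough illustration of the inverse problem (a numerical stand-in, not the paper's symbolic method, which derives the constraint exactly), the sketch below treats one edge weight of a small discounted MDP as a parameter theta, fixes a reference instantiation theta0, and probes for the interval around theta0 on which the reference optimal policy stays optimal. The model, theta, and the helper names are all hypothetical.

```python
import numpy as np

def greedy_policy(P, r, gamma=0.9, iters=2000):
    """Optimal policy of a small discounted MDP via value iteration."""
    v = np.zeros(r.shape[1])
    for _ in range(iters):
        v = (r + gamma * (P @ v)).max(axis=0)      # Bellman update
    return (r + gamma * (P @ v)).argmax(axis=0)

P = np.array([[[0, 1], [1, 0]],                    # action 0: swap states
              [[1, 0], [0, 1]]], float)            # action 1: stay put
def rewards(theta):                                # theta: one parametric weight
    return np.array([[theta, 0.0], [1.0, 1.0]])

theta0 = 2.0                                       # reference instantiation
ref = greedy_policy(P, rewards(theta0))
good = [t for t in np.linspace(theta0 - 2, theta0 + 2, 81)
        if np.array_equal(greedy_policy(P, rewards(t)), ref)]
print(f"reference policy optimal for theta in [{min(good):.2f}, {max(good):.2f}]")
```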

    Multigrid methods for two-player zero-sum stochastic games

    We present a fast numerical algorithm for large scale zero-sum stochastic games with perfect information, which combines policy iteration and algebraic multigrid methods. This algorithm can be applied either to a true finite state space zero-sum two player game or to the discretization of an Isaacs equation. We present numerical tests on discretizations of Isaacs equations or variational inequalities. We also present a full multi-level policy iteration, similar to FMG, which substantially improves the computation time for solving some variational inequalities.
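
    A full multigrid policy iteration is beyond an abstract, but the coarse-to-fine idea can be sketched: solve a coarse discretization by policy iteration, prolong the resulting policy to the fine grid, and restart there. The toy problem and all names below are illustrative; the paper combines policy iteration with genuine algebraic multigrid solvers for the evaluation step.

```python
import numpy as np

def make_problem(n):
    """Toy controlled chain on n grid points of [0, 1]: move one cell
    left or right; the running reward depends on the position."""
    idx = np.arange(n)
    r = np.sin(3 * np.linspace(0.0, 1.0, n))
    return (np.maximum(idx - 1, 0), np.minimum(idx + 1, n - 1)), r

def policy_iteration(succ, r, policy, gamma=0.95):
    """Plain policy iteration; evaluation solves a (dense, toy-sized)
    linear system -- the step the paper hands to algebraic multigrid."""
    n, its = len(r), 0
    while True:
        its += 1
        nxt = np.where(policy == 0, succ[0], succ[1])
        A = np.eye(n)
        A[np.arange(n), nxt] -= gamma        # encodes v = r + gamma * v[nxt]
        v = np.linalg.solve(A, r)
        new = (v[succ[1]] > v[succ[0]] + 1e-12).astype(int)
        if np.array_equal(new, policy):
            return v, policy, its
        policy = new

# FMG-flavored warm start: solve coarse, prolong the policy, refine.
nc, nf = 513, 1025
succ_c, r_c = make_problem(nc)
_, pol_c, its_c = policy_iteration(succ_c, r_c, np.zeros(nc, int))
succ_f, r_f = make_problem(nf)
warm = np.repeat(pol_c, 2)[:nf]              # crude prolongation
_, _, its_warm = policy_iteration(succ_f, r_f, warm)
_, _, its_cold = policy_iteration(succ_f, r_f, np.zeros(nf, int))
print(f"coarse: {its_c}, fine warm: {its_warm}, fine cold: {its_cold}")
```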

    Using Strategy Improvement to Stay Alive

    We design a novel algorithm for solving Mean-Payoff Games (MPGs). Besides solving an MPG in the usual sense, our algorithm computes more information about the game, information that matters for applications. The weights of the edges of an MPG can be thought of as gained or consumed energy, depending on their sign. For each vertex, our algorithm computes the minimum amount of initial energy that suffices for player Max to ensure that, in a play starting from the vertex, the energy level never drops below zero. Our algorithm is not the first to compute the minimum sufficient initial energies, but according to our experimental study it is the fastest, because it exploits the strategy improvement technique, which is very efficient in practice.
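
    For contrast with the strategy-improvement approach of the paper, here is the standard baseline it competes with: Kleene iteration of the energy-game lifting operator, which computes the minimum sufficient initial energy of every vertex. The encoding and the example are illustrative.

```python
import math

def min_initial_energy(succ, weight, owner):
    """Minimum initial credit with which player Max can keep the running
    energy level (credit plus sum of traversed edge weights) >= 0.
    Baseline fixpoint iteration of the standard lifting operator, not the
    strategy-improvement algorithm of the paper. math.inf marks vertices
    where no finite credit suffices."""
    n = len(succ)
    cap = n * max(abs(w) for w in weight.values())  # finite values stay <= cap
    f = [0] * n
    changed = True
    while changed:
        changed = False
        for v in range(n):
            # Credit needed at v if the play moves to successor u.
            need = [max(0, f[u] - weight[(v, u)]) for u in succ[v]]
            new = min(need) if owner[v] == 'max' else max(need)
            if new > cap:
                new = math.inf                      # diverges: no finite credit
            if new > f[v]:
                f[v], changed = new, True
    return f

# The 2-cycle 0 <-> 1 loses one unit per round, so no credit saves Max
# there; the self-loop at 2 gains energy and needs no credit at all.
succ = {0: [1], 1: [0], 2: [2]}
weight = {(0, 1): -2, (1, 0): 1, (2, 2): 3}
owner = {0: 'max', 1: 'max', 2: 'max'}
print(min_initial_energy(succ, weight, owner))      # [inf, inf, 0]
```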

    Improving Strategies via SMT Solving

    We consider the problem of computing numerical invariants of programs by abstract interpretation. Our method eschews two traditional sources of imprecision: (i) the use of widening operators for enforcing convergence within a finite number of iterations; (ii) the use of merge operations (often, convex hulls) at the merge points of the control flow graph. It instead computes the least inductive invariant expressible in the domain at a restricted set of program points, and analyzes the rest of the code en bloc. We emphasize that we compute this inductive invariant precisely. For that we extend the strategy improvement algorithm of [Gawlitza and Seidl, 2007]. If we applied their method directly, we would have to solve an exponentially sized system of abstract semantic equations, resulting in memory exhaustion. Instead, we keep the system implicit and discover strategy improvements using SAT modulo real linear arithmetic (SMT). For evaluating strategies we use linear programming. Our algorithm has low polynomial space complexity and performs, on contrived worst-case examples, exponentially many strategy improvement steps; this is unsurprising, since we show that the associated abstract reachability problem is Pi-p-2-complete.
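
    The full max-strategy iteration of Gawlitza and Seidl is too long to sketch here, but its SMT ingredient can be shown in miniature: linear real arithmetic queries that certify an invariant. The snippet below (hypothetical names; requires the z3-solver package) checks that a candidate interval invariant is inductive for a one-line loop.

```python
from z3 import And, Implies, Not, Real, Solver, unsat

# Toy loop:  x = 0; while x < 10: x = x + 1
x, x1 = Real('x'), Real('x_next')
inv = lambda v: And(v >= 0, v <= 10)        # candidate invariant 0 <= x <= 10

s = Solver()
# Inductive iff (inv(x) and guard and step) implies inv(x_next);
# we assert the negation and expect it to be unsatisfiable.
s.add(Not(Implies(And(inv(x), x < 10, x1 == x + 1), inv(x1))))
print("inductive" if s.check() == unsat else "not inductive")
```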

    The level set method for the two-sided eigenproblem

    We consider the max-plus analogue of the eigenproblem for matrix pencils, Ax = lambda Bx. We show that the spectrum of (A, B) (i.e., the set of possible values of lambda), which is a finite union of intervals, can be computed in a pseudo-polynomial number of operations, by a (pseudo-polynomial) number of calls to an oracle that computes the value of a mean payoff game. The proof relies on the introduction of a spectral function, which we interpret in terms of the least Chebyshev distance between Ax and lambda Bx. The spectrum is obtained as the zero level set of this function.
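
    As a brute-force illustration of the spectral function (a numerical stand-in for the paper's method, on a handcrafted 2x2 example), the sketch below evaluates the least Chebyshev distance between Ax and lambda Bx (all products in the max-plus sense) over candidate eigenvectors normalized as x = (0, t); lambda belongs to the spectrum where the function vanishes.

```python
import numpy as np

def maxplus_mv(M, x):
    """Max-plus matrix-vector product: (Mx)_i = max_j (M[i, j] + x[j])."""
    return (M + x[None, :]).max(axis=1)

def spectral_function(A, B, lam, ts):
    """Least Chebyshev distance between Ax and lam + Bx over candidate
    eigenvectors x = (0, t) -- brute force for a 2x2 toy."""
    best = np.inf
    for t in ts:
        x = np.array([0.0, t])
        best = min(best, np.abs(maxplus_mv(A, x)
                                - (lam + maxplus_mv(B, x))).max())
    return best

A = np.array([[3.0, 0.0], [1.0, 2.0]])
B = np.array([[0.0, -9.0], [-9.0, 0.0]])    # close to the max-plus identity
ts = np.linspace(-10.0, 10.0, 2001)
for lam in np.linspace(0.0, 4.0, 9):
    print(f"lambda = {lam:.1f}  s(lambda) = {spectral_function(A, B, lam, ts):.3f}")
# In this toy, s vanishes at lambda = 3.0 (e.g. eigenvector x = (0, -2)).
```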

    Policy iteration algorithms for monotone contracting maps (original title: Algorithmes d'itération sur les politiques pour les applications monotones contractantes)

    MINES ParisTech / Sudoc, France