5,415 research outputs found

    Model-free reinforcement learning for stochastic parity games

    This paper investigates the use of model-free reinforcement learning to compute the optimal value in two-player stochastic games with parity objectives. In this setting, two decision makers, player Min and player Max, compete on a finite game arena - a stochastic game graph with unknown but fixed probability distributions - to minimize and maximize, respectively, the probability of satisfying a parity objective. We give a reduction from stochastic parity games to a family of stochastic reachability games with a parameter ε, such that the value of a stochastic parity game equals the limit of the values of the corresponding simple stochastic games as the parameter ε tends to 0. Since this reduction does not require the knowledge of the probabilistic transition structure of the underlying game arena, model-free reinforcement learning algorithms, such as minimax Q-learning, can be used to approximate the value and mutual best-response strategies for both players in the underlying stochastic parity game. We also present a streamlined reduction from 1½-player parity games to reachability games that avoids recourse to nondeterminism. Finally, we report on the experimental evaluations of both reductions.
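As a rough illustration of the learning step mentioned in the abstract above, the following is a minimal sketch of turn-based minimax Q-learning on a stochastic reachability game. It does not implement the paper's parity-to-reachability reduction. The toy arena `GAME`, the state names, the uniform exploration policy, and the 1/n learning rate are all assumptions made for this example. The learner only samples successors and never reads the transition probabilities directly.

```python
import random

# Hypothetical toy arena: state -> (owner, {action: [(probability, successor), ...]}).
# The learner below only samples from these distributions; it never inspects them.
GAME = {
    "s0": ("max", {"a": [(0.5, "goal"), (0.5, "sink")],
                   "b": [(1.0, "s1")]}),
    "s1": ("min", {"c": [(0.9, "goal"), (0.1, "sink")],
                   "d": [(0.7, "goal"), (0.3, "sink")]}),
}
# Reaching "goal" is worth 1, being trapped in "sink" is worth 0.
TERMINAL_VALUE = {"goal": 1.0, "sink": 0.0}


def state_value(q, state):
    """Value of a state: max over actions at Max states, min at Min states."""
    if state in TERMINAL_VALUE:
        return TERMINAL_VALUE[state]
    owner, actions = GAME[state]
    pick = max if owner == "max" else min
    return pick(q[(state, a)] for a in actions)


def sample_successor(rng, outcomes):
    """Draw one successor according to the (hidden) distribution."""
    r = rng.random()
    acc = 0.0
    for prob, succ in outcomes:
        acc += prob
        if r < acc:
            return succ
    return outcomes[-1][1]


def minimax_q_learning(episodes=20000, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, (_, acts) in GAME.items() for a in acts}
    visits = dict.fromkeys(q, 0)
    for _ in range(episodes):
        state = "s0"
        while state not in TERMINAL_VALUE:
            _, actions = GAME[state]
            action = rng.choice(sorted(actions))       # uniform exploration
            succ = sample_successor(rng, actions[action])
            visits[(state, action)] += 1
            alpha = 1.0 / visits[(state, action)]      # decaying learning rate
            target = state_value(q, succ)              # minimax backup, no discount
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = succ
    return q


q_table = minimax_q_learning()
print(round(state_value(q_table, "s0"), 2))
```

In this toy arena Min prefers `d` (reach probability 0.7) and Max prefers `b` over `a`, so the learned value of `s0` should approach 0.7.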

    An Exponential Lower Bound for the Latest Deterministic Strategy Iteration Algorithms

    This paper presents a new exponential lower bound for the two most popular deterministic variants of the strategy improvement algorithms for solving parity, mean payoff, discounted payoff and simple stochastic games. The first variant improves every node in each step, maximizing the current valuation locally, whereas the second variant computes the globally optimal improvement in each step. We outline families of games on which both variants require exponentially many strategy iterations.

    Approximating the Value of Energy-Parity Objectives in Simple Stochastic Games

    We consider simple stochastic games G with energy-parity objectives, a combination of quantitative rewards with a qualitative parity condition. The Maximizer tries to avoid running out of energy while simultaneously satisfying a parity condition. We present an algorithm to approximate the value of a given configuration in 2-NEXPTIME. Moreover, ε-optimal strategies for either player require at most O(2-EXP(|G|)·log(1/ε)) memory modes.

    On the Complexity of Branching Games with Regular Conditions

    Infinite duration games with regular conditions are one of the crucial tools in the areas of verification and synthesis. In this paper we consider a branching variant of such games - the game contains branching vertices that split the play into two independent sub-games. Thus, a play has the form of an infinite tree. The winner of the play is determined by a winning condition specified as a set of infinite trees. Games of this kind were used by Mio to provide a game semantics for the probabilistic mu-calculus. He used winning conditions defined in terms of parity games on trees. In this work we consider a more general class of winning conditions, namely those definable by finite automata on infinite trees. Our games can be seen as a branching-time variant of the stochastic games on graphs. We address the question of determinacy of a branching game and the problem of computing the optimal game value for each of the players. We consider both the stochastic and non-stochastic variants of the games. The questions under consideration are parametrised by the family of strategies we allow: either mixed, behavioural, or pure. We prove that in general, branching games are not determined under mixed strategies. This holds even for topologically simple winning conditions (differences of two open sets) and non-stochastic arenas. Nevertheless, we show that the games become determined under mixed strategies if we restrict the winning conditions to open sets of trees. We prove that the problem of comparing the game value to a rational threshold is undecidable for branching games with regular conditions in all non-trivial stochastic cases. In the non-stochastic cases we provide exact bounds on the complexity of the problem. The only case left open is the 0-player stochastic case, i.e. the problem of computing the measure of a given regular language of infinite trees.

    Obligation Blackwell Games and p-Automata

    We recently introduced p-automata, automata that read discrete-time Markov chains. We used turn-based stochastic parity games to define acceptance of Markov chains by a subclass of p-automata. The definition of acceptance required a cumbersome and complicated reduction to a series of turn-based stochastic parity games. The reduction could not support acceptance by general p-automata, which was left undefined as there was no notion of games that supported it. Here we generalize two-player games by adding a structural acceptance condition called obligations. Obligations are orthogonal to the linear winning conditions that define winning. Obligations are a declaration that player 0 can achieve a certain value from a configuration. If the obligation is met, the value of that configuration for player 0 is 1. One cannot define value in obligation games by the standard mechanism of considering the measure of winning paths on a Markov chain and taking the supremum of the infimum of all strategies, mainly because obligations need definition even for Markov chains and the nature of obligations has the flavor of an infinite nesting of supremum and infimum operators. We define value via a reduction to turn-based games similar to Martin's proof of determinacy of Blackwell games with Borel objectives. Based on this definition, we show that games are determined. We show that for Markov chains with Borel objectives and obligations, and for finite turn-based stochastic parity games with obligations, there exists an alternative and simpler characterization of the value function. Based on this simpler definition we give an exponential time algorithm to analyze finite turn-based stochastic parity games with obligations. Finally, we show that obligation games provide the necessary framework for reasoning about p-automata and that they generalize the previous definition.

    Synthesising Strategy Improvement and Recursive Algorithms for Solving 2.5 Player Parity Games

    2.5 player parity games combine the challenges posed by 2.5 player reachability games and the qualitative analysis of parity games. These two types of problems are best approached with different types of algorithms: strategy improvement algorithms for 2.5 player reachability games and recursive algorithms for the qualitative analysis of parity games. We present a method that - in contrast to existing techniques - tackles both aspects with the best suited approach and works exclusively on the 2.5 player game itself. The resulting technique is powerful enough to handle games with several million states.

    Decision Problems for Nash Equilibria in Stochastic Games

    We analyse the computational complexity of finding Nash equilibria in stochastic multiplayer games with ω-regular objectives. While the existence of an equilibrium whose payoff falls into a certain interval may be undecidable, we single out several decidable restrictions of the problem. First, restricting the search space to stationary, or pure stationary, equilibria results in problems that are typically contained in PSPACE and NP, respectively. Second, we show that the existence of an equilibrium with a binary payoff (i.e. an equilibrium where each player either wins or loses with probability 1) is decidable. We also establish that the existence of a Nash equilibrium with a certain binary payoff entails the existence of an equilibrium with the same payoff in pure, finite-state strategies. Comment: 22 pages, revised versio

    The Complexity of All-switches Strategy Improvement

    Strategy improvement is a widely-used and well-studied class of algorithms for solving graph-based infinite games. These algorithms are parameterized by a switching rule, and one of the most natural rules is "all switches", which switches as many edges as possible in each iteration. Continuing a recent line of work, we study all-switches strategy improvement from the perspective of computational complexity. We consider two natural decision problems, both of which have as input a game G, a starting strategy s, and an edge e. The problems are: 1.) The edge switch problem, namely, is the edge e ever switched by all-switches strategy improvement when it is started from s on game G? 2.) The optimal strategy problem, namely, is the edge e used in the final strategy that is found by strategy improvement when it is started from s on game G? We show PSPACE-completeness of the edge switch problem and optimal strategy problem for the following settings: Parity games with the discrete strategy improvement algorithm of Vöge and Jurdziński; mean-payoff games with the gain-bias algorithm [14,37]; and discounted-payoff games and simple stochastic games with their standard strategy improvement algorithms. We also show PSPACE-completeness of an analogous problem to edge switch for the bottom-antipodal algorithm for finding the sink of an Acyclic Unique Sink Orientation on a cube.
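To make the all-switches rule and the two decision problems in the abstract above concrete, here is a minimal sketch for a discounted-payoff game. It is an illustration under stated assumptions, not the construction used in the paper: the three-vertex game `DP_GAME`, the discount factor 0.5, and the choice to move each switchable Max vertex to its best edge are hypothetical, and Min's best response is approximated by value iteration rather than computed exactly.

```python
# Hypothetical discounted-payoff game: vertex -> (owner, {successor: edge weight}).
DP_GAME = {
    "a": ("max", {"b": 0.0, "c": 1.0}),
    "b": ("min", {"a": 4.0, "c": 2.0}),
    "c": ("max", {"c": 1.0}),
}


def best_response_values(game, sigma, discount, tol=1e-12):
    """Values when Max follows sigma and Min responds optimally.

    Computed by value iteration (a contraction for discount < 1); this is a
    simplification chosen for brevity, not an exact solver.
    """
    values = {v: 0.0 for v in game}
    while True:
        new = {}
        for v, (owner, edges) in game.items():
            if owner == "max":
                u = sigma[v]
                new[v] = edges[u] + discount * values[u]
            else:
                new[v] = min(w + discount * values[u] for u, w in edges.items())
        if max(abs(new[v] - values[v]) for v in game) < tol:
            return new
        values = new


def all_switches(game, sigma, discount=0.5, eps=1e-9):
    """Switch every Max vertex that has an improving edge, until none remain."""
    sigma = dict(sigma)
    switched = []  # record of (vertex, new successor) pairs, in switch order
    while True:
        values = best_response_values(game, sigma, discount)
        changes = {}
        for v, (owner, edges) in game.items():
            if owner != "max":
                continue
            best = max(edges, key=lambda u: edges[u] + discount * values[u])
            if edges[best] + discount * values[best] > values[v] + eps:
                changes[v] = best
        if not changes:
            return sigma, values, switched
        for v, u in changes.items():
            sigma[v] = u
            switched.append((v, u))


final_sigma, final_values, switched_edges = all_switches(DP_GAME, {"a": "b", "c": "c"})
print(final_sigma, switched_edges)
```

In terms of the abstract, the edge switch problem for an edge e corresponds to asking whether e appears in `switched_edges`, and the optimal strategy problem to asking whether e is used by `final_sigma`. On this toy instance both are trivial to answer by running the algorithm; the hardness result concerns the general case.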

    Tree games with regular objectives

    We study tree games developed recently by Matteo Mio as a game interpretation of the probabilistic μ-calculus. With expressive power comes complexity. Mio showed that tree games are able to encode Blackwell games and, consequently, are not determined under deterministic strategies. We show that non-stochastic tree games with objectives recognisable by so-called game automata are determined under deterministic, finite memory strategies. Moreover, we give an elementary algorithmic procedure which, for an arbitrary regular language L and a finite non-stochastic tree game with a winning objective L, decides if the game is determined under deterministic strategies. Comment: In Proceedings GandALF 2014, arXiv:1408.556