Model-free reinforcement learning for stochastic parity games
This paper investigates the use of model-free reinforcement learning to compute the optimal value in two-player stochastic games with parity objectives. In this setting, two decision makers, player Min and player Max, compete on a finite game arena - a stochastic game graph with unknown but fixed probability distributions - to minimize and maximize, respectively, the probability of satisfying a parity objective. We give a reduction from stochastic parity games to a family of stochastic reachability games with a parameter ε, such that the value of a stochastic parity game equals the limit of the values of the corresponding simple stochastic games as the parameter ε tends to 0. Since this reduction does not require knowledge of the probabilistic transition structure of the underlying game arena, model-free reinforcement learning algorithms, such as minimax Q-learning, can be used to approximate the value and mutual best-response strategies for both players in the underlying stochastic parity game. We also present a streamlined reduction from 1½-player parity games to reachability games that avoids recourse to nondeterminism. Finally, we report on experimental evaluations of both reductions.
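The model-free setting sketched above can be illustrated with a minimal tabular minimax-Q learner on a toy reachability arena. Everything below (the arena, the simulator, the action sets, the hyperparameters) is illustrative and not from the paper; for brevity the value backup takes a pure-strategy maximin over the Q-table, whereas full minimax-Q solves a small linear program at each state to allow mixed strategies.

```python
import random
from collections import defaultdict

# Toy two-player zero-sum stochastic game: the learner only samples
# transitions from a simulator; the probabilities are fixed but unknown
# to it (the model-free setting). Hypothetical 3-state reachability
# game: Max tries to reach the absorbing state "goal".
STATES = ["s0", "s1", "goal"]
ACTIONS = ["a", "b"]          # same action set for both players, for brevity

def step(state, a_max, a_min):
    """Simulator: returns (next_state, reward)."""
    if state == "goal":
        return "goal", 0.0
    p = 0.9 if a_max == "a" else 0.4      # Max's action sets the base odds
    if a_min == "b":
        p -= 0.2                          # Min's action can lower them
    nxt = ("goal" if state == "s1" else "s1") if random.random() < p else "s0"
    return nxt, (1.0 if nxt == "goal" else 0.0)

def minimax_q(episodes=5000, alpha=0.1, gamma=0.95, eps=0.2):
    Q = defaultdict(float)                # Q[(state, max_action, min_action)]
    def value(s):
        # Pure-strategy maximin over the Q-table; full minimax-Q would
        # solve an LP here to optimize over mixed strategies.
        return max(min(Q[(s, a, o)] for o in ACTIONS) for a in ACTIONS)
    for _ in range(episodes):
        s = "s0"
        for _ in range(20):
            # Epsilon-greedy choice for Max, uniformly random opponent.
            a = random.choice(ACTIONS) if random.random() < eps else \
                max(ACTIONS, key=lambda x: min(Q[(s, x, o)] for o in ACTIONS))
            o = random.choice(ACTIONS)
            s2, r = step(s, a, o)
            Q[(s, a, o)] += alpha * (r + gamma * value(s2) - Q[(s, a, o)])
            s = s2
            if s == "goal":
                break
    return {s: value(s) for s in STATES}
```

The learned values approximate the (discounted) probability of reaching the goal under mutual best responses; in the paper's reduction, such reachability values approximate the parity value as ε tends to 0.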
An Exponential Lower Bound for the Latest Deterministic Strategy Iteration Algorithms
This paper presents a new exponential lower bound for the two most popular
deterministic variants of the strategy improvement algorithms for solving
parity, mean payoff, discounted payoff and simple stochastic games. The first
variant improves every node in each step maximizing the current valuation
locally, whereas the second variant computes the globally optimal improvement
in each step. We outline families of games on which both variants require
exponentially many strategy iterations
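The two switch rules can be contrasted on a toy instance. The sketch below runs strategy improvement on a hypothetical one-player discounted payoff game (a special case of the games covered by the lower bound); the "global" rule here applies only the single most profitable switch, a simplified stand-in for the globally optimal rule analysed in the paper.

```python
# Strategy improvement on a one-player discounted payoff game,
# contrasting two switch rules:
#   "all"    -- switch every node to its locally best edge at once;
#   "global" -- apply only the single most profitable switch
#               (a simplified stand-in for the globally optimal rule).
GAMMA = 0.9
# Hypothetical arena: node -> list of (successor, edge_weight).
EDGES = {
    0: [(1, 0.0), (2, 4.0)],
    1: [(0, 2.0), (2, 0.0)],
    2: [(2, 1.0)],            # self-loop sink
}

def evaluate(sigma, iters=500):
    """Value of a positional strategy: V(v) = w(v, sigma(v)) + GAMMA * V(sigma(v))."""
    V = {v: 0.0 for v in EDGES}
    for _ in range(iters):   # geometric convergence since GAMMA < 1
        V = {v: dict(EDGES[v])[sigma[v]] + GAMMA * V[sigma[v]] for v in EDGES}
    return V

def improve(rule="all"):
    sigma = {v: EDGES[v][0][0] for v in EDGES}   # arbitrary initial strategy
    steps = 0
    while True:
        V = evaluate(sigma)
        # Profitable switches: edges strictly better than the current choice.
        gains = {}
        for v, succs in EDGES.items():
            best, w = max(succs, key=lambda e: e[1] + GAMMA * V[e[0]])
            cur = dict(succs)[sigma[v]] + GAMMA * V[sigma[v]]
            if w + GAMMA * V[best] > cur + 1e-9:
                gains[v] = (best, w + GAMMA * V[best] - cur)
        if not gains:
            return sigma, V, steps               # no switch left: optimal
        steps += 1
        if rule == "all":                        # switch everything at once
            for v, (tgt, _) in gains.items():
                sigma[v] = tgt
        else:                                    # single most profitable switch
            v = max(gains, key=lambda u: gains[u][1])
            sigma[v] = gains[v][0]
```

On this tiny arena both rules terminate quickly with the same optimal values; the families constructed in the paper are designed so that the number of improvement steps explodes exponentially.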
Approximating the Value of Energy-Parity Objectives in Simple Stochastic Games
We consider simple stochastic games G with energy-parity objectives, a combination of quantitative rewards with a qualitative parity condition. The Maximizer tries to avoid running out of energy while simultaneously satisfying a parity condition.
We present an algorithm to approximate the value of a given configuration in 2-NEXPTIME. Moreover, ε-optimal strategies for either player require at most O(2-EXP(|G|)·log(1/ε)) memory modes.
On the Complexity of Branching Games with Regular Conditions
Infinite duration games with regular conditions are one of the crucial tools in the areas of verification and synthesis. In this paper we consider a branching variant of such games - the game contains branching vertices that split the play into two independent sub-games. Thus, a play has the form of an infinite tree. The winner of the play is determined by a winning condition specified as a set of infinite trees. Games of this kind were used by Mio to provide a game semantics for the probabilistic mu-calculus. He used winning conditions defined in terms of parity games on trees. In this work we consider a more general class of winning conditions, namely those definable by finite automata on infinite trees. Our games can be seen as a branching-time variant of the stochastic games on graphs.
We address the question of determinacy of a branching game and the problem of computing the optimal game value for each of the players. We consider both the stochastic and non-stochastic variants of the games. The questions under consideration are parametrised by the family of strategies we allow: either mixed, behavioural, or pure.
We prove that in general, branching games are not determined under mixed strategies. This holds even for topologically simple winning conditions (differences of two open sets) and non-stochastic arenas. Nevertheless, we show that the games become determined under mixed strategies if we restrict the winning conditions to open sets of trees. We prove that the problem of comparing the game value to a rational threshold is undecidable for branching games with regular conditions in all non-trivial stochastic cases. In the non-stochastic cases we provide exact bounds on the complexity of the problem. The only case left open is the 0-player stochastic case, i.e., the problem of computing the measure of a given regular language of infinite trees.
Obligation Blackwell Games and p-Automata
We recently introduced p-automata, automata that read discrete-time Markov
chains. We used turn-based stochastic parity games to define acceptance of
Markov chains by a subclass of p-automata. Definition of acceptance required a
cumbersome and complicated reduction to a series of turn-based stochastic
parity games. The reduction could not support acceptance by general p-automata,
which was left undefined as there was no notion of games that supported it.
Here we generalize two-player games by adding a structural acceptance
condition called obligations. Obligations are orthogonal to the linear winning
conditions that define winning. Obligations are a declaration that player 0 can
achieve a certain value from a configuration. If the obligation is met, the
value of that configuration for player 0 is 1.
One cannot define value in obligation games by the standard mechanism of
considering the measure of winning paths in a Markov chain and taking the
supremum over strategies of the infimum over counter-strategies, mainly
because obligations need definition even for Markov chains, and because the
nature of obligations has the flavor of an infinite nesting of supremum and
infimum operators. We define value via a
reduction to turn-based games similar to Martin's proof of determinacy of
Blackwell games with Borel objectives. Based on this definition, we show that
games are determined. We show that for Markov chains with Borel objectives and
obligations, and for finite turn-based stochastic parity games with
obligations, there exists an alternative and simpler characterization of the
value function.
Based on this simpler definition we give an exponential time algorithm to
analyze finite turn-based stochastic parity games with obligations. Finally, we
show that obligation games provide the necessary framework for reasoning about
p-automata and that they generalize the previous definition.
Synthesising Strategy Improvement and Recursive Algorithms for Solving 2.5 Player Parity Games
2.5 player parity games combine the challenges posed by 2.5 player
reachability games and the qualitative analysis of parity games. These two
types of problems are best approached with different types of algorithms:
strategy improvement algorithms for 2.5 player reachability games and recursive
algorithms for the qualitative analysis of parity games. We present a method
that - in contrast to existing techniques - tackles both aspects with the best
suited approach and works exclusively on the 2.5 player game itself. The
resulting technique is powerful enough to handle games with several million
states.
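The recursive algorithm referred to here is, in the qualitative (non-stochastic) case, the classical Zielonka procedure. A compact sketch, assuming a total edge relation and an illustrative arena encoding node -> (owner, priority, successors):

```python
# Recursive (Zielonka-style) algorithm for qualitative analysis of
# ordinary 2-player parity games. Player i wins a play iff the highest
# priority seen infinitely often has parity i.

def attractor(game, player, target):
    """Nodes from which `player` can force the play into `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v, (owner, _, succs) in game.items():
            if v in attr:
                continue
            succs_in = [w for w in succs if w in attr]
            if (owner == player and succs_in) or \
               (owner != player and len(succs_in) == len(succs)):
                attr.add(v)
                changed = True
    return attr

def zielonka(game):
    """Returns (win0, win1): the winning regions of players 0 and 1."""
    if not game:
        return set(), set()
    d = max(p for (_, p, _) in game.values())
    i = d % 2                       # player favoured by the top priority
    top = {v for v, (_, p, _) in game.items() if p == d}
    A = attractor(game, i, top)
    sub = {v: (o, p, [w for w in s if w not in A])
           for v, (o, p, s) in game.items() if v not in A}
    w0, w1 = zielonka(sub)
    opp = w1 if i == 0 else w0      # opponent's winning region in G \ A
    if not opp:                     # player i wins everywhere
        win_i = set(game)
        return (win_i, set()) if i == 0 else (set(), win_i)
    B = attractor(game, 1 - i, opp)
    sub2 = {v: (o, p, [w for w in s if w not in B])
            for v, (o, p, s) in game.items() if v not in B}
    w0b, w1b = zielonka(sub2)
    if i == 0:
        return w0b, w1b | B
    return w0b | B, w1b

# Illustrative arena: two forced cycles, top priorities 2 (even) and 3 (odd).
example_game = {
    "a": (0, 2, ["b"]),
    "b": (1, 1, ["a"]),
    "c": (1, 3, ["d"]),
    "d": (0, 2, ["c"]),
}
```

On this example the forced even cycle {a, b} goes to player 0 and the odd cycle {c, d} to player 1. The method in the abstract combines such recursive qualitative reasoning with strategy improvement for the quantitative (2.5 player) aspects.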
Decision Problems for Nash Equilibria in Stochastic Games
We analyse the computational complexity of finding Nash equilibria in
stochastic multiplayer games with ω-regular objectives. While the
existence of an equilibrium whose payoff falls into a certain interval may be
undecidable, we single out several decidable restrictions of the problem.
First, restricting the search space to stationary, or pure stationary,
equilibria results in problems that are typically contained in PSPACE and NP,
respectively. Second, we show that the existence of an equilibrium with a
binary payoff (i.e. an equilibrium where each player either wins or loses with
probability 1) is decidable. We also establish that the existence of a Nash
equilibrium with a certain binary payoff entails the existence of an
equilibrium with the same payoff in pure, finite-state strategies.
The Complexity of All-switches Strategy Improvement
Strategy improvement is a widely-used and well-studied class of algorithms
for solving graph-based infinite games. These algorithms are parameterized by a
switching rule, and one of the most natural rules is "all switches" which
switches as many edges as possible in each iteration. Continuing a recent line
of work, we study all-switches strategy improvement from the perspective of
computational complexity. We consider two natural decision problems, both of
which take as input a game G, a starting strategy σ, and an edge e. The
problems are: 1.) The edge switch problem, namely, is the edge e ever
switched by all-switches strategy improvement when it is started from σ on
game G? 2.) The optimal strategy problem, namely, is the edge e used in the
final strategy that is found by strategy improvement when it is started from σ
on game G? We show PSPACE-completeness of the edge switch
problem and optimal strategy problem for the following settings: Parity games
with the discrete strategy improvement algorithm of V\"oge and Jurdzi\'nski;
mean-payoff games with the gain-bias algorithm [14,37]; and discounted-payoff
games and simple stochastic games with their standard strategy improvement
algorithms. We also show PSPACE-completeness of an analogous problem
to edge switch for the bottom-antipodal algorithm for finding the sink of an
Acyclic Unique Sink Orientation on a cube.
Tree games with regular objectives
We study tree games developed recently by Matteo Mio as a game interpretation
of the probabilistic μ-calculus. With expressive power comes complexity.
Mio showed that tree games are able to encode Blackwell games and,
consequently, are not determined under deterministic strategies.
We show that non-stochastic tree games with objectives recognisable by
so-called game automata are determined under deterministic, finite memory
strategies. Moreover, we give an elementary algorithmic procedure which, for an
arbitrary regular language L and a finite non-stochastic tree game with a
winning objective L decides if the game is determined under deterministic
strategies. (In Proceedings GandALF 2014, arXiv:1408.556)