Strategy improvement for concurrent reachability and turn-based stochastic safety games
We consider concurrent games played on graphs. At every round of a game, each player simultaneously and independently selects a move; the moves jointly determine the transition to a successor state. Two basic objectives are the safety objective to stay forever in a given set of states, and its dual, the reachability objective to reach a given set of states. First, we present a simple proof of the fact that in concurrent reachability games, for all ε>0, memoryless ε-optimal strategies exist. A memoryless strategy is independent of the history of plays, and an ε-optimal strategy achieves the objective with probability within ε of the value of the game. In contrast to previous proofs of this fact, our proof is more elementary and more combinatorial. Second, we present a strategy-improvement (a.k.a. policy-iteration) algorithm for concurrent games with reachability objectives. Finally, we present a strategy-improvement algorithm for turn-based stochastic games (where each player selects moves in turns) with safety objectives. Our algorithms yield sequences of player-1 strategies which ensure probabilities of winning that converge monotonically (from below) to the value of the game. © 2012 Elsevier Inc
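To make the notion of a memoryless ε-optimal strategy concrete, here is a minimal sketch on a small one-state concurrent reachability game. The game is a textbook-style illustration and is an assumption of this note, not taken from the paper above: player 1 either runs or hides, player 2 either throws or waits. Run against wait, or hide against throw, reaches the target; run against throw loses; hide against wait repeats the round. The function names and the grid over player 2's stationary responses are hypothetical helpers.

```python
from fractions import Fraction


def win_probability(eps, q):
    """Probability that player 1 eventually reaches the target when she runs
    with probability eps every round (a memoryless strategy) and player 2
    throws with probability q every round."""
    win = eps * (1 - q) + (1 - eps) * q  # run/wait or hide/throw
    lose = eps * q                       # run/throw
    if win + lose == 0:
        # Both players hide/wait forever, so the target is never reached.
        return Fraction(0)
    # The round repeats until it is decided, so condition on a decision.
    return win / (win + lose)


def guaranteed(eps, grid=100):
    """Worst case over a grid of stationary player-2 responses
    (the grid resolution is an assumption of this sketch)."""
    return min(win_probability(eps, Fraction(k, grid)) for k in range(grid + 1))


if __name__ == "__main__":
    for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 100), Fraction(0)):
        print(eps, guaranteed(eps))
```

In this sketch, running with probability ε guarantees 1 - ε, so the guarantee approaches 1 as ε shrinks, yet ε = 0 (always hide) guarantees nothing. This is the sense in which ε-optimal memoryless strategies can exist for every ε>0 even when no optimal strategy does.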
Termination Criteria for Solving Concurrent Safety and Reachability Games
We consider concurrent games played on graphs. At every round of a game, each
player simultaneously and independently selects a move; the moves jointly
determine the transition to a successor state. Two basic objectives are the
safety objective to stay forever in a given set of states, and its dual, the
reachability objective to reach a given set of states. We present in this paper
a strategy improvement algorithm for computing the value of a concurrent safety
game, that is, the maximal probability with which player 1 can enforce the
safety objective. The algorithm yields a sequence of player-1 strategies which
ensure probabilities of winning that converge monotonically to the value of the
safety game.
Our result is significant because the strategy improvement algorithm
provides, for the first time, a way to approximate the value of a concurrent
safety game from below. Since a value iteration algorithm, or a strategy
improvement algorithm for reachability games, can be used to approximate the
same value from above, the combination of both algorithms yields a method for
computing a converging sequence of upper and lower bounds for the values of
concurrent reachability and safety games. Previous methods could approximate
the values of these games only from one direction, and as no rates of
convergence are known, they did not provide a practical way to solve these
games.
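The role of value iteration as a one-sided approximation can be sketched on the simpler turn-based stochastic case. The code below is a hedged illustration, not the algorithm of the paper: the game, the state names, and the function `reach_value_iteration` are all assumptions. It iterates the reachability value for the player who wants to leave the safe set; the iterates increase monotonically, so one minus each iterate is an upper bound on the safety value of the other player.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Hypothetical game: 't' is the unsafe target, 'z' a safe sink.
# 'a' belongs to the reach player, 'b' to the safe player, 'r' is random.
OWNER = {"a": "reach", "b": "safe", "r": "rand", "z": "reach", "t": "reach"}
MOVES = {
    "a": ["r", "z"],
    "b": ["a", "r"],
    "r": {"t": HALF, "b": HALF},
    "z": ["z"],
    "t": ["t"],
}
TARGET = {"t"}


def reach_value_iteration(rounds):
    """Return the k-step reachability values, starting from the indicator
    of the target and applying one synchronous update per round."""
    v = {s: Fraction(1 if s in TARGET else 0) for s in OWNER}
    for _ in range(rounds):
        new = {}
        for s in OWNER:
            if s in TARGET:
                new[s] = Fraction(1)
            elif OWNER[s] == "reach":
                new[s] = max(v[t] for t in MOVES[s])   # reach player maximizes
            elif OWNER[s] == "safe":
                new[s] = min(v[t] for t in MOVES[s])   # safe player minimizes
            else:
                new[s] = sum(p * v[t] for t, p in MOVES[s].items())
        v = new
    return v


if __name__ == "__main__":
    for k in (0, 1, 2, 3, 10):
        v = reach_value_iteration(k)
        # 1 - v[s] bounds the safe player's value at s from above.
        print(k, {s: str(1 - v[s]) for s in ("a", "b", "r")})
```

Because the bounds only come from one side, the iteration alone gives no stopping criterion; pairing it with a sequence of lower bounds, as the abstract describes, is what lets the gap between the two be measured.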
Strategy Improvement for Concurrent Safety Games
We consider concurrent games played on graphs. At every round of the game,
each player simultaneously and independently selects a move; the moves jointly
determine the transition to a successor state. Two basic objectives are the
safety objective, "stay forever in a set F of states", and its dual, the
reachability objective, "reach a set R of states". We present in this paper a
strategy improvement algorithm for computing the value of a concurrent safety
game, that is, the maximal probability with which player 1 can enforce the
safety objective. The algorithm yields a sequence of player-1 strategies which
ensure probabilities of winning that converge monotonically to the value of the
safety game.
The significance of the result is twofold. First, while strategy improvement
algorithms were known for Markov decision processes and turn-based games, as
well as for concurrent reachability games, this is the first strategy
improvement algorithm for concurrent safety games. Second, and most
importantly, the improvement algorithm provides a way to approximate the value
of a concurrent safety game from below (the known value-iteration algorithms
approximate the value from above). Thus, when used together with
value-iteration algorithms, or with strategy improvement algorithms for
reachability games, our algorithm leads to the first practical algorithm for
computing converging upper and lower bounds for the value of reachability and
safety games. Comment: 19 pages, 1 figure
A survey of stochastic ω-regular games
We summarize classical and recent results about two-player games played on graphs with ω-regular objectives. These games have applications in the verification and synthesis of reactive systems. Important distinctions are whether a graph game is turn-based or concurrent; deterministic or stochastic; zero-sum or not. We cluster known results and open problems according to these classifications.
Pure Nash Equilibria in Concurrent Deterministic Games
We study pure-strategy Nash equilibria in multi-player concurrent
deterministic games, for a variety of preference relations. We provide a novel
construction, called the suspect game, which transforms a multi-player
concurrent game into a two-player turn-based game which turns Nash equilibria
into winning strategies (for some objective that depends on the preference
relations of the players in the original game). We use that transformation to
design algorithms for computing Nash equilibria in finite games, which in most
cases have optimal worst-case complexity, for large classes of preference
relations. This includes the purely qualitative framework, where each player
has a single omega-regular objective that she wants to satisfy, but also the
larger class of semi-quantitative objectives, where each player has several
omega-regular objectives equipped with a preorder (for instance, a player may
want to satisfy all her objectives, or to maximise the number of objectives
that she achieves). Comment: 72 pages
Equilibria-based Probabilistic Model Checking for Concurrent Stochastic Games
Probabilistic model checking for stochastic games enables formal verification
of systems that comprise competing or collaborating entities operating in a
stochastic environment. Despite good progress in the area, existing approaches
focus on zero-sum goals and cannot reason about scenarios where entities are
endowed with different objectives. In this paper, we propose probabilistic
model checking techniques for concurrent stochastic games based on Nash
equilibria. We extend the temporal logic rPATL (probabilistic alternating-time
temporal logic with rewards) to allow reasoning about players with distinct
quantitative goals, which capture either the probability of an event occurring
or a reward measure. We present algorithms to synthesise strategies that are
subgame perfect social welfare optimal Nash equilibria, i.e., where there is no
incentive for any players to unilaterally change their strategy in any state of
the game, whilst the combined probabilities or rewards are maximised. We
implement our techniques in the PRISM-games tool and apply them to several case
studies, including network protocols and robot navigation, showing the benefits
compared to existing approaches.
Synthesising Strategy Improvement and Recursive Algorithms for Solving 2.5 Player Parity Games
2.5 player parity games combine the challenges posed by 2.5 player
reachability games and the qualitative analysis of parity games. These two
types of problems are best approached with different types of algorithms:
strategy improvement algorithms for 2.5 player reachability games and recursive
algorithms for the qualitative analysis of parity games. We present a method
that, in contrast to existing techniques, tackles both aspects with the
best-suited approach and works exclusively on the 2.5 player game itself. The
resulting technique is powerful enough to handle games with several million
states.
Bounded Rationality in Concurrent Parity Games
We consider 2-player games played on a finite state space for infinite
rounds. The games are concurrent: in each round, the two players choose their
moves simultaneously; the current state and the moves determine the successor.
We consider omega-regular winning conditions given as parity objectives. We
consider the qualitative analysis problems: the computation of the almost-sure
and limit-sure winning sets of states, from which player 1 can win with
probability 1 and with probability arbitrarily close to 1, respectively. In
general, almost-sure and limit-sure winning strategies require both
infinite memory and infinite precision. We study the bounded-rationality
problem for qualitative analysis of concurrent parity games, where the strategy
set of player 1 is restricted to bounded-resource strategies. In terms of
precision, strategies can be deterministic, uniform, finite-precision or
infinite-precision; and in terms of memory, strategies can be memoryless,
finite-memory or infinite-memory. We present a precise and complete
characterization of the qualitative winning sets for all combinations of
classes of strategies. In particular, we show that uniform memoryless
strategies are as powerful as finite-precision infinite-memory strategies, and
infinite-precision memoryless strategies are as powerful as infinite-precision
finite-memory strategies. We show that the winning sets can be computed in
O(n^{2d+3}) time, where n is the size of the game and 2d is the number of
priorities, and our algorithms are symbolic. The membership problem of whether
a state belongs to a winning set can be decided in NP ∩ coNP. While this
complexity is the same as for the simpler class of turn-based games, where in
each state only one of the players has a choice of moves, our algorithms, which
are obtained by characterizing the winning sets as mu-calculus formulas,
are considerably more involved.