Termination Criteria for Solving Concurrent Safety and Reachability Games
We consider concurrent games played on graphs. At every round of a game, each
player simultaneously and independently selects a move; the moves jointly
determine the transition to a successor state. Two basic objectives are the
safety objective to stay forever in a given set of states, and its dual, the
reachability objective to reach a given set of states. We present in this paper
a strategy improvement algorithm for computing the value of a concurrent safety
game, that is, the maximal probability with which player 1 can enforce the
safety objective. The algorithm yields a sequence of player-1 strategies which
ensure probabilities of winning that converge monotonically to the value of the
safety game.
Our result is significant because the strategy improvement algorithm
provides, for the first time, a way to approximate the value of a concurrent
safety game from below. Since a value iteration algorithm, or a strategy
improvement algorithm for reachability games, can be used to approximate the
same value from above, the combination of both algorithms yields a method for
computing a converging sequence of upper and lower bounds for the values of
concurrent reachability and safety games. Previous methods could approximate
the values of these games only from one direction, and as no rates of
convergence are known, they did not provide a practical way to solve these
games.
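The termination criterion described in this abstract can be illustrated with a minimal sketch. It assumes two hypothetical bound sequences: one increasing towards the value (as a strategy improvement algorithm for safety would produce) and one decreasing towards it (as value iteration would produce). The function name `bracket_value` and the toy sequences are illustrative assumptions, not taken from the paper.

```python
from itertools import count


def bracket_value(lower_bounds, upper_bounds, eps):
    """Consume paired bounds until they are within eps of each other.

    lower_bounds and upper_bounds are iterables assumed to converge to the
    same value from below and from above, respectively.
    """
    for lo, hi in zip(lower_bounds, upper_bounds):
        if hi - lo <= eps:
            return lo, hi
    raise RuntimeError("bound sequences ended before reaching the requested precision")


# Toy stand-ins for the two algorithms (assumed, not real game solvers):
# both sequences converge to 0.5, one from below and one from above.
lower = (0.5 - 0.5 * 3.0 ** -n for n in count())
upper = (0.5 + 0.5 * 2.0 ** -n for n in count())

lo, hi = bracket_value(lower, upper, eps=1e-3)
print(f"value lies in [{lo:.6f}, {hi:.6f}]")
```

Because the loop stops only when the gap between the bounds is at most `eps`, the returned interval is guaranteed to contain the value, which a one-sided approximation without a known convergence rate cannot offer.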
Strategy Improvement for Concurrent Safety Games
We consider concurrent games played on graphs. At every round of the game,
each player simultaneously and independently selects a move; the moves jointly
determine the transition to a successor state. Two basic objectives are the
safety objective, "stay forever in a set F of states", and its dual, the
reachability objective, "reach a set R of states". We present in this paper a
strategy improvement algorithm for computing the value of a concurrent safety
game, that is, the maximal probability with which player 1 can enforce the
safety objective. The algorithm yields a sequence of player-1 strategies which
ensure probabilities of winning that converge monotonically to the value of the
safety game.
The significance of the result is twofold. First, while strategy improvement
algorithms were known for Markov decision processes and turn-based games, as
well as for concurrent reachability games, this is the first strategy
improvement algorithm for concurrent safety games. Second, and most
importantly, the improvement algorithm provides a way to approximate the value
of a concurrent safety game from below (the known value-iteration algorithms
approximate the value from above). Thus, when used together with
value-iteration algorithms, or with strategy improvement algorithms for
reachability games, our algorithm leads to the first practical algorithm for
computing converging upper and lower bounds for the value of reachability and
safety games.
Comment: 19 pages, 1 figure
Strategy improvement for concurrent reachability and turn based stochastic safety games
We consider concurrent games played on graphs. At every round of a game, each player simultaneously and independently selects a move; the moves jointly determine the transition to a successor state. Two basic objectives are the safety objective to stay forever in a given set of states, and its dual, the reachability objective to reach a given set of states. First, we present a simple proof of the fact that in concurrent reachability games, for all ε>0, memoryless ε-optimal strategies exist. A memoryless strategy is independent of the history of plays, and an ε-optimal strategy achieves the objective with probability within ε of the value of the game. In contrast to previous proofs of this fact, our proof is more elementary and more combinatorial. Second, we present a strategy-improvement (a.k.a. policy-iteration) algorithm for concurrent games with reachability objectives. Finally, we present a strategy-improvement algorithm for turn-based stochastic games (where the players select moves in turns) with safety objectives. Our algorithms yield sequences of player-1 strategies which ensure probabilities of winning that converge monotonically (from below) to the value of the game. © 2012 Elsevier Inc.
Comparison of Algorithms for Simple Stochastic Games (Full Version)
Simple stochastic games are turn-based 2.5-player zero-sum graph games with a
reachability objective. The problem is to compute the winning probability as
well as the optimal strategies of both players. In this paper, we compare the
three known classes of algorithms -- value iteration, strategy iteration and
quadratic programming -- both theoretically and practically. Further, we
suggest several improvements for all algorithms, including the first approach
based on quadratic programming that avoids transforming the stochastic game to
a stopping one. Our extensive experiments show that these improvements can lead
to significant speed-ups. We implemented all algorithms in PRISM-games 3.0,
thereby providing the first implementation of quadratic programming for solving
simple stochastic games.
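As a minimal illustration of the first algorithm class named above, the following sketch runs value iteration from below on a small, made-up simple stochastic game. The game encoding, the state names, and the uniform-probability chance states are assumptions made for this example; this is not the PRISM-games implementation. The stopping rule (successive iterates differ by less than `tol`) is a common heuristic and does not by itself bound the error.

```python
def value_iteration(game, tol=1e-12, max_iters=10_000):
    """Approximate reachability values from below.

    game maps each state to (kind, successors), where kind is one of
    "max", "min", "avg" (uniform chance), "target", or "sink".
    """
    # Start from the least candidate: 1 on targets, 0 everywhere else.
    values = {s: 1.0 if kind == "target" else 0.0 for s, (kind, _) in game.items()}
    for _ in range(max_iters):
        new_values = {}
        for s, (kind, succ) in game.items():
            if kind in ("target", "sink"):
                new_values[s] = values[s]
            elif kind == "max":
                new_values[s] = max(values[t] for t in succ)
            elif kind == "min":
                new_values[s] = min(values[t] for t in succ)
            else:  # "avg": uniform average over successors
                new_values[s] = sum(values[t] for t in succ) / len(succ)
        delta = max(abs(new_values[s] - values[s]) for s in game)
        values = new_values
        if delta < tol:  # heuristic stop, not a guaranteed error bound
            break
    return values


# Hypothetical example game: the maximizer at "start" chooses between a
# chance state that may loop back and a state controlled by the minimizer.
game = {
    "start": ("max", ["a", "b"]),
    "a": ("avg", ["goal", "sink", "start"]),
    "b": ("min", ["c", "goal"]),
    "c": ("avg", ["goal", "sink", "sink"]),
    "goal": ("target", []),
    "sink": ("sink", []),
}

values = value_iteration(game)
print(values["start"], values["b"])
```

In this toy game the maximizer prefers "a", whose value solves v = (1 + 0 + v) / 3, so the iterates at "start" approach 1/2 while "b" settles at 1/3.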