62 research outputs found

    Qualitative Analysis of Concurrent Mean-payoff Games

    Get PDF
    We consider concurrent games played by two-players on a finite-state graph, where in every round the players simultaneously choose a move, and the current state along with the joint moves determine the successor state. We study a fundamental objective, namely, mean-payoff objective, where a reward is associated to each transition, and the goal of player 1 is to maximize the long-run average of the rewards, and the objective of player 2 is strictly the opposite. The path constraint for player 1 could be qualitative, i.e., the mean-payoff is the maximal reward, or arbitrarily close to it; or quantitative, i.e., a given threshold between the minimal and maximal reward. We consider the computation of the almost-sure (resp. positive) winning sets, where player 1 can ensure that the path constraint is satisfied with probability 1 (resp. positive probability). Our main results for qualitative path constraints are as follows: (1) we establish qualitative determinacy results that show that for every state either player 1 has a strategy to ensure almost-sure (resp. positive) winning against all player-2 strategies, or player 2 has a spoiling strategy to falsify almost-sure (resp. positive) winning against all player-1 strategies; (2) we present optimal strategy complexity results that precisely characterize the classes of strategies required for almost-sure and positive winning for both players; and (3) we present quadratic time algorithms to compute the almost-sure and the positive winning sets, matching the best known bound of algorithms for much simpler problems (such as reachability objectives). For quantitative constraints we show that a polynomial time solution for the almost-sure or the positive winning set would imply a solution to a long-standing open problem (the value problem for turn-based deterministic mean-payoff games) that is not known to be solvable in polynomial time

    IST Austria Technical Report

    Get PDF
    We consider concurrent games played by two-players on a finite state graph, where in every round the players simultaneously choose a move, and the current state along with the joint moves determine the successor state. We study the most fundamental objective for concurrent games, namely, mean-payoff or limit-average objective, where a reward is associated to every transition, and the goal of player 1 is to maximize the long-run average of the rewards, and the objective of player 2 is strictly the opposite (i.e., the games are zero-sum). The path constraint for player 1 could be qualitative, i.e., the mean-payoff is the maximal reward, or arbitrarily close to it; or quantitative, i.e., a given threshold between the minimal and maximal reward. We consider the computation of the almost-sure (resp. positive) winning sets, where player 1 can ensure that the path constraint is satisfied with probability 1 (resp. positive probability). Almost-sure winning with qualitative constraint exactly corresponds to the question whether there exists a strategy to ensure that the payoff is the maximal reward of the game. Our main results for qualitative path constraints are as follows: (1) we establish qualitative determinacy results that show for every state either player 1 has a strategy to ensure almost-sure (resp. positive) winning against all player-2 strategies or player 2 has a spoiling strategy to falsify almost-sure (resp. positive) winning against all player-1 strategies; (2) we present optimal strategy complexity results that precisely characterize the classes of strategies required for almost-sure and positive winning for both players; and (3) we present quadratic time algorithms to compute the almost-sure and the positive winning sets, matching the best known bound of the algorithms for much simpler problems (such as reachability objectives). For quantitative constraints we show that a polynomial time solution for the almost-sure or the positive winning set would imply a solution to a long-standing open problem (of solving the value problem of mean-payoff games) that is not known to be in polynomial time

    Strategy complexity of concurrent safety games

    Get PDF
    We consider two player, zero-sum, finite-state concurrent reachability games, played for an infinite number of rounds, where in every round, each player simultaneously and independently of the other players chooses an action, whereafter the successor state is determined by a probability distribution given by the current state and the chosen actions. Player 1 wins iff a designated goal state is eventually visited. We are interested in the complexity of stationary strategies measured by their patience, which is defined as the inverse of the smallest non-zero probability employed. Our main results are as follows: We show that: (i) the optimal bound on the patience of optimal and -optimal strategies, for both players is doubly exponential; and (ii) even in games with a single non-absorbing state exponential (in the number of actions) patience is necessary

    Exact Algorithms for Solving Stochastic Games

    Full text link
    Shapley's discounted stochastic games, Everett's recursive games and Gillette's undiscounted stochastic games are classical models of game theory describing two-player zero-sum games of potentially infinite duration. We describe algorithms for exactly solving these games

    The Value 1 Problem Under Finite-memory Strategies for Concurrent Mean-payoff Games

    Get PDF
    We consider concurrent mean-payoff games, a very well-studied class of two-player (player 1 vs player 2) zero-sum games on finite-state graphs where every transition is assigned a reward between 0 and 1, and the payoff function is the long-run average of the rewards. The value is the maximal expected payoff that player 1 can guarantee against all strategies of player 2. We consider the computation of the set of states with value 1 under finite-memory strategies for player 1, and our main results for the problem are as follows: (1) we present a polynomial-time algorithm; (2) we show that whenever there is a finite-memory strategy, there is a stationary strategy that does not need memory at all; and (3) we present an optimal bound (which is double exponential) on the patience of stationary strategies (where patience of a distribution is the inverse of the smallest positive probability and represents a complexity measure of a stationary strategy)

    IST Austria Technical Report

    Get PDF
    We study finite-state two-player (zero-sum) concurrent mean-payoff games played on a graph. We focus on the important sub-class of ergodic games where all states are visited infinitely often with probability 1. The algorithmic study of ergodic games was initiated in a seminal work of Hoffman and Karp in 1966, but all basic complexity questions have remained unresolved. Our main results for ergodic games are as follows: We establish (1) an optimal exponential bound on the patience of stationary strategies (where patience of a distribution is the inverse of the smallest positive probability and represents a complexity measure of a stationary strategy); (2) the approximation problem lie in FNP; (3) the approximation problem is at least as hard as the decision problem for simple stochastic games (for which NP and coNP is the long-standing best known bound). We show that the exact value can be expressed in the existential theory of the reals, and also establish square-root sum hardness for a related class of games

    Qualitative analysis of concurrent mean-payoff games.

    Get PDF
    We consider concurrent games played by two players on a finite-state graph, where in every round the players simultaneously choose a move, and the current state along with the joint moves determine the successor state. We study the most fundamental objective for concurrent games, namely, mean-payoff or limit-average objective, where a reward is associated to each transition, and the goal of player 1 is to maximize the long-run average of the rewards, and the objective of player 2 is strictly the opposite (i.e., the games are zero-sum). The path constraint for player 1 could be qualitative, i.e., the mean-payoff is the maximal reward, or arbitrarily close to it; or quantitative, i.e., a given threshold between the minimal and maximal reward. We consider the computation of the almost-sure (resp. positive) winning sets, where player 1 can ensure that the path constraint is satisfied with probability 1 (resp. positive probability). Almost-sure winning with qualitative constraint exactly corresponds to the question of whether there exists a strategy to ensure that the payoff is the maximal reward of the game. Our main results for qualitative path constraints are as follows: (1) we establish qualitative determinacy results that show that for every state either player 1 has a strategy to ensure almost-sure (resp. positive) winning against all player-2 strategies, or player 2 has a spoiling strategy to falsify almost-sure (resp. positive) winning against all player-1 strategies; (2) we present optimal strategy complexity results that precisely characterize the classes of strategies required for almost-sure and positive winning for both players; and (3) we present quadratic time algorithms to compute the almost-sure and the positive winning sets, matching the best known bound of the algorithms for much simpler problems (such as reachability objectives). For quantitative constraints we show that a polynomial time solution for the almost-sure or the positive winning set would imply a solution to a long-standing open problem (of solving the value problem of turn-based deterministic mean-payoff games) that is not known to be solvable in polynomial time
    • …
    corecore