    A Potential Reduction Algorithm for Two-person Zero-sum Mean Payoff Stochastic Games

    We suggest a new algorithm for two-person zero-sum undiscounted stochastic games focusing on stationary strategies. Given a positive real $\epsilon$, let us call a stochastic game $\epsilon$-ergodic if its values from any two initial positions differ by at most $\epsilon$. The proposed new algorithm outputs for every $\epsilon > 0$ in finite time either a pair of stationary strategies for the two players guaranteeing that the values from all initial positions are within an $\epsilon$-range, or identifies two initial positions $u$ and $v$ and corresponding stationary strategies for the players proving that the game values starting from $u$ and $v$ are at least $\epsilon/24$ apart. In particular, the above result shows that if a stochastic game is $\epsilon$-ergodic, then there are stationary strategies for the players proving $24\epsilon$-ergodicity. This result strengthens and provides a constructive version of an existential result by Vrieze (1980) claiming that if a stochastic game is $0$-ergodic, then there are $\epsilon$-optimal stationary strategies for every $\epsilon > 0$. The suggested algorithm is based on a potential transformation technique that changes the range of local values at all positions without changing the normal form of the game.
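
To make the two notions in this abstract concrete, here is a minimal sketch, not the paper's algorithm. It assumes a deterministic game given as a dictionary of edge rewards (the paper treats the stochastic case), and it assumes the sign convention r'(u, v) = r(u, v) + x(u) - x(v) for a potential x. The function names are hypothetical. The sketch shows that potentials telescope along a cycle, so the mean reward of every cycle is unchanged, and it checks the $\epsilon$-ergodicity condition on a given value vector.

```python
def is_eps_ergodic(values, eps):
    """True if the values from any two initial positions differ by at most eps."""
    return max(values.values()) - min(values.values()) <= eps


def potential_transform(rewards, potential):
    """Shift each edge reward by the potential difference of its endpoints.

    Assumed convention: r'(u, v) = r(u, v) + x(u) - x(v).
    """
    return {
        (u, v): r + potential[u] - potential[v]
        for (u, v), r in rewards.items()
    }


def cycle_mean(rewards, cycle):
    """Average reward along a cycle given as a list of positions."""
    edges = zip(cycle, cycle[1:] + cycle[:1])
    return sum(rewards[e] for e in edges) / len(cycle)


# Illustrative data (made up for this sketch).
rewards = {("a", "b"): 3, ("b", "c"): -1, ("c", "a"): 4}
potential = {"a": 0, "b": 5, "c": -2}
shifted = potential_transform(rewards, potential)

# Local rewards change, but the cycle mean does not.
print(rewards, cycle_mean(rewards, ["a", "b", "c"]))
print(shifted, cycle_mean(shifted, ["a", "b", "c"]))
```

The individual edge rewards move (which is what lets a potential narrow the range of local values), while the long-run average along any cycle stays the same.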

    A Nested Family of $k$-total Effective Rewards for Positional Games

    We consider Gillette's two-person zero-sum stochastic games with perfect information. For each $k \in \mathbb{Z}_+$ we introduce an effective reward function, called $k$-total. For $k = 0$ and $1$ this function is known as mean payoff and total reward, respectively. We restrict our attention to the deterministic case. For all $k$, we prove the existence of a saddle point which can be realized by uniformly optimal pure stationary strategies. We also demonstrate that $k$-total reward games can be embedded into $(k+1)$-total reward games.

    Smoothed analysis of deterministic discounted and mean-payoff games

    We devise a policy-iteration algorithm for deterministic two-player discounted and mean-payoff games that runs in polynomial time with high probability on any input where each payoff is chosen independently from a sufficiently random distribution. This includes the case where an arbitrary set of payoffs has been perturbed by a Gaussian, showing for the first time that deterministic two-player games can be solved efficiently, in the sense of smoothed analysis. More generally, we devise a condition number for deterministic discounted and mean-payoff games, and show that our algorithm runs in time polynomial in this condition number. Our result confirms a previous conjecture of Boros et al., which was claimed as a theorem and later retracted. It stands in contrast with a recent counter-example by Christ and Yannakakis, showing that Howard's policy-iteration algorithm does not run in smoothed polynomial time on stochastic single-player mean-payoff games. Our approach is inspired by the analysis of random optimal assignment instances by Frieze and Sorkin, and the analysis of bias-induced policies for mean-payoff games by Akian, Gaubert and Hochart.
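
For readers unfamiliar with the object being solved, the following sketch computes the values of a small deterministic two-player discounted game. It uses plain value iteration, which is a different and simpler technique than the policy-iteration algorithm of the paper. The graph encoding, the function name, and the convention value(u) = payoff + discount * value(v) are assumptions made for this illustration.

```python
def discounted_values(edges, owner, discount, iterations=200):
    """Approximate the values of a deterministic discounted game.

    edges: {position: [(successor, payoff), ...]}
    owner: {position: "max" or "min"}, the player who picks the outgoing edge
    Assumed convention: value(u) = payoff(u, v) + discount * value(v).
    """
    values = {u: 0.0 for u in edges}
    for _ in range(iterations):
        # Synchronous update: every position is recomputed from the old values.
        values = {
            u: (max if owner[u] == "max" else min)(
                payoff + discount * values[v] for v, payoff in edges[u]
            )
            for u in edges
        }
    return values


# Illustrative game (made up for this sketch).
edges = {
    "a": [("a", 1.0), ("b", 0.0)],  # the maximizer chooses at "a"
    "b": [("b", 3.0)],              # the minimizer has a single move at "b"
}
owner = {"a": "max", "b": "min"}
values = discounted_values(edges, owner, discount=0.5)
print(values)
```

With a discount factor below 1 the update is a contraction, so the iterates converge to the unique fixed point; the number of iterations here is simply chosen large enough for this toy instance.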

    A Delayed Promotion Policy for Parity Games

    Parity games are two-player infinite-duration games on graphs that play a crucial role in various fields of theoretical computer science. Finding efficient algorithms to solve these games in practice is widely acknowledged as a core problem in formal verification, as it leads to efficient solutions of the model-checking and satisfiability problems of expressive temporal logics, e.g., the modal μ-calculus. Their solution can be reduced to the problem of identifying sets of positions of the game, called dominions, in each of which a player can force a win by remaining in the set forever. Recently, a novel technique to compute dominions, called priority promotion, has been proposed, which is based on the notions of quasi dominion, a relaxed form of dominion, and dominion space. The underlying framework is general enough to accommodate different instantiations of the solution procedure, whose correctness is ensured by the nature of the space itself. In this paper we propose a new such instantiation, called delayed promotion, that tries to reduce the possible exponential behaviours exhibited by the original method in the worst case. The resulting procedure often outperforms the original priority promotion approach and, so far, no exponential worst case is known for it. Comment: In Proceedings GandALF 2016, arXiv:1609.0364
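
The "remaining in the set forever" part of the dominion definition can be sketched as a simple closure check. This is only a necessary condition, not a dominion test: it does not verify that the player actually wins the plays that stay inside the set, and it is not the priority promotion or delayed promotion procedure. The encoding (successor lists, an owner map with players 0 and 1) and the function name are assumptions for this illustration, and positions are assumed to have at least one successor.

```python
def can_stay_forever(region, successors, owner, player):
    """Check whether `player` can keep every play inside `region`.

    A position owned by `player` needs at least one successor in the region;
    a position owned by the opponent must have all successors in the region.
    """
    for position in region:
        inside = [s for s in successors[position] if s in region]
        if owner[position] == player:
            if not inside:
                return False
        elif len(inside) != len(successors[position]):
            return False
    return True


# Illustrative arena (made up for this sketch).
successors = {0: [1], 1: [0, 2], 2: [2]}
owner = {0: 0, 1: 1, 2: 0}

print(can_stay_forever({0, 1}, successors, owner, player=0))
print(can_stay_forever({0, 1}, successors, owner, player=1))
print(can_stay_forever({2}, successors, owner, player=0))
```

In the example, player 0 cannot confine the play to {0, 1} because the opponent owns position 1 and can leave to position 2, whereas player 1 can stay there by always moving from 1 back to 0.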

    Incentive Stackelberg Mean-payoff Games

    We introduce and study incentive equilibria for multi-player mean-payoff games. Incentive equilibria generalise well-studied solution concepts such as Nash equilibria and leader equilibria (also known as Stackelberg equilibria). Recall that a strategy profile is a Nash equilibrium if no player can improve his payoff by changing his strategy unilaterally. In the setting of incentive and leader equilibria, there is a distinguished player called the leader who can assign strategies to all other players, referred to as her followers. A strategy profile is a leader strategy profile if no player, except for the leader, can improve his payoff by changing his strategy unilaterally, and a leader equilibrium is a leader strategy profile with a maximal return for the leader. In the proposed case of incentive equilibria, the leader can additionally influence the behaviour of her followers by transferring parts of her payoff to her followers. The ability to incentivise her followers provides the leader with more freedom in selecting strategy profiles, and we show that this can indeed improve the payoff for the leader in such games. The key fundamental result of the paper is the existence of incentive equilibria in mean-payoff games. We further show that the decision problem related to constructing incentive equilibria is NP-complete. On a positive note, we show that, when the number of players is fixed, the complexity of the problem falls in the same class as two-player mean-payoff games. We also present an implementation of the proposed algorithms, and discuss experimental results that demonstrate the feasibility of the analysis of medium-sized games. Comment: 15 pages, references, appendix, 5 figures
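
The definitions of Nash equilibrium and leader strategy profile recalled above can be illustrated on a one-shot game with finitely many strategies. This is a deliberate simplification: in the paper the strategies live in infinite-duration mean-payoff games, and payoff transfers (incentives) are not modelled here. The payoff table and the function name are hypothetical.

```python
def is_stable(payoff, profile, strategies, exempt=()):
    """True if no checked player gains by deviating unilaterally.

    payoff: {profile tuple: tuple of payoffs, one per player}
    strategies: list with the available strategies of each player
    exempt: players whose deviations are ignored (e.g. the leader)
    """
    base = payoff[profile]
    for player, options in enumerate(strategies):
        if player in exempt:
            continue
        for alternative in options:
            deviation = profile[:player] + (alternative,) + profile[player + 1:]
            if payoff[deviation][player] > base[player]:
                return False
    return True


# Illustrative two-player payoff table (made up for this sketch).
payoff = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
strategies = [["C", "D"], ["C", "D"]]

# Nash equilibrium: every player is checked.
print(is_stable(payoff, ("D", "D"), strategies))
# Leader strategy profile with player 0 as leader: only the follower is checked.
print(is_stable(payoff, ("C", "D"), strategies, exempt=(0,)))
```

Exempting the leader enlarges the set of admissible profiles: ("C", "D") is not a Nash equilibrium, since player 0 would deviate, but it is a leader strategy profile when player 0 is the leader.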