A pseudo-polynomial algorithm for mean payoff stochastic games with perfect information and few random positions

Author: Boros Endre
Elbassioni Khaled
Gurvich Vladimir
Makino Kazuhisa
Publication venue: Oberwolfach : Mathematisches Forschungsinstitut Oberwolfach
Publication date: 01/01/2015

We consider two-person zero-sum stochastic mean payoff games with perfect information, or BWR-games, given by a digraph $G = (V, E)$, with local rewards $r : E \to \mathbb{Z}$, and three types of positions: black $V_B$, white $V_W$, and random $V_R$, forming a partition of $V$. It is a long-standing open question whether a polynomial time algorithm for BWR-games exists, even when $|V_R| = 0$. In fact, a pseudo-polynomial algorithm for BWR-games would already imply their polynomial solvability. In this paper, we show that BWR-games with a constant number of random positions can be solved in pseudo-polynomial time. More precisely, in any BWR-game with $|V_R| = O(1)$, a saddle point in uniformly optimal pure stationary strategies can be found in time polynomial in $|V_W| + |V_B|$, the maximum absolute local reward, and the common denominator of the transition probabilities.
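
As an illustration of the deterministic special case $|V_R| = 0$ mentioned above, the following minimal sketch evaluates a play once both players have fixed pure stationary strategies: the play eventually enters a cycle, and its mean payoff is the average local reward on that cycle. The function name `mean_payoff`, the `move` and `reward` dictionaries, and the toy game are assumptions made for illustration only; this is not the pseudo-polynomial algorithm of the paper.

```python
from fractions import Fraction


def mean_payoff(start, move, reward):
    """Mean payoff of the unique play from `start` in a game without random positions.

    `move[v]` is the successor chosen at position v by whichever player owns v
    (hypothetical encoding of a pair of pure stationary strategies).
    `reward[(u, v)]` is the integer local reward of edge (u, v).
    """
    first_visit = {}  # position -> index in `path` at its first visit
    path = []
    v = start
    while v not in first_visit:
        first_visit[v] = len(path)
        path.append(v)
        v = move[v]
    # The play repeats forever on the cycle that starts at the first repeated position.
    cycle = path[first_visit[v]:]
    total = sum(reward[(u, move[u])] for u in cycle)
    return Fraction(total, len(cycle))


# Toy game: w -> b -> c -> b -> c -> ...
move = {"w": "b", "b": "c", "c": "b"}
reward = {("w", "b"): 5, ("b", "c"): 3, ("c", "b"): -1}
value_from_w = mean_payoff("w", move, reward)
```

The reward on the initial edge from `w` does not affect the result, since only the cycle that is repeated forever contributes to the long-run average.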

Repositorium für Naturwissenschaften und Technik

A Potential Reduction Algorithm for Two-person Zero-sum Mean Payoff Stochastic Games

Author: Boros Endre
Elbassioni Khaled
Gurvich Vladimir
Makino Kazuhisa
Publication date: 01/01/2015

We suggest a new algorithm for two-person zero-sum undiscounted stochastic games focusing on stationary strategies. Given a positive real $\epsilon$, let us call a stochastic game $\epsilon$-ergodic if its values from any two initial positions differ by at most $\epsilon$. The proposed new algorithm outputs for every $\epsilon > 0$ in finite time either a pair of stationary strategies for the two players guaranteeing that the values from any initial positions are within an $\epsilon$-range, or identifies two initial positions $u$ and $v$ and corresponding stationary strategies for the players proving that the game values starting from $u$ and $v$ are at least $\epsilon/24$ apart. In particular, the above result shows that if a stochastic game is $\epsilon$-ergodic, then there are stationary strategies for the players proving $24\epsilon$-ergodicity. This result strengthens and provides a constructive version of an existential result by Vrieze (1980) claiming that if a stochastic game is $0$-ergodic, then there are $\epsilon$-optimal stationary strategies for every $\epsilon > 0$. The suggested algorithm is based on a potential transformation technique that changes the range of local values at all positions without changing the normal form of the game.
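
The potential transformation mentioned in the abstract can be illustrated on a deterministic toy example. The sketch below assumes the usual form $r_x(v, u) = r(v, u) + x(v) - x(u)$ for a potential $x$ on positions; the abstract does not spell out the formula, and in the stochastic setting the potential difference would be taken in expectation. The names `transform` and `cycle_mean` and the sample data are hypothetical.

```python
from fractions import Fraction


def transform(reward, potential):
    """Apply r_x(v, u) = r(v, u) + x(v) - x(u) to every edge (assumed form)."""
    return {(v, u): r + potential[v] - potential[u] for (v, u), r in reward.items()}


def cycle_mean(cycle, reward):
    """Average local reward along a closed walk given as a list of positions."""
    edges = list(zip(cycle, cycle[1:] + cycle[:1]))
    return Fraction(sum(reward[e] for e in edges), len(edges))


reward = {("a", "b"): 4, ("b", "c"): -2, ("c", "a"): 1, ("b", "a"): 0}
potential = {"a": 0, "b": 3, "c": -1}
shifted = transform(reward, potential)

cycle = ["a", "b", "c"]
before = cycle_mean(cycle, reward)
after = cycle_mean(cycle, shifted)
```

On the cycle a, b, c the local rewards change from 4, -2, 1 to 1, 2, 0, so their range narrows while the cycle mean stays the same: the potential terms cancel along any closed walk.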

arXiv.org e-Print Archive

Repositorium für Naturwissenschaften und Technik

A Nested Family of $k$-total Effective Rewards for Positional Games

Author: Boros Endre
Elbassioni Khaled
Gurvich Vladimir
Makino Kazuhisa
Publication date: 01/01/2015

We consider Gillette's two-person zero-sum stochastic games with perfect information. For each $k \in \mathbb{Z}_+$ we introduce an effective reward function, called $k$-total. For $k = 0$ and $1$ this function is known as mean payoff and total reward, respectively. We restrict our attention to the deterministic case. For all $k$, we prove the existence of a saddle point which can be realized by uniformly optimal pure stationary strategies. We also demonstrate that $k$-total reward games can be embedded into $(k+1)$-total reward games.
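
To make the difference between $k = 0$ and $k = 1$ concrete, here is a minimal finite-prefix sketch. It assumes that the $k$-total effective reward is obtained by taking partial sums of the reward sequence $k$ times and then averaging; the abstract does not state the definition, and the actual effective reward is a limit over an infinite play rather than the finite average computed here. The function name `k_total_prefix_average` is hypothetical.

```python
from itertools import accumulate


def k_total_prefix_average(rewards, k):
    """Average of the k-times iterated partial sums of a finite reward prefix.

    k = 0 gives the plain average (a finite stand-in for mean payoff);
    k = 1 averages the partial sums (a finite stand-in for total reward).
    """
    seq = list(rewards)
    for _ in range(k):
        seq = list(accumulate(seq))
    return sum(seq) / len(seq)


# A play whose local rewards alternate +1, -1, +1, -1, ...
prefix = [1, -1] * 50
mean_like = k_total_prefix_average(prefix, 0)
total_like = k_total_prefix_average(prefix, 1)
```

For this alternating play the average reward is 0, while the partial sums alternate between 1 and 0 and average to 0.5, so the $k = 1$ measure separates plays that the mean payoff cannot tell apart.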

arXiv.org e-Print Archive

Repositorium für Naturwissenschaften und Technik

A Delayed Promotion Policy for Parity Games

Author: Benerecetti Massimo
Dell'Erba Daniele
Mogavero Fabio
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2016

Parity games are two-player infinite-duration games on graphs that play a crucial role in various fields of theoretical computer science. Finding efficient algorithms to solve these games in practice is widely acknowledged as a core problem in formal verification, as it leads to efficient solutions of the model-checking and satisfiability problems of expressive temporal logics, e.g., the modal μ-calculus. Their solution can be reduced to the problem of identifying sets of positions of the game, called dominions, in each of which a player can force a win by remaining in the set forever. Recently, a novel technique to compute dominions, called priority promotion, has been proposed, which is based on the notions of quasi dominion, a relaxed form of dominion, and dominion space. The underlying framework is general enough to accommodate different instantiations of the solution procedure, whose correctness is ensured by the nature of the space itself. In this paper we propose a new such instantiation, called delayed promotion, that tries to reduce the possible exponential behaviours exhibited by the original method in the worst case. The resulting procedure often outperforms the original priority promotion approach, and so far no exponential worst case is known for it. Comment: In Proceedings GandALF 2016, arXiv:1609.0364

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

Directory of Open Access Journals

Oxford University Research Archive

Incentive Stackelberg Mean-payoff Games

Author: Deepak M. S. Krishna
Gupta Anshul
Padarthi Bharath Kumar
Schewe Sven
Trivedi Ashutosh
Publication date: 31/10/2015

We introduce and study incentive equilibria for multi-player mean-payoff games. Incentive equilibria generalise well-studied solution concepts such as Nash equilibria and leader equilibria (also known as Stackelberg equilibria). Recall that a strategy profile is a Nash equilibrium if no player can improve his payoff by changing his strategy unilaterally. In the setting of incentive and leader equilibria, there is a distinguished player called the leader who can assign strategies to all other players, referred to as her followers. A strategy profile is a leader strategy profile if no player, except for the leader, can improve his payoff by changing his strategy unilaterally, and a leader equilibrium is a leader strategy profile with a maximal return for the leader. In the proposed case of incentive equilibria, the leader can additionally influence the behaviour of her followers by transferring parts of her payoff to her followers. The ability to incentivise her followers provides the leader with more freedom in selecting strategy profiles, and we show that this can indeed improve the payoff for the leader in such games. The key fundamental result of the paper is the existence of incentive equilibria in mean-payoff games. We further show that the decision problem related to constructing incentive equilibria is NP-complete. On a positive note, we show that, when the number of players is fixed, the complexity of the problem falls in the same class as two-player mean-payoff games. We also present an implementation of the proposed algorithms, and discuss experimental results that demonstrate the feasibility of the analysis of medium-sized games. Comment: 15 pages, references, appendix, 5 figures

arXiv.org e-Print Archive

University of Liverpool Repository

A potential reduction algorithm for two-person zero-sum mean payoff stochastic games

Author: Boros Endre
Elbassioni Khaled
Gurvich Vladimir
Makino Kazuhisa
Publication venue: Oberwolfach : Mathematisches Forschungsinstitut Oberwolfach
Publication date: 01/01/2015

We suggest a new algorithm for two-person zero-sum undiscounted stochastic games focusing on stationary strategies. Given a positive real $\epsilon$, let us call a stochastic game $\epsilon$-ergodic if its values from any two initial positions differ by at most $\epsilon$. The proposed new algorithm outputs for every $\epsilon > 0$ in finite time either a pair of stationary strategies for the two players guaranteeing that the values from any initial positions are within an $\epsilon$-range, or identifies two initial positions $u$ and $v$ and corresponding stationary strategies for the players proving that the game values starting from $u$ and $v$ are at least $\epsilon/24$ apart. In particular, the above result shows that if a stochastic game is $\epsilon$-ergodic, then there are stationary strategies for the players proving $24\epsilon$-ergodicity. This result strengthens and provides a constructive version of an existential result by Vrieze (1980) claiming that if a stochastic game is $0$-ergodic, then there are $\epsilon$-optimal stationary strategies for every $\epsilon > 0$. The suggested algorithm is based on a potential transformation technique that changes the range of local values at all positions without changing the normal form of the game.

Repositorium für Naturwissenschaften und Technik
