586 research outputs found
Recommended from our members
A pseudo-polynomial algorithm for mean payoff stochastic games with perfect information and few random positions
We consider two-person zero-sum stochastic mean payoff games with perfect information,
or BWR-games, given by a digraph G = (V, E), with local rewards r : E → ℤ, and three
types of positions: black V_B, white V_W, and random V_R, forming a partition of V. It is a
long-standing open question whether a polynomial time algorithm for BWR-games exists,
even when |V_R| = 0. In fact, a pseudo-polynomial algorithm for BWR-games would already
imply their polynomial solvability. In this paper, we show that BWR-games with a constant
number of random positions can be solved in pseudo-polynomial time. More precisely, in any
BWR-game with |V_R| = O(1), a saddle point in uniformly optimal pure stationary strategies
can be found in time polynomial in |V_W| + |V_B|, the maximum absolute local reward, and the
common denominator of the transition probabilities.
A Potential Reduction Algorithm for Two-person Zero-sum Mean Payoff Stochastic Games
We suggest a new algorithm for two-person zero-sum undiscounted stochastic
games focusing on stationary strategies. Given a positive real ε, let
us call a stochastic game ε-ergodic if its values from any two
initial positions differ by at most ε. The proposed new algorithm
outputs, for every ε > 0, in finite time either a pair of stationary
strategies for the two players guaranteeing that the values from any initial
positions are within an ε-range, or identifies two initial positions
u and v and corresponding stationary strategies for the players proving
that the game values starting from u and v are at least ε/24
apart. In particular, the above result shows that if a stochastic game is
ε-ergodic, then there are stationary strategies for the players
proving 24ε-ergodicity. This result strengthens and provides a
constructive version of an existential result by Vrieze (1980) claiming that if
a stochastic game is 0-ergodic, then there are ε-optimal stationary
strategies for every ε > 0. The suggested algorithm is based on a
potential transformation technique that changes the range of local values at
all positions without changing the normal form of the game.
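As a rough illustration of why a potential transformation can leave the normal form of a game unchanged, consider the simplest setting, a deterministic game with perfect information. The notation below (a potential x : V → ℝ and a local reward r(v, u) on each move) is an assumption made for this sketch; the transformation used in the paper for general stochastic games also involves expected potentials and is not reproduced here.

```latex
% Assumed notation: x : V -> R is a potential, r(v,u) is the local reward of move (v,u).
\[
  r_x(v,u) \;=\; r(v,u) + x(v) - x(u)
\]
% Along any cycle v_1 -> v_2 -> ... -> v_k -> v_1 the potentials telescope,
% so the cycle sum, and hence the mean payoff of the cycle, is unchanged:
\[
  \sum_{i=1}^{k} r_x(v_i, v_{i+1}) \;=\; \sum_{i=1}^{k} r(v_i, v_{i+1}),
  \qquad v_{k+1} = v_1 .
\]
```

The local values at individual positions change with x, while the payoff of every pair of stationary strategies stays the same.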
A Nested Family of k-total Effective Rewards for Positional Games
We consider Gillette's two-person zero-sum stochastic games with perfect
information. For each k ∈ ℤ_+ we introduce an effective reward function,
called k-total. For k = 0 and k = 1 this function is known as mean
payoff and total reward, respectively. We restrict our attention to the
deterministic case. For all k, we prove the existence of a saddle point which
can be realized by uniformly optimal pure stationary strategies. We also
demonstrate that k-total reward games can be embedded into (k+1)-total
reward games.
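For the mean payoff case in the deterministic setting, once both players fix pure stationary strategies the play from any start position is a path that eventually enters a cycle, and the mean payoff is the average reward on that cycle. The following is a minimal sketch of that computation; the names `succ`, `reward`, and `mean_payoff`, and the four-position example, are hypothetical and not taken from the paper.

```python
from fractions import Fraction


def mean_payoff(start, succ, reward):
    """Mean payoff of the play from `start` when every position v moves to succ[v].

    `succ` merges the pure stationary strategies of both players into one
    successor map (an assumption of this sketch); `reward[(v, u)]` is the
    local reward of the move v -> u.
    """
    first_visit = {}  # position -> index in `path` at its first visit
    path = []
    v = start
    while v not in first_visit:
        first_visit[v] = len(path)
        path.append(v)
        v = succ[v]
    # `v` is the first repeated position, so the cycle starts at its first visit.
    cycle = path[first_visit[v]:]
    total = sum(reward[(u, succ[u])] for u in cycle)
    return Fraction(total, len(cycle))


# Hypothetical example: a -> b -> c -> d -> b, so the cycle is b, c, d.
succ = {"a": "b", "b": "c", "c": "d", "d": "b"}
reward = {("a", "b"): 10, ("b", "c"): 3, ("c", "d"): -1, ("d", "b"): 4}
print(mean_payoff("a", succ, reward))  # 2
```

The reward 10 on the initial move a -> b does not affect the result, since only the cycle contributes to the long-run average.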
Smoothed analysis of deterministic discounted and mean-payoff games
We devise a policy-iteration algorithm for deterministic two-player
discounted and mean-payoff games, that runs in polynomial time with high
probability, on any input where each payoff is chosen independently from a
sufficiently random distribution.
This includes the case where an arbitrary set of payoffs has been perturbed
by a Gaussian, showing for the first time that deterministic two-player games
can be solved efficiently, in the sense of smoothed analysis.
More generally, we devise a condition number for deterministic discounted and
mean-payoff games, and show that our algorithm runs in time polynomial in this
condition number.
Our result confirms a previous conjecture of Boros et al., which was claimed
as a theorem and later retracted. It stands in contrast with a recent
counter-example by Christ and Yannakakis, showing that Howard's
policy-iteration algorithm does not run in smoothed polynomial time on
stochastic single-player mean-payoff games.
Our approach is inspired by the analysis of random optimal assignment
instances by Frieze and Sorkin, and the analysis of bias-induced policies for
mean-payoff games by Akian, Gaubert and Hochart.
A Delayed Promotion Policy for Parity Games
Parity games are two-player infinite-duration games on graphs that play a
crucial role in various fields of theoretical computer science. Finding
efficient algorithms to solve these games in practice is widely acknowledged as
a core problem in formal verification, as it leads to efficient solutions of
the model-checking and satisfiability problems of expressive temporal logics,
e.g., the modal μ-calculus. Their solution can be reduced to the problem of
identifying sets of positions of the game, called dominions, in each of which a
player can force a win by remaining in the set forever. Recently, a novel
technique to compute dominions, called priority promotion, has been proposed,
which is based on the notions of quasi dominion, a relaxed form of dominion,
and dominion space. The underlying framework is general enough to accommodate
different instantiations of the solution procedure, whose correctness is
ensured by the nature of the space itself. In this paper we propose a new such
instantiation, called delayed promotion, that tries to reduce the possible
exponential behaviours exhibited by the original method in the worst case. The
resulting procedure often outperforms the original priority promotion
approach, and so far no exponential worst case is known for it.
Comment: In Proceedings GandALF 2016, arXiv:1609.0364
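A dominion requires, among other things, that a player can keep the play inside the set forever. The following is a minimal sketch of that closure condition only, not of priority promotion or of the winning condition itself; the names `can_stay_forever`, `owner`, and `edges` are hypothetical, and the sketch assumes every position has at least one outgoing edge.

```python
def can_stay_forever(S, owner, edges, player):
    """True if `player` can keep every play that starts in S inside S forever.

    `owner[v]` is the player (0 or 1) who moves at position v and `edges[v]`
    lists the successors of v. Assumes every position has a successor.
    """
    S = set(S)
    for v in S:
        inside = [u for u in edges[v] if u in S]
        if owner[v] == player:
            # The player needs at least one move that stays in S.
            if not inside:
                return False
        else:
            # The opponent must have no move that leaves S.
            if len(inside) != len(edges[v]):
                return False
    return True


# Hypothetical four-position arena.
owner = {0: 0, 1: 1, 2: 0, 3: 1}
edges = {0: [1], 1: [0, 2], 2: [2, 3], 3: [3]}
print(can_stay_forever({2}, owner, edges, 0))     # True
print(can_stay_forever({0, 1}, owner, edges, 0))  # False
```

In the second call, player 1 owns position 1 and can leave the set via the edge to position 2, so player 0 cannot force the play to stay.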
Incentive Stackelberg Mean-payoff Games
We introduce and study incentive equilibria for multi-player mean-payoff
games. Incentive equilibria generalise well-studied solution concepts such as
Nash equilibria and leader equilibria (also known as Stackelberg equilibria).
Recall that a strategy profile is a Nash equilibrium if no player can improve
his payoff by changing his strategy unilaterally. In the setting of incentive
and leader equilibria, there is a distinguished player called the leader who
can assign strategies to all other players, referred to as her followers. A
strategy profile is a leader strategy profile if no player, except for the
leader, can improve his payoff by changing his strategy unilaterally, and a
leader equilibrium is a leader strategy profile with a maximal return for the
leader. In the proposed case of incentive equilibria, the leader can
additionally influence the behaviour of her followers by transferring parts of
her payoff to her followers. The ability to incentivise her followers provides
the leader with more freedom in selecting strategy profiles, and we show that
this can indeed improve the payoff for the leader in such games. The key
result of the paper is the existence of incentive equilibria in
mean-payoff games. We further show that the decision problem related to
constructing incentive equilibria is NP-complete. On a positive note, we show
that, when the number of players is fixed, the complexity of the problem falls
in the same class as two-player mean-payoff games. We also present an
implementation of the proposed algorithms, and discuss experimental results
that demonstrate the feasibility of the analysis of medium-sized games.
Comment: 15 pages, references, appendix, 5 figures
A potential reduction algorithm for two-person zero-sum mean payoff stochastic games
We suggest a new algorithm for two-person zero-sum undiscounted
stochastic games focusing on stationary strategies. Given a positive real
ε, let us call a stochastic game ε-ergodic if its values from any two initial
positions differ by at most ε. The proposed new algorithm outputs, for
every ε > 0, in finite time either a pair of stationary strategies for the two
players guaranteeing that the values from any initial positions are within
an ε-range, or identifies two initial positions u and v and corresponding
stationary strategies for the players proving that the game values starting
from u and v are at least ε/24 apart. In particular, the above result
shows that if a stochastic game is ε-ergodic, then there are stationary
strategies for the players proving 24ε-ergodicity. This result strengthens
and provides a constructive version of an existential result by Vrieze (1980)
claiming that if a stochastic game is 0-ergodic, then there are ε-optimal
stationary strategies for every ε > 0. The suggested algorithm is based
on a potential transformation technique that changes the range of local
values at all positions without changing the normal form of the game.