910 research outputs found
Deterministic Priority Mean-payoff Games as Limits of Discounted Games
Inspired by the paper of de Alfaro, Henzinger and Majumdar on the discounted μ-calculus, we show new and surprising links between parity games and different classes of discounted games.
Blackwell-Optimal Strategies in Priority Mean-Payoff Games
We examine perfect information stochastic mean-payoff games - a class of
games containing as special sub-classes the usual mean-payoff games and parity
games. We show that deterministic memoryless strategies that are optimal for
discounted games with state-dependent discount factors close to 1 are optimal
for priority mean-payoff games, establishing a strong link between these two
classes.
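To illustrate the kind of limit this abstract refers to, here is a minimal sketch of the classical single-discount case only: for a periodic reward sequence, the normalized discounted payoff approaches the mean payoff as the discount factor tends to 1. It does not model priorities or state-dependent discount factors, and the function names and the sample cycle are assumptions made for illustration, not taken from the paper.

```python
def normalized_discounted_payoff(cycle, discount):
    """(1 - d) * sum_t d^t * r_t for the infinite periodic sequence cycle, cycle, ...

    Uses the closed form for a periodic sequence of period p:
    sum_t d^t * r_t = (sum_{i<p} d^i * cycle[i]) / (1 - d^p).
    Requires 0 <= discount < 1.
    """
    period = len(cycle)
    one_period = sum(discount**i * r for i, r in enumerate(cycle))
    return (1.0 - discount) * one_period / (1.0 - discount**period)


def mean_payoff(cycle):
    """Long-run average reward of the periodic sequence."""
    return sum(cycle) / len(cycle)


if __name__ == "__main__":
    rewards = [4, 0, 2]  # hypothetical reward cycle
    for d in (0.5, 0.9, 0.99, 0.999):
        print(d, normalized_discounted_payoff(rewards, d))
    print("mean payoff:", mean_payoff(rewards))
```

The gap between the two quantities shrinks as the discount factor gets closer to 1, which is the single-discount analogue of the link described above.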
Applying Blackwell optimality: priority mean-payoff games as limits of multi-discounted games
We define and examine priority mean-payoff games - a natural extension of parity games. By adapting the notion of Blackwell optimality borrowed from the theory of Markov decision processes, we show that priority mean-payoff games can be seen as a limit of special multi-discounted games.
Two-Player Perfect-Information Shift-Invariant Submixing Stochastic Games Are Half-Positional
We consider zero-sum stochastic games with perfect information and finitely
many states and actions. The payoff is computed by a payoff function which
associates to each infinite sequence of states and actions a real number. We
prove that if the payoff function is both shift-invariant and submixing,
then the game is half-positional, i.e. the first player has an optimal strategy
which is both deterministic and stationary. This result relies on the existence
of ε-subgame-perfect equilibria in shift-invariant games, a second
contribution of the paper.
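For orientation, the two properties named in this abstract are usually stated as follows. This is a sketch under stated assumptions: the notation ($f$ for the payoff function, $p$ for a finite prefix, $u_i, v_i$ for non-empty finite words) is chosen here for illustration, and the paper's own formulation may differ in detail.

```latex
% Shift-invariance: removing a finite prefix p does not change the payoff.
f(p\,u) = f(u)
  \quad \text{for every finite prefix } p \text{ and every infinite play } u .

% Submixing: interleaving two plays never pays more than the better of the two.
f(u_0 v_0 u_1 v_1 \cdots)
  \;\le\; \max\bigl\{\, f(u_0 u_1 \cdots),\; f(v_0 v_1 \cdots) \,\bigr\}
  \quad \text{for all non-empty finite words } u_0, v_0, u_1, v_1, \ldots
```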
Playing in stochastic environment: from multi-armed bandits to two-player games
Given a zero-sum infinite game, we examine the question of whether players have optimal memoryless deterministic strategies. It turns out that, under some general conditions, the problem for two-player games can be reduced to the same problem for one-player games, which in turn can be reduced to a simpler related problem for multi-armed bandits.
Perfect Information Stochastic Priority Games
We introduce stochastic priority games - a new class of perfect information stochastic games. These games can take two different, but equivalent, forms. In stopping priority games, a play can be stopped by the environment after a finite number of stages; however, infinite plays are also possible. In discounted priority games, only infinite plays are possible and the payoff is a linear combination of the classical discount payoff and of a limit payoff evaluating the performance at infinity. Shapley games and parity games are special extreme cases of priority games.
Continuous positional payoffs
What payoffs are positionally determined for deterministic two-player antagonistic games on finite directed graphs? In this paper we study this question for payoffs that are continuous. The main reason why continuous positionally determined payoffs are interesting is that they include the multi-discounted payoffs.
We show that for continuous payoffs positional determinacy is equivalent to a simple property called prefix-monotonicity. We provide three proofs of this, using three major techniques for establishing positional determinacy: the inductive technique, the fixed point technique and the strategy improvement technique. A combination of these approaches provides us with a better understanding of the structure of continuous positionally determined payoffs, as well as with some algorithmic results.
Markov Decision Processes with Multiple Long-run Average Objectives
We study Markov decision processes (MDPs) with multiple limit-average (or
mean-payoff) functions. We consider two different objectives, namely,
expectation and satisfaction objectives. Given an MDP with k limit-average
functions, in the expectation objective the goal is to maximize the expected
limit-average value, and in the satisfaction objective the goal is to maximize
the probability of runs such that the limit-average value stays above a given
vector. We show that under the expectation objective, in contrast to the case
of one limit-average function, both randomization and memory are necessary for
strategies even for epsilon-approximation, and that finite-memory randomized
strategies are sufficient for achieving Pareto optimal values. Under the
satisfaction objective, in contrast to the case of one limit-average function,
infinite memory is necessary for strategies achieving a specific value (i.e.
randomized finite-memory strategies are not sufficient), whereas memoryless
randomized strategies are sufficient for epsilon-approximation, for all
epsilon>0. We further prove that the decision problems for both expectation and
satisfaction objectives can be solved in polynomial time and the trade-off
curve (Pareto curve) can be epsilon-approximated in time polynomial in the size
of the MDP and 1/epsilon, and exponential in the number of limit-average
functions, for all epsilon>0. Our analysis also reveals flaws in previous work
for MDPs with multiple mean-payoff functions under the expectation objective,
corrects the flaws, and allows us to obtain improved results.