One-Counter Stochastic Games
We study the computational complexity of basic decision problems for
one-counter simple stochastic games (OC-SSGs), under various objectives.
OC-SSGs are 2-player turn-based stochastic games played on the transition graph
of classic one-counter automata. We study primarily the termination objective,
where the goal of one player is to maximize the probability of reaching counter
value 0, while the other player wishes to avoid this. Partly motivated by the
goal of understanding termination objectives, we also study certain "limit" and
"long run average" reward objectives that are closely related to some
well-studied objectives for stochastic games with rewards. Examples of problems
we address include: does player 1 have a strategy to ensure that the counter
eventually hits 0, i.e., terminates, almost surely, regardless of what player 2
does? Or that the liminf (or limsup) counter value equals infinity with a
desired probability? Or that the long run average reward is >0 with desired
probability? We show that the qualitative termination problem for OC-SSGs is in
NP intersection coNP, and is in P-time for 1-player OC-SSGs, or equivalently
for one-counter Markov Decision Processes (OC-MDPs). Moreover, we show that
quantitative limit problems for OC-SSGs are in NP intersection coNP, and are in
P-time for 1-player OC-MDPs. Both qualitative limit problems and qualitative
termination problems for OC-SSGs are already at least as hard as Condon's
quantitative decision problem for finite-state SSGs.
Comment: 20 pages, 1 figure. This is a full version of a paper accepted for publication in proceedings of FSTTCS 201
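The termination objective above can be made concrete for the 1-player special case. The following is a minimal illustrative sketch, not taken from the paper: a toy one-counter Markov chain with a single control state, where the counter moves up with probability p and down with probability 1 - p, and termination means reaching counter value 0. We approximate the termination probability by value iteration on a truncated counter (the cap and iteration count are arbitrary choices for the sketch).

```python
def termination_probability(p, cap=200, iters=4000):
    """Approximate Pr(counter reaches 0) in a toy one-counter chain.

    Each step: counter +1 with probability p, -1 with probability 1-p.
    The counter is truncated at `cap` to obtain a finite approximation;
    v[c] converges to the probability of hitting 0 from counter value c
    (before hitting the cap).
    """
    v = [0.0] * (cap + 1)
    v[0] = 1.0  # counter value 0: already terminated
    for _ in range(iters):
        new = v[:]
        for c in range(1, cap):
            new[c] = p * v[c + 1] + (1 - p) * v[c - 1]
        v = new
    return v

probs = termination_probability(p=0.4)
print(round(probs[1], 3))  # close to 1.0: downward drift forces termination
```

For p < 1/2 the counter drifts downward, so termination is almost sure, matching the qualitative-termination intuition; for p > 1/2 the computed probability drops below 1.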
Optimal Strategies in Infinite-state Stochastic Reachability Games
We consider perfect-information reachability stochastic games for 2 players
on infinite graphs. We identify a subclass of such games, and prove two
interesting properties of it: first, Player Max always has optimal strategies
in games from this subclass, and second, these games are strongly determined.
The subclass is defined by the property that the set of all values can only
have one accumulation point -- 0. Our results nicely mirror recent results for
finitely-branching games, where, on the contrary, Player Min always has optimal
strategies. However, our proof methods are substantially different, because the
roles of the players are not symmetric. We also do not restrict the branching
of the games. Finally, we apply our results in the context of recently studied
one-counter stochastic games.
Minimizing Running Costs in Consumption Systems
A standard approach to optimizing long-run running costs of discrete systems
is based on minimizing the mean-payoff, i.e., the long-run average amount of
resources ("energy") consumed per transition. However, this approach inherently
assumes that the energy source has an unbounded capacity, which is not always
realistic. For example, an autonomous robotic device has a battery of finite
capacity that has to be recharged periodically, and the total amount of energy
consumed between two successive charging cycles is bounded by the capacity.
Hence, a controller minimizing the mean-payoff must obey this restriction. In
this paper we study the controller synthesis problem for consumption systems
with a finite battery capacity, where the task of the controller is to minimize
the mean-payoff while preserving the functionality of the system encoded by a
given linear-time property. We show that an optimal controller always exists,
and it may either need only finite memory or require infinite memory (it is
decidable in polynomial time which of the two cases holds). Further, we show
how to compute an effective description of an optimal controller in polynomial
time. Finally, we consider the limit values achievable by larger and larger
battery capacity, show that these values are computable in polynomial time, and
we also analyze the corresponding rate of convergence. To the best of our
knowledge, these are the first results about optimizing the long-run running
costs in systems with bounded energy stores.
Comment: 32 pages, corrections of typos and minor omission
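The battery constraint described above can be illustrated with a tiny sketch (the model and names here are ours, not the paper's formalism): a consumption system is traversed along edges that consume energy, some of which reload the battery to full capacity. A periodic controller is feasible only if the battery level never drops below zero along its cycle; its mean-payoff is the average consumption per transition.

```python
def mean_payoff_of_cycle(cycle, capacity):
    """Evaluate one period of a candidate controller.

    cycle: list of (cost, is_reload) edges traversed repeatedly.
    Returns the mean-payoff (average cost per transition), or None if
    the battery is depleted mid-cycle, i.e. the cycle is infeasible.
    """
    battery = capacity
    total = 0
    for cost, is_reload in cycle:
        battery -= cost
        if battery < 0:
            return None  # infeasible: battery would go negative
        if is_reload:
            battery = capacity  # reload refills to full capacity
        total += cost
    return total / len(cycle)

# A cycle consuming 3 then 2 units, then reloading: feasible with capacity 5.
print(mean_payoff_of_cycle([(3, False), (2, False), (0, True)], capacity=5))
```

With capacity 5 the cycle is feasible and achieves mean-payoff 5/3; shrinking the capacity to 4 makes the same cycle infeasible, which is the kind of restriction a mean-payoff-minimizing controller must respect.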
Two-Player Perfect-Information Shift-Invariant Submixing Stochastic Games Are Half-Positional
We consider zero-sum stochastic games with perfect information and finitely
many states and actions. The payoff is computed by a payoff function which
associates to each infinite sequence of states and actions a real number. We
prove that if the payoff function is both shift-invariant and submixing,
then the game is half-positional, i.e. the first player has an optimal strategy
which is both deterministic and stationary. This result relies on the existence
of ε-subgame-perfect equilibria in shift-invariant games, a second
contribution of the paper.
Undecidability of Two-dimensional Robot Games
A robot game is a two-player vector addition game played on the integer lattice
Z^2. Both players have sets of vectors, and in each turn the vector
chosen by a player is added to the current configuration vector of the game.
One of the players, called Eve, tries to play the game from the initial
configuration to the origin while the other player, Adam, tries to avoid the
origin. The problem is to decide whether or not Eve has a winning strategy. In
this paper we prove undecidability of the robot game in dimension two answering
the question formulated by Doyen and Rabinovich in 2011 and closing the gap
between undecidable and decidable cases.
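The game itself is easy to state operationally. Since the general two-dimensional problem is undecidable, any concrete procedure can only search a bounded horizon; the sketch below (ours, with toy vector sets, assuming Adam moves first in each round) checks whether Eve can force the configuration to the origin within a fixed number of rounds.

```python
EVE = [(-2, -1), (-1, -2)]   # Eve's vectors (toy example data)
ADAM = [(1, 0), (0, 1)]      # Adam's vectors (toy example data)

def eve_wins(config, rounds):
    """Can Eve force (0, 0) within `rounds` full (Adam, Eve) rounds?

    Eve must have a winning answer to EVERY Adam move (hence all/any).
    """
    if config == (0, 0):
        return True
    if rounds == 0:
        return False
    x, y = config
    return all(
        any(eve_wins((x + ax + ex, y + ay + ey), rounds - 1)
            for ex, ey in EVE)
        for ax, ay in ADAM
    )

print(eve_wins((1, 1), rounds=1))  # True: Eve can answer either Adam move
print(eve_wins((5, 5), rounds=1))  # False: no single round reaches the origin
```

Undecidability means no bound on `rounds` suffices in general: a bounded search can certify that Eve wins, but never that she loses.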
Solving Stochastic B\"uchi Games on Infinite Arenas with a Finite Attractor
We consider games played on an infinite probabilistic arena where the first
player aims at satisfying generalized B\"uchi objectives almost surely, i.e.,
with probability one. We provide a fixpoint characterization of the winning
sets and associated winning strategies in the case where the arena satisfies
the finite-attractor property. From this we directly deduce the decidability of
these games on probabilistic lossy channel systems.
Comment: In Proceedings QAPL 2013, arXiv:1306.241
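The fixpoint flavor of such characterizations can be seen already in the much simpler finite-state, non-stochastic, one-player setting (this analogue is ours; the paper treats infinite probabilistic arenas). A state satisfies a Büchi objective iff it can reach a Büchi state from which the Büchi set can be re-entered, which the following nested fixpoint computes.

```python
def buchi_winning(states, edges, buchi):
    """States of a finite graph from which some path visits `buchi`
    infinitely often (one-player, non-stochastic analogue)."""
    def can_reach(targets):
        # Backward reachability: states with a path into `targets`.
        reach = set(targets)
        changed = True
        while changed:
            changed = False
            for u, v in edges:
                if v in reach and u not in reach:
                    reach.add(u)
                    changed = True
        return reach

    good = set(buchi)
    while True:
        reach_good = can_reach(good)
        # Keep only Büchi states with an edge back into the reach set,
        # i.e. states that can be revisited; iterate to a fixpoint.
        new_good = {u for u in good
                    if any(a == u and b in reach_good for a, b in edges)}
        if new_good == good:
            break
        good = new_good
    return can_reach(good)

states = {0, 1, 2}
edges = [(0, 1), (1, 2), (2, 1)]
print(sorted(buchi_winning(states, edges, buchi={1})))  # [0, 1, 2]
```

The finite-attractor property plays an analogous role in the infinite probabilistic setting: it guarantees the fixpoint computations over the winning sets stabilize.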
Markov Decision Processes with Multiple Long-run Average Objectives
We study Markov decision processes (MDPs) with multiple limit-average (or
mean-payoff) functions. We consider two different objectives, namely,
expectation and satisfaction objectives. Given an MDP with k limit-average
functions, in the expectation objective the goal is to maximize the expected
limit-average value, and in the satisfaction objective the goal is to maximize
the probability of runs such that the limit-average value stays above a given
vector. We show that under the expectation objective, in contrast to the case
of one limit-average function, both randomization and memory are necessary for
strategies even for epsilon-approximation, and that finite-memory randomized
strategies are sufficient for achieving Pareto optimal values. Under the
satisfaction objective, in contrast to the case of one limit-average function,
infinite memory is necessary for strategies achieving a specific value (i.e.
randomized finite-memory strategies are not sufficient), whereas memoryless
randomized strategies are sufficient for epsilon-approximation, for all
epsilon>0. We further prove that the decision problems for both expectation and
satisfaction objectives can be solved in polynomial time and the trade-off
curve (Pareto curve) can be epsilon-approximated in time polynomial in the size
of the MDP and 1/epsilon, and exponential in the number of limit-average
functions, for all epsilon>0. Our analysis also reveals flaws in previous work
for MDPs with multiple mean-payoff functions under the expectation objective,
corrects the flaws, and allows us to obtain improved results.