How do we remember the past in randomised strategies?
Graph games of infinite length are a natural model for open reactive
processes: one player represents the controller, trying to ensure a given
specification, and the other represents a hostile environment. The evolution of
the system depends on the decisions of both players, supplemented by chance.
In this work, we focus on the notion of randomised strategy. More
specifically, we show that three natural definitions may lead to very different
results: in the most general cases, an almost-surely winning situation may
become almost-surely losing if the player is only allowed to use a weaker
notion of strategy. In more reasonable settings, translations exist, but they
require infinite memory, even in simple cases. Finally, some traditional
problems become undecidable for the strongest type of strategies.
Qualitative Analysis of Partially-observable Markov Decision Processes
We study observation-based strategies for partially-observable Markov
decision processes (POMDPs) with omega-regular objectives. An observation-based
strategy relies on partial information about the history of a play, namely, on
the past sequence of observations. We consider the qualitative analysis
problem: given a POMDP with an omega-regular objective, decide whether there is
an observation-based strategy to achieve the objective with probability 1
(almost-sure winning), or with positive probability (positive winning). Our
main results are twofold. First, we present a complete picture of the
computational complexity of the qualitative analysis of POMDPs with parity
objectives (a canonical form to express omega-regular objectives) and their
subclasses. Our contribution consists in establishing several upper and lower
bounds that were not known in the literature. Second, we present optimal bounds
(matching upper and lower bounds) on the memory required by pure and randomized
observation-based strategies for the qualitative analysis of POMDPs with
parity objectives and their subclasses.
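Qualitative analysis of POMDPs typically works on belief supports rather than full belief distributions: for almost-sure winning, only the set of states the play may currently be in matters, not their exact probabilities. A minimal sketch of the belief-support update that drives the resulting subset construction; the toy POMDP and all names in it are illustrative only:

```python
# Belief-support update for a POMDP: given the set of states the play
# may currently be in, an action, and the observation received, compute
# the set of states the play may be in next.  Only these supports
# matter for qualitative (almost-sure / positive) analysis.

def belief_update(belief, action, observation, succ, obs):
    """succ[s][a] = set of states reachable from s under a (the support
    of the transition distribution); obs[s] = observation emitted in s."""
    return frozenset(
        t
        for s in belief
        for t in succ[s][action]
        if obs[t] == observation
    )

# Toy POMDP (illustrative): two states sharing one observation, so an
# observation-based strategy cannot tell them apart.
succ = {
    "s0": {"a": {"s0", "s1"}},
    "s1": {"a": {"s1"}},
}
obs = {"s0": "o", "s1": "o"}

b0 = frozenset({"s0"})
b1 = belief_update(b0, "a", "o", succ, obs)
print(sorted(b1))  # the play may now be in s0 or s1
```

Iterating this update over all actions and observations yields the finite belief-support MDP on which almost-sure winning is decided.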
Probabilistic modal μ-calculus with independent product
The probabilistic modal μ-calculus is a fixed-point logic designed for
expressing properties of probabilistic labeled transition systems (PLTSs). Two
equivalent semantics have been studied for this logic, both assigning to each
state a value in the interval [0,1] representing the probability that the
property expressed by the formula holds at the state. One semantics is
denotational and the other is a game semantics, specified in terms of
two-player stochastic parity games. A shortcoming of the probabilistic modal
μ-calculus is the lack of expressiveness required to encode other important
temporal logics for PLTSs, such as Probabilistic Computation Tree Logic (PCTL).
To address this limitation we extend the logic with a new pair of operators:
independent product and coproduct. The resulting logic, called the probabilistic
modal μ-calculus with independent product, can encode many properties of
interest and subsumes the qualitative fragment of PCTL. The main contribution
of this paper is the definition of an appropriate game semantics for this
extended probabilistic μ-calculus. This relies on the definition of a new
class of games which generalize standard two-player stochastic (parity) games
by allowing a play to be split into concurrent subplays, each continuing its
evolution independently. Our main technical result is the equivalence of the
two semantics. The proof is carried out in ZFC set theory extended with
Martin's Axiom at an uncountable cardinal.
Recursive Concurrent Stochastic Games
We study Recursive Concurrent Stochastic Games (RCSGs), extending our recent
analysis of recursive simple stochastic games to a concurrent setting where the
two players choose moves simultaneously and independently at each state. For
multi-exit games, our earlier work already showed undecidability for basic
questions like termination; thus, we focus on the important case of single-exit
RCSGs (1-RCSGs).
We first characterize the value of a 1-RCSG termination game as the least
fixed point solution of a system of nonlinear minimax functional equations, and
use it to show PSPACE decidability for the quantitative termination problem. We
then give a strategy improvement technique, which we use to show that player 1
(maximizer) has ε-optimal randomized Stackless & Memoryless (r-SM)
strategies for all ε > 0, while player 2 (minimizer) has optimal r-SM
strategies. Thus, such games are r-SM-determined. These results mirror and
generalize in a strong sense the randomized memoryless determinacy results for
finite stochastic games, and extend the classic Hoffman-Karp strategy
improvement approach from the finite-state to the infinite-state setting. The
proofs in our infinite-state setting are, however, very different, relying on subtle
analytic properties of certain power series that arise from studying 1-RCSGs.
We show that our upper bounds, even for qualitative (probability 1)
termination, cannot be improved, even to NP, without a major breakthrough, by
giving two reductions: first a P-time reduction from the long-standing
square-root sum problem to the quantitative termination decision problem for
finite concurrent stochastic games, and then a P-time reduction from the latter
problem to the qualitative termination problem for 1-RCSGs.
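The least-fixed-point characterization suggests a natural value-iteration scheme: start from 0 and repeatedly apply the minimax operator, solving a zero-sum matrix game at each node. A minimal sketch for a hypothetical single-node 1-RCSG (the game and its numbers are illustrative, not taken from the paper), using the standard LP formulation of matrix-game values:

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    """Value of the zero-sum matrix game A (row player maximizes), via
    the standard LP: maximize v s.t. p^T A[:, j] >= v for every column
    j, with p a probability distribution over rows."""
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                                 # minimize -v == maximize v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])    # v - p^T A[:, j] <= 0
    b_ub = np.zeros(n)
    A_eq = np.array([[1.0] * m + [0.0]])         # sum_i p_i = 1
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * m + [(None, None)]   # v is free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1]

# Hypothetical single-node 1-RCSG: on move pair (i, j) the play
# terminates with probability TERM[i, j], fails forever with
# probability FAIL[i, j], and otherwise re-enters the node, so the
# termination value x satisfies x = val(TERM + REC * x).
TERM = np.array([[0.4, 0.1],
                 [0.0, 0.5]])
FAIL = np.array([[0.2, 0.0],
                 [0.5, 0.1]])
REC = 1.0 - TERM - FAIL

x = 0.0
for _ in range(200):                  # Kleene iteration from below
    x = matrix_game_value(TERM + REC * x)

print(f"termination value ~ {x:.4f}")
```

Starting from 0 and iterating converges to the least fixed point, which is the value of the termination game; the paper's decidability result of course rests on much finer analysis than this naive iteration.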
Games on graphs with a public signal monitoring
We study pure Nash equilibria in games on graphs with imperfect monitoring
based on a public signal. In such games, deviations, and the players
responsible for them, can be hard to detect and track. We propose a generic
epistemic game abstraction, which conveniently allows one to represent the
players' knowledge about these deviations, and we give a characterization of
Nash equilibria in terms of winning strategies in the abstraction. We then use
the abstraction to develop algorithms for some payoff functions.
Reachability analysis of branching probabilistic processes
We study a fundamental class of infinite-state stochastic processes and stochastic
games, namely Branching Processes, under the properties of (single-target) reachability
and multi-objective reachability.
In particular, we study Branching Concurrent Stochastic Games (BCSGs), which
are an imperfect-information game extension to the classical Branching Processes, and
show that these games are determined, i.e., have a value, under the fundamental objective
of reachability, building on and generalizing prior work on Branching Simple
Stochastic Games and finite-state Concurrent Stochastic Games. We show that,
unlike in the turn-based branching games, in the concurrent setting the
almost-sure and limit-sure reachability problems do not coincide, and we give
polynomial-time algorithms
for deciding both almost-sure and limit-sure reachability. We also provide a discussion
on the complexity of quantitative reachability questions for BCSGs.
Furthermore, we introduce a new model, namely Ordered Branching Processes
(OBPs), which is a hybrid model between classical Branching Processes and Stochastic
Context-Free Grammars. Under the reachability objective, this model is equivalent
to the classical Branching Processes. We study qualitative multi-objective reachability
questions for Ordered Branching Markov Decision Processes (OBMDPs), or equivalently
context-free MDPs with simultaneous derivation. We provide algorithmic results
for efficiently checking certain Boolean combinations of qualitative reachability
and non-reachability queries with respect to different given target non-terminals.
Among the more interesting multi-objective reachability results, we provide two
separate algorithms for almost-sure and limit-sure multi-target reachability for OBMDPs.
Specifically, given an OBMDP, given a starting non-terminal, and given a set
of target non-terminals, our first algorithm decides whether the supremum probability,
of generating a tree that contains every target non-terminal in the set, is 1. Our second
algorithm decides whether there is a strategy for the player to almost-surely (with
probability 1) generate a tree that contains every target non-terminal in the set. The
two separate algorithms are needed: we show that indeed, in this context, almost-sure
and limit-sure multi-target reachability do not coincide. Both algorithms run in time
polynomial in the size of the OBMDP and exponential in the number of targets. Hence,
they run in polynomial time when the number of targets is fixed. The algorithms are
fixed-parameter tractable with respect to this number. Moreover, we show that the
qualitative almost-sure (and limit-sure) multi-target reachability decision
problem is in general NP-hard when the size of the set of target non-terminals
is not fixed.
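For a purely stochastic branching process, single-target reachability reduces to a least fixed point of polynomial equations: the probability r(A) that a tree rooted at A ever contains the target T satisfies r(T) = 1 and r(A) = 1 - Σ_rules p · Π_{B in rule} (1 - r(B)), since the target is missed only if every child subtree misses it. A minimal sketch with a toy grammar (illustrative, not from the paper):

```python
from math import prod

# Least-fixed-point (Kleene) iteration for single-target reachability
# in a purely stochastic branching process.  rules[A] is a list of
# (probability, children) pairs; the target is reached iff some child
# subtree reaches it, so the "miss" probabilities multiply.

def reach_probs(rules, target, iterations=200):
    r = {a: 0.0 for a in rules}
    for _ in range(iterations):
        r = {
            a: 1.0 if a == target else
               1.0 - sum(p * prod(1.0 - r[b] for b in kids)
                         for p, kids in rules[a])
            for a in rules
        }
    return r

# Toy process (illustrative): A -> A A (1/2) | T (1/4) | empty (1/4).
# Writing q = 1 - r(A), the miss probability solves q = q^2/2 + 1/4,
# whose least solution is q = 1 - sqrt(1/2); hence r(A) = sqrt(1/2).
rules = {
    "A": [(0.5, ["A", "A"]), (0.25, ["T"]), (0.25, [])],
    "T": [],
}
r = reach_probs(rules, "T")
print(round(r["A"], 6))  # -> 0.707107
```

The controlled and concurrent models studied in the paper replace this pure iteration with max/min (respectively minimax) choices over rules, which is where the algorithmic difficulty arises.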
Emptiness Of Alternating Tree Automata Using Games With Imperfect Information
We consider the emptiness problem for alternating tree automata,
with two acceptance semantics: classical (all branches are accepted)
and qualitative (almost all branches are accepted). For the classical semantics, the usual technique to tackle this problem relies on a Simulation Theorem which constructs an equivalent non-deterministic automaton from the original alternating one, and then checks emptiness by a reduction to a two-player perfect information game.
However, for the qualitative semantics, no simulation of alternation by means of non-determinism is known.
We give an alternative technique to decide the emptiness problem of alternating tree automata, that does not rely on a Simulation Theorem.
Indeed, we directly reduce the emptiness problem to solving an
imperfect-information two-player parity game. Our new approach can successfully
be applied to both semantics and yields decidability results with optimal
complexity; for the qualitative semantics, the key ingredient in the proof is a
positionality result for stochastic games played over infinite graphs.
Distributed stochastic optimization via matrix exponential learning
In this paper, we investigate a distributed learning scheme for a broad class
of stochastic optimization problems and games that arise in signal processing
and wireless communications. The proposed algorithm relies on the method of
matrix exponential learning (MXL) and only requires locally computable gradient
observations that are possibly imperfect and/or obsolete. To analyze it, we
introduce the notion of a stable Nash equilibrium and we show that the
algorithm is globally convergent to such equilibria - or locally convergent
when an equilibrium is only locally stable. We also derive an explicit linear
bound for the algorithm's convergence speed, which remains valid under
measurement errors and uncertainty of arbitrarily high variance. To validate
our theoretical analysis, we test the algorithm in realistic
multi-carrier/multiple-antenna wireless scenarios where several users seek to
maximize their energy efficiency. Our results show that learning allows users
to attain a net increase between 100% and 500% in energy efficiency, even under
very high uncertainty.
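The MXL update itself is compact: aggregate (possibly noisy) gradient observations in a dual matrix variable and map it back to the feasible set through a trace-normalized matrix exponential. A minimal sketch for the simplest instance, maximizing tr(GX) over density matrices X; the objective, matrix, and step size are illustrative, not the paper's energy-efficiency problem:

```python
import numpy as np

def expm_sym(Y):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(Y)
    return (V * np.exp(w)) @ V.T

def mxl(gradient, dim, steps=100, eta=0.5):
    """Matrix exponential learning over the spectrahedron
    {X >= 0, tr X = 1}: accumulate gradient observations in a dual
    variable Y and play X = exp(Y) / tr(exp(Y))."""
    Y = np.zeros((dim, dim))
    X = np.eye(dim) / dim
    for _ in range(steps):
        Y += eta * gradient(X)       # dual accumulation
        E = expm_sym(Y)
        X = E / np.trace(E)          # map back to the feasible set
    return X

# Linear toy objective f(X) = tr(G X); its gradient is the constant G,
# so MXL should concentrate X on G's top eigenvector.
G = np.array([[2.0, 1.0],
              [1.0, 3.0]])
X = mxl(lambda X: G, dim=2)
print(np.trace(G @ X))  # approaches the largest eigenvalue of G
```

The exponential map keeps every iterate automatically positive semidefinite with unit trace, which is what makes the scheme robust to the imperfect and obsolete gradient observations considered in the paper.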
How Good Is a Strategy in a Game with Nature?
We consider games with two antagonistic players, Éloïse (modelling a program) and Abélard (modelling a byzantine environment), and a third, unpredictable and uncontrollable player that we call Nature. Motivated by the fact that the usual probabilistic semantics very quickly leads to undecidability when considering either infinite game graphs or imperfect information, we propose two alternative semantics that lead to decidability where the probabilistic one fails: one based on counting and one based on topology.