Magnifying Lens Abstraction for Stochastic Games with Discounted and Long-run Average Objectives
Turn-based stochastic games and their important subclass, Markov decision
processes (MDPs) provide models for systems with both probabilistic and
nondeterministic behaviors. We consider turn-based stochastic games with two
classical quantitative objectives: discounted-sum and long-run average
objectives. The game models and the quantitative objectives are widely used in
probabilistic verification, planning, optimal inventory control, network
protocol and performance analysis. Games and MDPs that model realistic systems
often have very large state spaces, and probabilistic abstraction techniques
are necessary to handle the state-space explosion. The commonly used
full-abstraction techniques do not yield space savings for systems that have
many states with similar value but not necessarily similar
transition structure. A semi-abstraction technique, magnifying-lens
abstraction (MLA), which clusters states based on value only, disregarding
differences in their transition relation, was proposed for qualitative
objectives (reachability and safety objectives). In this paper we extend the
MLA technique to solve stochastic games with discounted-sum and long-run
average objectives. We present an MLA-based abstraction-refinement
algorithm for stochastic games and MDPs with discounted-sum objectives. For
long-run average objectives, our solution works for all MDPs and a sub-class of
stochastic games where every state has the same value.
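To make the discounted-sum objective concrete, the following is a minimal sketch of plain value iteration on a small turn-based stochastic game. It is not the MLA algorithm from the abstract, only the standard fixed-point computation that such an abstraction would aim to speed up. The game encoding, the function name `discounted_values`, the example states, and the convention that the value is the sum of discount^i times the i-th reward are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (not from the paper): each state maps to
# (owner, reward, actions); an action is a list of (probability, successor).
Action = List[Tuple[float, str]]
Game = Dict[str, Tuple[str, float, List[Action]]]


def discounted_values(game: Game, discount: float, tol: float = 1e-9) -> Dict[str, float]:
    """Iterate the Bellman operator until successive values differ by less than tol.

    Assumes 0 <= discount < 1, so the operator is a contraction and the loop ends.
    """
    values = {state: 0.0 for state in game}
    while True:
        new_values = {}
        for state, (owner, reward, actions) in game.items():
            # Expected value of each action under the current estimate.
            outcomes = [sum(p * values[t] for p, t in action) for action in actions]
            # The "max" player maximizes, the "min" player minimizes.
            best = max(outcomes) if owner == "max" else min(outcomes)
            new_values[state] = reward + discount * best
        if max(abs(new_values[s] - values[s]) for s in game) < tol:
            return new_values
        values = new_values


# Toy game: in "a" the maximizer either stays, or moves on with probability 1/2.
example_game: Game = {
    "a": ("max", 1.0, [[(1.0, "a")], [(0.5, "a"), (0.5, "b")]]),
    "b": ("min", 0.0, [[(1.0, "b")]]),
}
print(discounted_values(example_game, discount=0.5))
```

In this toy game the maximizer does best by staying in "a", so the value of "a" solves v = 1 + 0.5 v and the computed value is approximately 2, while "b" has value 0.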
Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey
Wireless sensor networks (WSNs) consist of autonomous and resource-limited
devices. The devices cooperate to monitor one or more physical phenomena within
an area of interest. WSNs operate as stochastic systems because of randomness
in the monitored environments. For long service time and low maintenance cost,
WSNs require adaptive and robust methods to address data exchange, topology
formulation, resource and power optimization, sensing coverage and object
detection, and security challenges. In these problems, sensor nodes are to make
optimized decisions from a set of accessible strategies to achieve design
goals. This survey reviews numerous applications of the Markov decision process
(MDP) framework, a powerful decision-making tool to develop adaptive algorithms
and protocols for WSNs. Furthermore, various solution methods are discussed and
compared to serve as a guide for using MDPs in WSNs.
Non-Zero Sum Games for Reactive Synthesis
In this invited contribution, we summarize new solution concepts useful for
the synthesis of reactive systems that we have introduced in several recent
publications. These solution concepts are developed in the context of non-zero
sum games played on graphs. They are part of the contributions obtained in the
inVEST project funded by the European Research Council.
Comment: LATA'16 invited paper
Expectations or Guarantees? I Want It All! A crossroad between games and MDPs
When reasoning about the strategic capabilities of an agent, it is important
to consider the nature of its adversaries. In the particular context of
controller synthesis for quantitative specifications, the usual problem is to
devise a strategy for a reactive system which yields some desired performance,
taking into account the possible impact of the environment of the system. There
are at least two ways to look at this environment. In the classical analysis of
two-player quantitative games, the environment is purely antagonistic and the
problem is to provide strict performance guarantees. In Markov decision
processes, the environment is seen as purely stochastic: the aim is then to
optimize the expected payoff, with no guarantee on individual outcomes.
In this expository work, we report on recent results introducing the beyond
worst-case synthesis problem, which is to construct strategies that guarantee
some quantitative requirement in the worst case while providing a higher
expected value against a particular stochastic model of the environment given
as input. This problem is relevant for producing system controllers that provide
good expected performance in the everyday situation while ensuring a strict
(but relaxed) performance threshold even in the event of very bad (though
unlikely) circumstances. It has been studied for both the mean-payoff and the
shortest path quantitative measures.
Comment: In Proceedings SR 2014, arXiv:1404.041
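As a rough illustration of the two requirements in the beyond worst-case problem for the shortest path measure, the sketch below evaluates one already fixed controller strategy: it computes the worst-case cost against an antagonistic environment and the expected cost under a stochastic model of the same environment. It only checks a given strategy and does not synthesize one. The model encoding, the names `worst_and_expected` and `satisfies_bwc`, the commuting example, and the choice of a strict worst-case bound with a non-strict expectation bound are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical model: with the controller's strategy fixed, each state lists the
# environment's possible moves as (probability, cost, successor).
Moves = Dict[str, List[Tuple[float, float, str]]]


def worst_and_expected(moves: Moves, state: str, target: str) -> Tuple[float, float]:
    """Return (worst-case cost, expected cost) to reach target.

    Assumes the graph is acyclic and every path reaches the target.
    """
    if state == target:
        return 0.0, 0.0
    worst, expected = float("-inf"), 0.0
    for prob, cost, successor in moves[state]:
        w, e = worst_and_expected(moves, successor, target)
        worst = max(worst, cost + w)      # antagonistic view: take the worst move
        expected += prob * (cost + e)     # stochastic view: average over the moves
    return worst, expected


def satisfies_bwc(moves: Moves, start: str, target: str,
                  worst_bound: float, expected_bound: float) -> bool:
    """Check a strict worst-case bound and an expectation bound together."""
    worst, expected = worst_and_expected(moves, start, target)
    return worst < worst_bound and expected <= expected_bound


# Toy commute: usually 30 minutes, occasionally 70 because of a delay.
commute: Moves = {"home": [(0.9, 30.0, "work"), (0.1, 70.0, "work")]}
print(worst_and_expected(commute, "home", "work"))
print(satisfies_bwc(commute, "home", "work", worst_bound=80.0, expected_bound=40.0))
```

Here the worst case is 70 and the expectation is about 34, so the strategy meets a worst-case bound of 80 and an expected bound of 40, but would fail a worst-case bound of 60.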
Qualitative Analysis of Partially-observable Markov Decision Processes
We study observation-based strategies for partially-observable Markov
decision processes (POMDPs) with omega-regular objectives. An observation-based
strategy relies on partial information about the history of a play, namely, on
the past sequence of observations. We consider the qualitative analysis
problem: given a POMDP with an omega-regular objective, decide whether there is an
observation-based strategy to achieve the objective with probability 1
(almost-sure winning), or with positive probability (positive winning). Our
main results are twofold. First, we present a complete picture of the
computational complexity of the qualitative analysis of POMDPs with parity
objectives (a canonical form to express omega-regular objectives) and its
subclasses. Our contribution consists in establishing several upper and lower
bounds that were not known in the literature. Second, we present optimal bounds
(matching upper and lower bounds) on the memory required by pure and randomized
observation-based strategies for the qualitative analysis of POMDPs with
parity objectives and its subclasses.
One-Counter Stochastic Games
We study the computational complexity of basic decision problems for
one-counter simple stochastic games (OC-SSGs), under various objectives.
OC-SSGs are 2-player turn-based stochastic games played on the transition graph
of classic one-counter automata. We study primarily the termination objective,
where the goal of one player is to maximize the probability of reaching counter
value 0, while the other player wishes to avoid this. Partly motivated by the
goal of understanding termination objectives, we also study certain "limit" and
"long run average" reward objectives that are closely related to some
well-studied objectives for stochastic games with rewards. Examples of problems
we address include: does player 1 have a strategy to ensure that the counter
eventually hits 0, i.e., terminates, almost surely, regardless of what player 2
does? Or that the liminf (or limsup) counter value equals infinity with a
desired probability? Or that the long run average reward is >0 with desired
probability? We show that the qualitative termination problem for OC-SSGs is in
NP intersection coNP, and is in P-time for 1-player OC-SSGs, or equivalently
for one-counter Markov Decision Processes (OC-MDPs). Moreover, we show that
quantitative limit problems for OC-SSGs are in NP intersection coNP, and are in
P-time for 1-player OC-MDPs. Both qualitative limit problems and qualitative
termination problems for OC-SSGs are already at least as hard as Condon's
quantitative decision problem for finite-state SSGs.
Comment: 20 pages, 1 figure. This is a full version of a paper accepted for
publication in proceedings of FSTTCS 201
Minimizing Expected Cost Under Hard Boolean Constraints, with Applications to Quantitative Synthesis
In Boolean synthesis, we are given an LTL specification, and the goal is to
construct a transducer that realizes it against an adversarial environment.
Often, a specification contains both Boolean requirements that should be
satisfied against an adversarial environment, and multi-valued components that
refer to the quality of the satisfaction and whose expected cost we would like
to minimize with respect to a probabilistic environment.
In this work we study, for the first time, mean-payoff games in which the
system aims at minimizing the expected cost against a probabilistic
environment, while surely satisfying an omega-regular condition against an
adversarial environment. We consider the case where the omega-regular condition is
given as a parity objective or by an LTL formula. We show that in general,
optimal strategies need not exist, and moreover, the limit value cannot be
approximated by finite-memory strategies. We thus focus on computing the
limit value, and give tight complexity bounds for synthesizing
epsilon-optimal strategies for both finite-memory and infinite-memory
strategies.
We show that our game naturally arises in various contexts of synthesis with
Boolean and multi-valued objectives. Beyond direct applications, in synthesis
with costs and rewards to certain behaviors, it allows us to compute the
minimal sensing cost of omega-regular specifications -- a measure of quality
in which we look for a transducer that minimizes the expected number of signals
that are read from the input.