Multiple-Environment Markov Decision Processes
We introduce Multiple-Environment Markov Decision Processes (MEMDPs), which are
MDPs with a set of probabilistic transition functions. The goal in a MEMDP is
to synthesize a single controller with guaranteed performance against all
environments even though the environment is unknown a priori. While MEMDPs can
be seen as a special class of partially observable MDPs, we show that several
verification problems that are undecidable for partially observable MDPs are
decidable for MEMDPs and sometimes even have efficient solutions.
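To make "guaranteed performance against all environments" concrete, here is a minimal sketch, not the paper's algorithm. It assumes a memoryless controller and uses reachability probability as an illustrative objective; the names `reach_probability`, `worst_case`, and `ENVS` and the toy numbers are all hypothetical.

```python
def reach_probability(trans, policy, targets, start, iterations=1000):
    """Approximate, by value iteration, the probability of reaching `targets`
    from `start` under a memoryless `policy`.
    `trans` maps (state, action) to a dict {successor: probability}."""
    states = set(policy) | set(targets) | {start}
    for dist in trans.values():
        states |= set(dist)
    value = {s: (1.0 if s in targets else 0.0) for s in states}
    for _ in range(iterations):
        new_value = dict(value)
        for s in states:
            # Targets keep value 1; states without a policy entry act as sinks.
            if s in targets or s not in policy:
                continue
            dist = trans.get((s, policy[s]), {})
            new_value[s] = sum(p * value[t] for t, p in dist.items())
        value = new_value
    return value[start]


def worst_case(environments, policy, targets, start):
    """Guarantee of `policy`: the minimum over all candidate environments."""
    return min(reach_probability(t, policy, targets, start) for t in environments)


# Two hypothetical environments over the same states and actions.
ENVS = [
    {("s0", "a"): {"goal": 0.9, "fail": 0.1}, ("s0", "b"): {"goal": 0.6, "fail": 0.4}},
    {("s0", "a"): {"goal": 0.5, "fail": 0.5}, ("s0", "b"): {"goal": 0.7, "fail": 0.3}},
]

guarantee_a = worst_case(ENVS, {"s0": "a"}, {"goal"}, "s0")
guarantee_b = worst_case(ENVS, {"s0": "b"}, {"goal"}, "s0")
```

In this toy instance, action `a` is better in the first environment but action `b` has the better worst-case guarantee, which is the quantity a single controller for an unknown environment has to optimize.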
Quantitative games with interval objectives
Traditionally, quantitative games such as mean-payoff games and discount sum
games have two players -- one trying to maximize the payoff, the other trying
to minimize it. The associated decision problem, "Can Eve (the maximizer)
achieve, for example, a positive payoff?" can be thought of as one player
trying to attain a payoff in the interval $(0, \infty)$. In this paper we
consider the more general problem of determining if a player can attain a
payoff in a finite union of arbitrary intervals for various payoff functions
(liminf, mean-payoff, discount sum, total sum). In particular, this includes the
interesting exact-value problem, "Can Eve achieve a payoff of exactly (e.g.)
0?"
Comment: Full version of CONCUR submission
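The interval objective can be illustrated on a single play. The sketch below evaluates the discount sum of an ultimately periodic play (a finite prefix followed by a repeated cycle) and tests membership in a finite union of intervals. It does not solve the game, that is, it does not decide whether Eve can enforce such a play. The helper names and the interval encoding `(lo, hi, lo_closed, hi_closed)`, with `None` for an unbounded end, are assumptions made for this example.

```python
from fractions import Fraction


def discounted_sum(prefix, cycle, discount):
    """Exact discount sum of the play prefix . cycle^omega, for 0 < discount < 1."""
    lam = Fraction(discount)
    if not (0 < lam < 1) or not cycle:
        raise ValueError("need 0 < discount < 1 and a non-empty cycle")
    head = sum(lam**i * w for i, w in enumerate(prefix))
    loop = sum(lam**j * w for j, w in enumerate(cycle))
    # Geometric series: the cycle repeats with an extra factor lam^len(cycle).
    return head + lam ** len(prefix) * loop / (1 - lam ** len(cycle))


def in_interval(x, lo, hi, lo_closed, hi_closed):
    above = lo is None or x > lo or (lo_closed and x == lo)
    below = hi is None or x < hi or (hi_closed and x == hi)
    return above and below


def in_union(x, intervals):
    """True if x lies in at least one interval of the finite union."""
    return any(in_interval(x, *iv) for iv in intervals)


POSITIVE = [(0, None, False, False)]  # the classical objective (0, infinity)
EXACTLY_ZERO = [(0, 0, True, True)]   # the exact-value objective [0, 0]

payoff = discounted_sum([], [1, -2], Fraction(1, 2))
```

With discount factor 1/2 the cycle `[1, -2]` contributes 1 + (1/2)(-2) = 0 on every repetition, so this play meets the exact-value objective but not the positive-payoff one.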
On the complexity of heterogeneous multidimensional quantitative games
In this paper, we study two-player zero-sum turn-based games played on a
finite multidimensional weighted graph. In recent papers, all dimensions use the
same measure, whereas here we allow different measures to be combined. Such
heterogeneous multidimensional quantitative games provide a general and natural
model for the study of reactive system synthesis. We focus on classical
measures like the Inf, Sup, LimInf, and LimSup of the weights seen along the
play, as well as on the window mean-payoff (WMP) measure. This new measure is a
natural strengthening of the mean-payoff measure. We allow objectives defined
as Boolean combinations of heterogeneous constraints. While multidimensional
games with Boolean combinations of mean-payoff constraints are undecidable, we
show that the problem becomes EXPTIME-complete for DNF/CNF Boolean combinations
of heterogeneous measures taken among {WMP, Inf, Sup, LimInf, LimSup} and that
exponential memory strategies are sufficient for both players to win. We
provide a detailed study of the complexity and the memory requirements when the
Boolean combination of the measures is replaced by an intersection.
EXPTIME-completeness and exponential memory strategies still hold for the
intersection of measures in {WMP, Inf, Sup, LimInf, LimSup}, and we get
PSPACE-completeness when the WMP measure is no longer considered. To avoid
EXPTIME- or PSPACE-hardness, we impose at most one occurrence of the WMP measure and
fix the number of Sup measures, and we propose several refinements (on the
number of occurrences of the other measures) for which we get polynomial
algorithms and lower memory requirements. For all the considered classes of
games, we also study parameterized complexity.
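As a small illustration of heterogeneous constraints, the sketch below evaluates the classical measures on one ultimately periodic play with multidimensional weights and checks an intersection of constraints that uses a different measure on each dimension. It evaluates a given play only and is not a game-solving algorithm; the WMP measure is omitted, and the function names and the constraint encoding `(measure, dimension, threshold)` are hypothetical.

```python
def measure(name, prefix, cycle, dim):
    """Value of a classical measure on dimension `dim` of the play
    prefix . cycle^omega, where each step is a tuple of integer weights."""
    seen_ever = [w[dim] for w in prefix + cycle]  # every weight along the play
    seen_forever = [w[dim] for w in cycle]        # weights seen infinitely often
    if name == "Inf":
        return min(seen_ever)
    if name == "Sup":
        return max(seen_ever)
    if name == "LimInf":
        return min(seen_forever)
    if name == "LimSup":
        return max(seen_forever)
    raise ValueError(f"unknown measure: {name}")


def satisfies_all(constraints, prefix, cycle):
    """Intersection of constraints, each read as: measure on dim >= threshold."""
    return all(
        measure(name, prefix, cycle, dim) >= threshold
        for name, dim, threshold in constraints
    )


PREFIX = [(5, -3)]
CYCLE = [(1, 0), (2, 4)]

ok = satisfies_all([("LimInf", 0, 1), ("Sup", 1, 4)], PREFIX, CYCLE)
not_ok = satisfies_all([("LimInf", 0, 1), ("Inf", 1, 0)], PREFIX, CYCLE)
```

The second check fails because the weight -3 in the prefix counts for Inf, whereas the limit measures ignore the prefix.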
Assume-Admissible Synthesis
In this paper, we introduce a novel rule for synthesis of reactive systems,
applicable to systems made of n components, each with its own
objectives. It is based on the notion of admissible strategies. We compare our
novel rule with previous rules defined in the literature, and we show that
contrary to the previous proposals, our rule defines sets of solutions which
are rectangular. This property leads to solutions which are robust and
resilient. We provide algorithms with optimal complexity and also an
abstraction framework.
Comment: 31 pages
Expected Window Mean-Payoff
In the window mean-payoff objective, given an infinite path, instead of
considering a long run average, we consider the minimum payoff that can be
ensured at every position of the path over a finite window that slides over the
entire path. Chatterjee et al. studied the problem of deciding whether, in a
two-player game, Player 1 has a strategy to ensure a window mean-payoff of at least 0.
In this work, we consider a function that given a path returns the supremum
value of the window mean-payoff that can be ensured over the path and we show
how to compute its expected value in Markov chains and Markov decision
processes. We consider two variants of the function: Fixed window mean-payoff
in which a fixed window length is provided; and Bounded window
mean-payoff in which we compute the maximum possible value of the window
mean-payoff over all possible window lengths. Further, for both variants, we
consider (i) a direct version of the problem where, for each path, we consider
the payoff that can be ensured from its very beginning and (ii) a non-direct
version that is the prefix-independent counterpart of the direct version of the
problem.
Comment: Replaced PP-hardness of direct fixed window objective with
PSPACE-hardness, added alternative definition of window mean-payoff
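The fixed variant of the window mean-payoff can be made concrete on a finite sequence of weights. The sketch below follows one reading of the definition above: at every position take the best mean over windows of length at most the fixed bound, then take the minimum of these values over all positions. It works on a finite list of integer weights and only considers positions where a full-length window still fits, so it approximates the direct version on a prefix of an infinite path. The function names are hypothetical, and this is not the paper's algorithm for expected values.

```python
from fractions import Fraction


def best_window_mean(weights, start, max_len):
    """Largest mean over windows weights[start:start+k] for 1 <= k <= max_len."""
    total = 0
    best = None
    for k in range(1, max_len + 1):
        total += weights[start + k - 1]
        mean = Fraction(total, k)
        if best is None or mean > best:
            best = mean
    return best


def fixed_window_value(weights, max_len):
    """Minimum over positions of the best window mean, i.e. the largest
    threshold that some window of length <= max_len meets at every position."""
    if max_len < 1 or len(weights) < max_len:
        raise ValueError("need 1 <= max_len <= len(weights)")
    positions = range(len(weights) - max_len + 1)
    return min(best_window_mean(weights, i, max_len) for i in positions)


value_len_1 = fixed_window_value([1, -1, 1, -1], 1)
value_len_2 = fixed_window_value([1, -1, 1, -1], 2)
```

On the alternating sequence, windows of length 1 cannot avoid the weight -1, while allowing windows of length 2 lets every position recover to a mean of 0. This shows why the value depends on the window length.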
Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes
When treating Markov decision processes (MDPs) with large state spaces, using
explicit representations quickly becomes infeasible. Lately, Wimmer et al. have
proposed a so-called symblicit algorithm for the synthesis of optimal
strategies in MDPs, in the quantitative setting of expected mean-payoff. This
algorithm, based on the strategy iteration algorithm of Howard and Veinott,
efficiently combines symbolic and explicit data structures, and uses binary
decision diagrams as symbolic representation. The aim of this paper is to show
that the new data structure of pseudo-antichains (an extension of antichains)
provides another interesting alternative, especially for the class of monotonic
MDPs. We design efficient pseudo-antichain based symblicit algorithms (with
open source implementations) for two quantitative settings: the expected
mean-payoff and the stochastic shortest path. For two practical applications
coming from automated planning and LTL synthesis, we report promising
experimental results w.r.t. both the run time and the memory consumption.
Comment: In Proceedings SYNT 2014, arXiv:1407.493