
324 research outputs found

Multiple-Environment Markov Decision Processes

Authors: Raskin Jean-François; Sankur Ocan
Publication date: 01/01/2014

We introduce Multi-Environment Markov Decision Processes (MEMDPs), which are MDPs with a set of probabilistic transition functions. The goal in a MEMDP is to synthesize a single controller with guaranteed performance against all environments, even though the environment is unknown a priori. While MEMDPs can be seen as a special class of partially observable MDPs, we show that several verification problems that are undecidable for partially observable MDPs are decidable for MEMDPs and sometimes even have efficient solutions.
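To make the definition above concrete, here is a minimal Python sketch of a MEMDP as a list of transition functions over shared states and actions. It evaluates one controller against every environment and reports the worst case. The names (`reach_probability`, `worst_case`, the toy states `s0`, `goal`, `sink`) are hypothetical and not from the paper. Restricting to memoryless strategies and to a reachability objective is a simplifying assumption; controllers for MEMDPs may in general need memory.

```python
from typing import Dict, List, Set, Tuple

State = str
Action = str
# One environment: (state, action) -> {successor: probability}
Transition = Dict[Tuple[State, Action], Dict[State, float]]


def reach_probability(env: Transition, strategy: Dict[State, Action],
                      target: Set[State], start: State,
                      iterations: int = 200) -> float:
    """Approximate the probability of reaching `target` from `start`
    under a memoryless strategy, using plain value iteration."""
    states = {s for (s, _a) in env} | {t for succ in env.values() for t in succ}
    states |= target | {start}
    value = {s: 1.0 if s in target else 0.0 for s in states}
    for _ in range(iterations):
        new_value = {}
        for s in states:
            if s in target:
                new_value[s] = 1.0
            elif s in strategy and (s, strategy[s]) in env:
                succ = env[(s, strategy[s])]
                new_value[s] = sum(p * value[t] for t, p in succ.items())
            else:
                new_value[s] = 0.0  # no move available: target never reached
        value = new_value
    return value[start]


def worst_case(envs: List[Transition], strategy: Dict[State, Action],
               target: Set[State], start: State) -> float:
    """Guarantee of one controller against all environments of the MEMDP."""
    return min(reach_probability(e, strategy, target, start) for e in envs)


# Toy MEMDP (hypothetical data): same states and actions, two environments.
env1: Transition = {("s0", "a"): {"goal": 0.9, "sink": 0.1},
                    ("s0", "b"): {"goal": 0.5, "sink": 0.5}}
env2: Transition = {("s0", "a"): {"goal": 0.2, "sink": 0.8},
                    ("s0", "b"): {"goal": 0.6, "sink": 0.4}}

guarantee_a = worst_case([env1, env2], {"s0": "a"}, {"goal"}, "s0")
guarantee_b = worst_case([env1, env2], {"s0": "b"}, {"goal"}, "s0")
```

In this toy instance, action `a` is better in the first environment, but action `b` gives the better guarantee across both, which is the quantity a MEMDP controller is judged on.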

arXiv.org e-Print Archive

CiteSeerX

DROPS Dagstuhl Research Online Publication Server

DI-fusion

Quantitative games with interval objectives

Authors: Hunter Paul; Raskin Jean-François
Publication date: 01/01/2014

Traditionally, quantitative games such as mean-payoff games and discount sum games have two players: one trying to maximize the payoff, the other trying to minimize it. The associated decision problem, "Can Eve (the maximizer) achieve, for example, a positive payoff?" can be thought of as one player trying to attain a payoff in the interval $(0,\infty)$. In this paper we consider the more general problem of determining if a player can attain a payoff in a finite union of arbitrary intervals for various payoff functions (liminf, mean-payoff, discount sum, total sum). In particular this includes the interesting exact-value problem, "Can Eve achieve a payoff of exactly (e.g.) 0?"

Comment: Full version of CONCUR submission
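As an illustration of an interval objective, the following Python sketch checks whether the mean-payoff of a single, already fixed play lies in a finite union of intervals. It does not solve the game. The `Interval` class and the function names are hypothetical. The play is assumed to be ultimately periodic, so its mean-payoff is the average weight of the repeated cycle.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional, Sequence


@dataclass(frozen=True)
class Interval:
    lo: Optional[Fraction]   # None stands for -infinity
    hi: Optional[Fraction]   # None stands for +infinity
    lo_closed: bool = False
    hi_closed: bool = False

    def contains(self, x: Fraction) -> bool:
        if self.lo is not None:
            if x < self.lo or (x == self.lo and not self.lo_closed):
                return False
        if self.hi is not None:
            if x > self.hi or (x == self.hi and not self.hi_closed):
                return False
        return True


def lasso_mean_payoff(cycle: Sequence[int]) -> Fraction:
    """Mean-payoff of an ultimately periodic play: the finite prefix does
    not matter, only the average weight along the repeated cycle."""
    return Fraction(sum(cycle), len(cycle))


def in_union(x: Fraction, intervals: Sequence[Interval]) -> bool:
    """Membership of a payoff in a finite union of intervals."""
    return any(i.contains(x) for i in intervals)


positive = [Interval(Fraction(0), None)]                    # (0, +infinity)
exactly_zero = [Interval(Fraction(0), Fraction(0), True, True)]  # [0, 0]

balanced = lasso_mean_payoff([2, -1, -1])   # average weight 0
gaining = lasso_mean_payoff([1, -1, 3])     # average weight 1
```

The classical "positive payoff" question corresponds to the single open interval `positive`, and the exact-value problem to the degenerate closed interval `exactly_zero`.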

arXiv.org e-Print Archive

CiteSeerX

DROPS Dagstuhl Research Online Publication Server

DI-fusion

On the complexity of heterogeneous multidimensional quantitative games

Authors: Bruyère Véronique; Hautem Quentin; Raskin Jean-François
Publication date: 01/01/2016

In this paper, we study two-player zero-sum turn-based games played on a finite multidimensional weighted graph. In recent papers all dimensions use the same measure, whereas here we allow different measures to be combined. Such heterogeneous multidimensional quantitative games provide a general and natural model for the study of reactive system synthesis. We focus on classical measures like the Inf, Sup, LimInf, and LimSup of the weights seen along the play, as well as on the window mean-payoff (WMP) measure. This new measure is a natural strengthening of the mean-payoff measure. We allow objectives defined as Boolean combinations of heterogeneous constraints. While multidimensional games with Boolean combinations of mean-payoff constraints are undecidable, we show that the problem becomes EXPTIME-complete for DNF/CNF Boolean combinations of heterogeneous measures taken among {WMP, Inf, Sup, LimInf, LimSup} and that exponential memory strategies are sufficient for both players to win. We provide a detailed study of the complexity and the memory requirements when the Boolean combination of the measures is replaced by an intersection. EXPTIME-completeness and exponential memory strategies still hold for the intersection of measures in {WMP, Inf, Sup, LimInf, LimSup}, and we get PSPACE-completeness when the WMP measure is no longer considered. To avoid EXPTIME- or PSPACE-hardness, we impose at most one occurrence of the WMP measure and fix the number of Sup measures, and we propose several refinements (on the number of occurrences of the other measures) for which we get polynomial algorithms and lower memory requirements. For all the considered classes of games, we also study parameterized complexity.

arXiv.org e-Print Archive

DROPS Dagstuhl Research Online Publication Server

DI-fusion

Assume-Admissible Synthesis

Authors: Brenguier Romain; Raskin Jean-François; Sankur Ocan
Publication date: 01/01/2015

In this paper, we introduce a novel rule for synthesis of reactive systems, applicable to systems made of n components, each of which has its own objective. It is based on the notion of admissible strategies. We compare our novel rule with previous rules defined in the literature, and we show that, contrary to the previous proposals, our rule defines sets of solutions which are rectangular. This property leads to solutions which are robust and resilient. We provide algorithms with optimal complexity and also an abstraction framework.

Comment: 31 pages

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

DROPS Dagstuhl Research Online Publication Server

DI-fusion

Hal-Diderot

HAL-Rennes 1

Expected Window Mean-Payoff

Authors: Bordais Benjamin; Guha Shibashis; Raskin Jean-François
Publication date: 01/12/2019

In the window mean-payoff objective, given an infinite path, instead of considering a long run average, we consider the minimum payoff that can be ensured at every position of the path over a finite window that slides over the entire path. Chatterjee et al. studied the problem of deciding whether, in a two-player game, Player 1 has a strategy to ensure a window mean-payoff of at least 0. In this work, we consider a function that, given a path, returns the supremum value of the window mean-payoff that can be ensured over the path, and we show how to compute its expected value in Markov chains and Markov decision processes. We consider two variants of the function: Fixed window mean-payoff, in which a fixed window length $l_{max}$ is provided; and Bounded window mean-payoff, in which we compute the maximum possible value of the window mean-payoff over all possible window lengths. Further, for both variants, we consider (i) a direct version of the problem where, for each path, we consider the payoff that can be ensured from its very beginning and (ii) a non-direct version that is the prefix independent counterpart of the direct version of the problem.

Comment: Replaced PP-hardness of direct fixed window objective with PSPACE-hardness, added alternative definition of window mean-payoff
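The fixed window function described above can be sketched in Python for one simple case. This is one reading of the abstract, not the paper's algorithm: the path is assumed to be purely periodic (a cycle of integer weights repeated forever), and the function name `fixed_window_value` is hypothetical. At each position it takes the best mean over windows of length 1 to `l_max`, then takes the minimum over all positions, which is the largest threshold that every position can meet within the window bound.

```python
from fractions import Fraction
from typing import Sequence


def fixed_window_value(cycle: Sequence[int], l_max: int) -> Fraction:
    """Direct fixed window mean-payoff of the periodic path cycle^omega.

    For every start position, compute the maximum mean weight over windows
    of length 1..l_max; return the minimum of these maxima. Assumes a
    non-empty cycle and l_max >= 1.
    """
    n = len(cycle)
    best_per_position = []
    for i in range(n):  # positions repeat with period n
        total = 0
        best = None
        for length in range(1, l_max + 1):
            total += cycle[(i + length - 1) % n]  # extend window by one step
            mean = Fraction(total, length)
            if best is None or mean > best:
                best = mean
        best_per_position.append(best)
    return min(best_per_position)


# Hypothetical path (-1, 3)^omega.
value_window_1 = fixed_window_value([-1, 3], 1)  # windows of length 1 only
value_window_2 = fixed_window_value([-1, 3], 2)  # windows of length 1 or 2
```

With `l_max = 1` the weight -1 must be absorbed on its own, while with `l_max = 2` the following weight 3 compensates for it inside the window, so the value rises from -1 to 1.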

arXiv.org e-Print Archive

DI-fusion

Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes

Authors: Bohy Aaron; Bruyère Véronique; Raskin Jean-François
Publication venue: Open Publishing Association
Publication date: 01/07/2014

When treating Markov decision processes (MDPs) with large state spaces, using explicit representations quickly becomes infeasible. Lately, Wimmer et al. have proposed a so-called symblicit algorithm for the synthesis of optimal strategies in MDPs, in the quantitative setting of expected mean-payoff. This algorithm, based on the strategy iteration algorithm of Howard and Veinott, efficiently combines symbolic and explicit data structures, and uses binary decision diagrams as symbolic representation. The aim of this paper is to show that the new data structure of pseudo-antichains (an extension of antichains) provides another interesting alternative, especially for the class of monotonic MDPs. We design efficient pseudo-antichain based symblicit algorithms (with open source implementations) for two quantitative settings: the expected mean-payoff and the stochastic shortest path. For two practical applications coming from automated planning and LTL synthesis, we report promising experimental results with respect to both run time and memory consumption.

Comment: In Proceedings SYNT 2014, arXiv:1407.493

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

DI-fusion