Parity Objectives in Countable MDPs
We study countably infinite MDPs with parity objectives, and special cases with a bounded number of colors in the Mostowski hierarchy (including reachability, safety, BĂŒchi and co-BĂŒchi). In finite MDPs there always exist optimal memoryless deterministic (MD) strategies for parity objectives, but this does not generally hold for countably infinite MDPs. In particular, optimal strategies need not exist. For countably infinite MDPs, we provide a complete picture of the memory requirements of optimal (resp., Δ-optimal) strategies for all objectives in the Mostowski hierarchy. In particular, there is a strong dichotomy between two different types of objectives. For the first type, optimal strategies, if they exist, can be chosen MD, while for the second type optimal strategies require infinite memory. (I.e., for all objectives in the Mostowski hierarchy, if finite-memory randomized strategies suffice then MD-strategies also suffice.) Similarly, some objectives admit Δ-optimal MD-strategies, while for others Δ-optimal strategies require infinite memory. Such a dichotomy also holds for the subclass of countably infinite MDPs that are finitely branching, though more objectives admit MD-strategies here.
Strategy Complexity of Parity Objectives in Countable MDPs
We study countably infinite MDPs with parity objectives. Unlike in finite
MDPs, optimal strategies need not exist, and may require infinite memory if
they do. We provide a complete picture of the exact strategy complexity of
Δ-optimal strategies (and optimal strategies, where they exist) for
all subclasses of parity objectives in the Mostowski hierarchy. Either
MD-strategies, Markov strategies, or 1-bit Markov strategies are necessary and
sufficient, depending on the number of colors, the branching degree of the MDP,
and whether one considers Δ-optimal or optimal strategies. In
particular, 1-bit Markov strategies are necessary and sufficient for
Δ-optimal (resp. optimal) strategies for general parity objectives. Comment: This is the full version of a paper presented at CONCUR 202
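To make the strategy classes named above concrete, here is a minimal type sketch (the names and example instances are ours, purely for illustration; in the papers a strategy maps play histories to actions, and each class restricts what the strategy may observe):

```python
from typing import Callable, Tuple

State, Action = int, int   # placeholder types for a countable MDP

# MD (memoryless deterministic): the action depends only on the current state.
MDStrategy = Callable[[State], Action]

# Markov: the action may also depend on the number of steps taken so far.
MarkovStrategy = Callable[[State, int], Action]

# 1-bit Markov: one bit of memory on top of the step counter;
# the strategy returns the chosen action and the updated bit.
OneBitMarkovStrategy = Callable[[State, int, int], Tuple[Action, int]]

# Artificial example instances of each class:
md: MDStrategy = lambda s: s % 2
markov: MarkovStrategy = lambda s, step: (s + step) % 2
one_bit: OneBitMarkovStrategy = lambda s, step, bit: ((s + bit) % 2, 1 - bit)
```

Each class is strictly more permissive than the previous one: an MD strategy ignores both the step counter and the bit, a Markov strategy reads the counter, and a 1-bit Markov strategy additionally reads and rewrites its single memory bit.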
Strategy Complexity of Threshold Payoff with Applications to Optimal Expected Payoff
We study countably infinite Markov decision processes (MDPs) with transition
rewards. The lim sup (resp. lim inf) threshold objective is to maximize the
probability that the lim sup (resp. lim inf) of the infinite sequence of
transition rewards is non-negative. We establish the complete picture of the
transition rewards is non-negative. We establish the complete picture of the
strategy complexity of these objectives, i.e., the upper and lower bounds on
the memory required by Δ-optimal (resp. optimal) strategies. We
then apply these results to solve two open problems from [Sudderth, Decisions
in Economics and Finance, 2020] about the strategy complexity of optimal
strategies for the expected lim sup (resp. lim inf) payoff. Comment: 53 pages
Taming denumerable Markov decision processes with decisiveness
Decisiveness has proven to be an elegant concept for denumerable Markov
chains: it is general enough to encompass several natural classes of
denumerable Markov chains, and is a sufficient condition for simple qualitative
and approximate quantitative model checking algorithms to exist. In this paper,
we explore how to extend the notion of decisiveness to Markov decision
processes. Compared to Markov chains, the extra non-determinism can be resolved
in an adversarial or cooperative way, yielding two natural notions of
decisiveness. We then explore whether these notions yield model checking
procedures concerning the infimum and supremum probabilities of reachability
properties.
Strategy Complexity of Point Payoff, Mean Payoff and Total Payoff Objectives in Countable MDPs
We study countably infinite Markov decision processes (MDPs) with real-valued
transition rewards. Every infinite run induces the following sequences of
payoffs: 1. Point payoff (the sequence of directly seen transition rewards), 2.
Mean payoff (the sequence of the sums of all rewards so far, divided by the
number of steps), and 3. Total payoff (the sequence of the sums of all rewards
so far). For each payoff type, the objective is to maximize the probability
that the lim inf is non-negative. We establish the complete picture of the
strategy complexity of these objectives, i.e., how much memory is necessary and
sufficient for Δ-optimal (resp. optimal) strategies. Some cases can
be won with memoryless deterministic strategies, while others require a step
counter, a reward counter, or both. Comment: Revised and extended journal version of results presented at the
CONCUR 2021 conference. For a special issue in the arXiv overlay journal LMCS
(https://lmcs.episciences.org). This is not a duplicate of arXiv:2107.03287
(the conference version), but the significantly changed journal version for
LMCS (which uses arXiv as a backend).
Strategy Complexity of Reachability in Countable Stochastic 2-Player Games
We study countably infinite stochastic 2-player games with reachability
objectives. Our results provide a complete picture of the memory requirements
of Δ-optimal (resp. optimal) strategies. These results depend on
the size of the players' action sets and on whether one requires strategies
that are uniform (i.e., independent of the start state).
Our main result is that Δ-optimal (resp. optimal) Maximizer
strategies require infinite memory if Minimizer is allowed infinite action
sets. This lower bound holds even under very strong restrictions. Even in the
special case of infinitely branching turn-based reachability games, even if all
states allow an almost surely winning Maximizer strategy, strategies with a
step counter plus finite private memory are still useless.
Regarding uniformity, we show that for Maximizer there need not exist
positional (i.e., memoryless) uniformly Δ-optimal strategies even
in the special case of finite action sets or in finitely branching turn-based
games. On the other hand, in games with finite action sets, there always exists
a uniformly Δ-optimal Maximizer strategy that uses just one bit of
public memory.
Strategy complexity of lim sup and lim inf payoff objectives in countable MDPs
We study countably infinite Markov decision processes (MDPs) with real-valued
transition rewards. A strategy is a function which decides how plays proceed within the
MDP. Every strategy induces a set of infinite runs in the MDP and each infinite run
induces the following sequences of payoffs:
1. Point payoff (the sequence of directly seen transition rewards),
2. Mean payoff (the sequence of the sums of all rewards so far, divided by the number
of steps), and
3. Total payoff (the sequence of the sums of all rewards so far).
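For concreteness, the three payoff sequences induced by a run prefix can be sketched as follows (a minimal illustration; the function name and the example rewards are ours, not from the thesis):

```python
from itertools import accumulate

def payoff_sequences(rewards):
    """Given a finite prefix of a run's transition rewards, return the
    corresponding prefixes of the point, mean and total payoff sequences."""
    point = list(rewards)                    # 1. rewards as directly seen
    total = list(accumulate(point))          # 3. running sums of all rewards
    mean = [s / (n + 1) for n, s in enumerate(total)]  # 2. sums / steps
    return point, mean, total

point, mean, total = payoff_sequences([2.0, -1.0, -1.0, 4.0])
print(point)  # [2.0, -1.0, -1.0, 4.0]
print(total)  # [2.0, 1.0, 0.0, 4.0]
print(mean)   # [2.0, 0.5, 0.0, 1.0]
```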
For each payoff type, the threshold objective is to maximise the probability that the
lim sup/lim inf is non-negative. We are interested in the strategy complexity of the
above objectives, i.e. the amount of memory and/or randomisation that a strategy
needs access to in order to play well (optimally resp. Δ-optimally). Our results seek
not only to decide whether an objective requires finite or infinite memory, but in the
case of infinite memory, what kind of infinite memory is necessary and sufficient. For
example, a step counter which acts as a clock, or a reward counter which sums up the
seen rewards may be sufficient.
We compare the lim sup/lim inf point payoff objectives to the BĂŒchi/co-BĂŒchi
objectives which, given a set of states or transitions, seek to maximise the probability
that this set is visited infinitely/finitely often. Convergence effects are what
differentiate lim sup/lim inf point payoff objectives from BĂŒchi/co-BĂŒchi. For example, the
sequence −1/2, −1/3, −1/4, . . . does satisfy lim sup ≄ 0 and lim inf ≄ 0 despite all of
the rewards being negative. It is in dealing with these effects that we make our main
technical contributions. We establish a complete picture of the strategy complexity for
both the lim sup and lim inf point payoff objectives. In particular we show that
optimal lim sup requires either randomisation or access to a step counter and that lim inf
of point payoff requires a step counter (but not more) when the underlying MDP is
infinitely branching.
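The convergence effect described above can be checked numerically (an illustrative sketch; the tail index and tolerance are our choices):

```python
# The reward sequence -1/2, -1/3, -1/4, ... has only negative point payoffs,
# yet its lim sup and lim inf are both 0, so it satisfies the thresholds
# "lim sup >= 0" and "lim inf >= 0" -- the convergence effect that
# separates these objectives from Buechi/co-Buechi conditions.
rewards = [-1.0 / n for n in range(2, 10_000)]
tail = rewards[5000:]               # a late tail of the sequence
approx_limsup = max(tail)           # sup of a late tail approximates lim sup
approx_liminf = min(tail)           # inf of a late tail approximates lim inf
assert all(r < 0 for r in rewards)  # every individual reward is negative
assert -1e-3 < approx_limsup <= 0   # yet both limits are (numerically) 0
assert -1e-3 < approx_liminf <= 0
```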
We also comprehensively pin down the strategy complexity for the lim inf total
and mean payoff objectives. This result requires a novel counterexample involving
unboundedly growing rewards as well as finely tuned transition probabilities which
force the player to use memory in order to mimic what occurred in past random events.
This allows us to show that both of these objectives require the use of both a step
counter as well as a reward counter.
We apply our results to solve two open problems from Sudderth [35] about the
strategy complexity of optimal strategies for the expected lim sup/lim inf point
payoff. We achieve this by reducing each objective to its respective optimal threshold
lim sup/lim inf point payoff counterpart. Thus we are able to conclude that they share
the same optimal strategy complexity.