Optimal Strategies in Infinite-state Stochastic Reachability Games
We consider perfect-information stochastic reachability games for two players
on infinite graphs. We identify a subclass of such games and prove two
interesting properties of it: first, Player Max always has optimal strategies
in games from this subclass, and second, these games are strongly determined.
The subclass is defined by the property that the set of all values can only
have one accumulation point -- 0. Our results nicely mirror recent results for
finitely-branching games, where, on the contrary, Player Min always has optimal
strategies. However, our proof methods are substantially different, because the
roles of the players are not symmetric. We also do not restrict the branching
of the games. Finally, we apply our results in the context of recently studied
One-Counter stochastic games.
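The reachability values discussed in this abstract admit the classical fixed-point characterization, and on a finite game the iteration below converges to them. This is only a toy sketch (the example game and all names are ours); the abstract's contribution concerns infinite-state games, where the issue is the existence of optimal strategies, not this computation.

```python
# Value iteration for a finite two-player stochastic reachability
# game: a toy sketch, not the paper's infinite-state construction.
def reach_values(states, owner, succ, target, iters=1000):
    """owner[s] in {'max', 'min', 'rand'}; succ[s] lists successor
    states for 'max'/'min' vertices and (prob, succ) pairs for
    'rand' vertices."""
    v = {s: 1.0 if s in target else 0.0 for s in states}
    for _ in range(iters):
        new = {}
        for s in states:
            if s in target:
                new[s] = 1.0
            elif owner[s] == 'max':
                new[s] = max(v[t] for t in succ[s])
            elif owner[s] == 'min':
                new[s] = min(v[t] for t in succ[s])
            else:  # random vertex: expected value of successors
                new[s] = sum(p * v[t] for p, t in succ[s])
        v = new
    return v

# Player Max chooses between a fair coin flip and giving up:
states = ['m', 'r', 't', 'd']
owner = {'m': 'max', 'r': 'rand', 't': 'rand', 'd': 'rand'}
succ = {'m': ['r', 'd'],
        'r': [(0.5, 't'), (0.5, 'd')],
        't': [(1.0, 't')],
        'd': [(1.0, 'd')]}
vals = reach_values(states, owner, succ, target={'t'})
print(vals['m'])  # 0.5: flipping the coin is optimal for Max
```

On finite instances an optimal strategy for Max always exists; the abstract's question is when this remains true with infinite state spaces and unbounded branching.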
Tableaux for Policy Synthesis for MDPs with PCTL* Constraints
Markov decision processes (MDPs) are the standard formalism for modelling
sequential decision making in stochastic environments. Policy synthesis
addresses the problem of how to control or limit the decisions an agent makes
so that a given specification is met. In this paper we consider PCTL*, the
probabilistic counterpart of CTL*, as the specification language. Because in
general the policy synthesis problem for PCTL* is undecidable, we restrict to
policies whose execution history memory is finitely bounded a priori.
Surprisingly, no algorithm for policy synthesis for this natural and
expressive framework has been developed so far. We close this gap and describe
a tableau-based algorithm that, given an MDP and a PCTL* specification, derives
in a non-deterministic way a system of (possibly nonlinear) equalities and
inequalities. The solutions of this system, if any, describe the desired
(stochastic) policies.
Our main result in this paper is the correctness of our method, i.e.,
soundness, completeness, and termination.
Comment: This is a long version of a conference paper published at TABLEAUX
2017. It contains proofs of the main results and fixes a bug; see the
footnote on page 1 for details.
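The final step of such a synthesis procedure, solving the derived system of (in)equalities for the policy's probabilities, can be illustrated on a made-up one-state example. The numbers, the constraint P(goal) >= 0.5, and the variable p below are all hypothetical; the actual tableau construction is far more involved and can yield nonlinear systems.

```python
# Hypothetical toy instance of the last synthesis step: a
# memoryless stochastic policy plays action a with probability p;
# a reaches the goal w.p. 0.9, the alternative b w.p. 0.2.  The
# specification P(goal) >= 0.5 yields the (here linear) constraint
#     0.9*p + 0.2*(1 - p) >= 0.5
def feasible(p):
    return 0.9 * p + 0.2 * (1 - p) >= 0.5

# Solving the boundary equation 0.7*p + 0.2 = 0.5 gives p = 3/7,
# so every p in [3/7, 1] describes a satisfying stochastic policy.
p_min = (0.5 - 0.2) / (0.9 - 0.2)
print(p_min)          # ~0.4286
print(feasible(0.5))  # True
print(feasible(0.3))  # False
```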
Value Iteration for Long-run Average Reward in Markov Decision Processes
Markov decision processes (MDPs) are standard models for probabilistic
systems with non-deterministic behaviours. Long-run average rewards provide a
mathematically elegant formalism for expressing long term performance. Value
iteration (VI) is one of the simplest and most efficient algorithmic approaches
to MDPs with other properties, such as reachability objectives. Unfortunately,
a naive extension of VI does not work for MDPs with long-run average rewards,
as there is no known stopping criterion. In this work our contributions are
threefold. (1) We refute a conjecture related to stopping criteria for MDPs
with long-run average rewards. (2) We present two practical algorithms for MDPs
with long-run average rewards based on VI. First, we show that a combination of
applying VI locally for each maximal end-component (MEC) and VI for
reachability objectives can provide approximation guarantees. Second, extending
the above approach with a simulation-guided on-demand variant of VI, we present
an anytime algorithm that is able to deal with very large models. (3) Finally,
we present experimental results showing that our methods significantly
outperform the standard approaches on several benchmarks.
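A minimal sketch of the naive VI the abstract starts from, on a toy MDP of our own. Note the fixed iteration count: as the abstract stresses, there is no known sound stopping criterion to replace it.

```python
# Naive value iteration for long-run average reward on a toy MDP.
# actions[s] is a list of (reward, successor-distribution) pairs;
# after n steps v[s] is roughly n*gain(s) + bias(s), so v[s]/n
# estimates the gain.  Nothing here tells us when to stop, which
# is exactly the problem the abstract addresses.
def gain_vi(states, actions, n=200):
    v = {s: 0.0 for s in states}
    for _ in range(n):
        v = {s: max(r + sum(p * v[t] for t, p in dist.items())
                    for r, dist in actions[s])
             for s in states}
    return {s: v[s] / n for s in states}

mdp = {'s0': [(0.5, {'s0': 1.0}),    # keep collecting 0.5, or
              (0.0, {'s1': 1.0})],   # move to the better state
       's1': [(1.0, {'s1': 1.0})]}
gains = gain_vi(['s0', 's1'], mdp)
print(gains)  # gain close to 1.0 in both states (O(1/n) bias)
```

The abstract's MEC decomposition replaces this unbounded iteration with VI runs that do come with approximation guarantees.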
Zero-Reachability in Probabilistic Multi-Counter Automata
We study the qualitative and quantitative zero-reachability problem in
probabilistic multi-counter systems. We identify the undecidable variants of
the problems, and then we concentrate on the remaining two cases. In the first
case, when we are interested in the probability of all runs that visit zero in
some counter, we show that the qualitative zero-reachability is decidable in
time which is polynomial in the size of a given pMC and doubly exponential in
the number of counters. Further, we show that the probability of all
zero-reaching runs can be effectively approximated up to an arbitrarily small
given error epsilon > 0 in time which is polynomial in log(epsilon),
exponential in the size of a given pMC, and doubly exponential in the number of
counters. In the second case, we are interested in the probability of all runs
that visit zero in some counter different from the last counter. Here we show
that the qualitative zero-reachability is decidable and SquareRootSum-hard, and
the probability of all zero-reaching runs can be effectively approximated up to
an arbitrarily small given error epsilon > 0 (these results apply to pMCs
satisfying a suitable technical condition that can be verified in polynomial
time). The proof techniques invented in the second case allow us to construct
counterexamples to some classical results about ergodicity in stochastic Petri
nets.
Comment: 20 pages.
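For intuition only, the quantitative question can be mimicked by Monte Carlo simulation on a drastically simplified model: a single-counter random walk with made-up probabilities (our own example, not the paper's multi-counter algorithm), where the zero-hitting probability is also known in closed form.

```python
import random

# Toy probabilistic one-counter walk: from a positive counter
# value, the counter moves down w.p. 0.4 and up w.p. 0.6.  The
# zero-hitting probability x from value 1 is the minimal solution
# of x = 0.4 + 0.6*x**2, i.e. x = 2/3.
def hit_zero_prob(p_down=0.4, start=1, trials=5000, cutoff=1000, seed=7):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        c = start
        for _ in range(cutoff):
            c += -1 if rng.random() < p_down else 1
            if c == 0:
                hits += 1
                break
    return hits / trials

est = hit_zero_prob()
print(est)  # close to the analytic value 2/3
```

The paper's point is that such probabilities can be approximated effectively, with stated complexity bounds, rather than merely estimated by sampling.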
Optimizing Performance of Continuous-Time Stochastic Systems using Timeout Synthesis
We consider a parametric version of fixed-delay continuous-time Markov chains
(or equivalently deterministic and stochastic Petri nets, DSPN) where
fixed-delay transitions are specified by parameters, rather than concrete
values. Our goal is to synthesize values of these parameters that, for a given
cost function, minimise expected total cost incurred before reaching a given
set of target states. We show that under mild assumptions, optimal values of
parameters can be effectively approximated using translation to a Markov
decision process (MDP) whose actions correspond to discretized values of these
parameters.
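The discretization idea can be played through on a self-contained toy model (our own, not the paper's fdCTMC semantics): a job whose completion time is a fast/slow exponential mixture, a running-cost rate, and a restart cost paid whenever the timeout parameter d fires first.

```python
import math

# Toy restart model: completion time T is a 50/50 mixture of
# Exp(10) ("fast") and Exp(0.1) ("slow"); we pay cost rate c
# while waiting and restart cost R when the timeout d fires.
# The renewal equation E(d) = c*E[min(T,d)] + P(T>d)*(R + E(d))
# gives the closed form below.
def expected_cost(d, q=0.5, lf=10.0, ls=0.1, c=1.0, R=1.0):
    p_fail = q * math.exp(-lf * d) + (1 - q) * math.exp(-ls * d)
    wait = (q * (1 - math.exp(-lf * d)) / lf
            + (1 - q) * (1 - math.exp(-ls * d)) / ls)
    return (c * wait + R * p_fail) / (1 - p_fail)

# Discretize the timeout parameter, as in the MDP translation:
# every grid point becomes one action and we take the cheapest.
grid = [0.05 * k for k in range(1, 201)]   # d in (0, 10]
best = min(grid, key=expected_cost)
print(best, expected_cost(best))  # interior optimum: restart
                                  # early if the slow mode hit
```

Each grid point plays the role of one action of the discretized MDP; picking the cheapest is the one-state analogue of solving that MDP.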
Extension of PRISM by Synthesis of Optimal Timeouts in Fixed-Delay CTMC
We present a practically appealing extension of the probabilistic model
checker PRISM, enabling it to handle fixed-delay continuous-time Markov chains
(fdCTMCs) with rewards, a formalism equivalent to deterministic and
stochastic Petri nets (DSPNs). fdCTMCs allow transitions with fixed delays (or
timeouts) on top of the traditional transitions with exponential rates. Our
extension supports an evaluation of expected reward until reaching a given set
of target states. The main contribution is that, considering the fixed-delays
as parameters, we implemented a synthesis algorithm that computes the
epsilon-optimal values of the fixed-delays minimizing the expected reward. We
provide a performance evaluation of the synthesis on practical examples.
Mean-Payoff Optimization in Continuous-Time Markov Chains with Parametric Alarms
Continuous-time Markov chains with alarms (ACTMCs) allow for alarm events
that can be non-exponentially distributed. Within parametric ACTMCs, the
parameters of alarm-event distributions are not given explicitly and can be
the subject of parameter synthesis. An algorithm solving the epsilon-optimal
parameter synthesis problem for parametric ACTMCs with long-run average
optimization objectives is presented. Our approach is based on reduction of the
problem to finding long-run average optimal strategies in semi-Markov decision
processes (semi-MDPs) and sufficient discretization of parameter (i.e., action)
space. Since the set of actions in the discretized semi-MDP can be very large,
a straightforward approach based on explicit action-space construction fails to
solve even simple instances of the problem. The presented algorithm uses an
enhanced policy iteration on symbolic representations of the action space. The
soundness of the algorithm is established for parametric ACTMCs with
alarm-event distributions satisfying four mild assumptions that are shown to
hold for uniform, Dirac and Weibull distributions in particular, but are
satisfied by many other distributions as well. An experimental implementation
shows that the symbolic technique substantially improves the efficiency of the
synthesis algorithm and makes it possible to solve instances of realistic size.
Comment: This article is a full version of a paper accepted to the Conference
on Quantitative Evaluation of SysTems (QEST) 2017.
Weak MSO+U with Path Quantifiers over Infinite Trees
This paper shows that over infinite trees, satisfiability is decidable for
weak monadic second-order logic extended by the unbounding quantifier U and
quantification over infinite paths. The proof is by reduction to emptiness for
a certain automaton model, while emptiness for the automaton model is decided
using profinite trees.
Comment: Version of an ICALP 2014 paper with appendices.
Long-term variability of drought indices in the Czech Lands and effects of external forcings and large-scale climate variability modes
While a considerable number of records document the temporal variability of
droughts for central Europe, the understanding of its underlying causes remains
limited. In this contribution, time series of three drought indices (Standardized Precipitation Index –
SPI; Standardized Precipitation Evapotranspiration Index – SPEI; Palmer Drought Severity Index – PDSI) are analyzed with regard to mid- to long-term drought variability
in the Czech Lands and its potential links to external forcings and internal
climate variability modes over the 1501–2006 period. Employing instrumental
and proxy-based data characterizing the external climate forcings (solar and
volcanic activity, greenhouse gases) in parallel with series representing the
activity of selected climate variability modes (El Niño–Southern
Oscillation – ENSO; Atlantic Multidecadal Oscillation – AMO; Pacific
Decadal Oscillation – PDO; North Atlantic Oscillation – NAO), regression
and wavelet analyses were deployed to identify and quantify the temporal
variability patterns of drought indices and similarity between individual
signals. Aside from a strong connection to the NAO, temperatures in the AMO
and (particularly) PDO regions were identified as possible drivers
of inter-decadal variability in the Czech drought regime. Colder and wetter
episodes were found to coincide with increased volcanic activity, especially
in summer, while no clear signature of solar activity was found. In addition
to identification of the links themselves, their temporal stability and
structure of their shared periodicities were investigated. The oscillations
at periods of approximately 60–100 years were found to be potentially
relevant in establishing the teleconnections affecting the long-term
variability of central European droughts.
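The standardization behind indices such as the SPI can be illustrated in a crude form. A real SPI fits a gamma distribution to precipitation per calendar month and maps quantiles through the inverse normal CDF; the sketch below, with invented numbers, only z-scores a series.

```python
# Crude z-score stand-in for SPI-like standardization; the
# precipitation data are invented for illustration.
def z_index(series):
    m = sum(series) / len(series)
    s = (sum((x - m) ** 2 for x in series) / len(series)) ** 0.5
    return [(x - m) / s for x in series]

precip = [55, 60, 30, 80, 45, 20, 70, 65, 40, 75, 35, 50]
index = z_index(precip)
# A common convention flags z <= -1 as a marked deficit:
drought_months = [i for i, z in enumerate(index) if z <= -1.0]
print(drought_months)
```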