Optimal Strategies in Infinite-state Stochastic Reachability Games
We consider perfect-information stochastic reachability games for two players
on infinite graphs. We identify a subclass of such games and prove two
interesting properties of it: first, Player Max always has optimal strategies
in games from this subclass, and second, these games are strongly determined.
The subclass is defined by the property that the set of all values can only
have one accumulation point -- 0. Our results nicely mirror recent results for
finitely-branching games, where, on the contrary, Player Min always has optimal
strategies. However, our proof methods are substantially different, because the
roles of the players are not symmetric. We also do not restrict the branching
of the games. Finally, we apply our results in the context of recently studied
One-Counter stochastic games.
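The reachability values discussed in this abstract admit the classical fixed-point characterization, and on a finite game the iteration below converges to them. This is only a toy sketch (the example game and all names are ours); the abstract's contribution concerns infinite-state games, where the issue is the existence of optimal strategies, not this computation.

```python
# Value iteration for a finite two-player stochastic reachability
# game: a toy sketch, not the paper's infinite-state construction.
def reach_values(states, owner, succ, target, iters=1000):
    """owner[s] in {'max', 'min', 'rand'}; succ[s] lists successor
    states for 'max'/'min' vertices and (prob, succ) pairs for
    'rand' vertices."""
    v = {s: 1.0 if s in target else 0.0 for s in states}
    for _ in range(iters):
        new = {}
        for s in states:
            if s in target:
                new[s] = 1.0
            elif owner[s] == 'max':
                new[s] = max(v[t] for t in succ[s])
            elif owner[s] == 'min':
                new[s] = min(v[t] for t in succ[s])
            else:  # random vertex: expected value of successors
                new[s] = sum(p * v[t] for p, t in succ[s])
        v = new
    return v

# Player Max chooses between a fair coin flip and giving up:
states = ['m', 'r', 't', 'd']
owner = {'m': 'max', 'r': 'rand', 't': 'rand', 'd': 'rand'}
succ = {'m': ['r', 'd'],
        'r': [(0.5, 't'), (0.5, 'd')],
        't': [(1.0, 't')],
        'd': [(1.0, 'd')]}
vals = reach_values(states, owner, succ, target={'t'})
print(vals['m'])  # 0.5: flipping the coin is optimal for Max
```

On finite instances an optimal strategy for Max always exists; the abstract's question is when this remains true with infinite state spaces and unbounded branching.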
Tableaux for Policy Synthesis for MDPs with PCTL* Constraints
Markov decision processes (MDPs) are the standard formalism for modelling
sequential decision making in stochastic environments. Policy synthesis
addresses the problem of how to control or limit the decisions an agent makes
so that a given specification is met. In this paper we consider PCTL*, the
probabilistic counterpart of CTL*, as the specification language. Because in
general the policy synthesis problem for PCTL* is undecidable, we restrict to
policies whose execution history memory is finitely bounded a priori.
Surprisingly, no algorithm for policy synthesis for this natural and
expressive framework has been developed so far. We close this gap and describe
a tableau-based algorithm that, given an MDP and a PCTL* specification, derives
in a non-deterministic way a system of (possibly nonlinear) equalities and
inequalities. The solutions of this system, if any, describe the desired
(stochastic) policies.
Our main result in this paper is the correctness of our method, i.e.,
soundness, completeness, and termination.
Comment: This is a long version of a conference paper published at TABLEAUX
2017. It contains proofs of the main results and fixes a bug; see the
footnote on page 1 for details.
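The final step of such a synthesis procedure, solving the derived system of (in)equalities for the policy's probabilities, can be illustrated on a made-up one-state example. The numbers, the constraint P(goal) >= 0.5, and the variable p below are all hypothetical; the actual tableau construction is far more involved and can yield nonlinear systems.

```python
# Hypothetical toy instance of the last synthesis step: a
# memoryless stochastic policy plays action a with probability p;
# a reaches the goal w.p. 0.9, the alternative b w.p. 0.2.  The
# specification P(goal) >= 0.5 yields the (here linear) constraint
#     0.9*p + 0.2*(1 - p) >= 0.5
def feasible(p):
    return 0.9 * p + 0.2 * (1 - p) >= 0.5

# Solving the boundary equation 0.7*p + 0.2 = 0.5 gives p = 3/7,
# so every p in [3/7, 1] describes a satisfying stochastic policy.
p_min = (0.5 - 0.2) / (0.9 - 0.2)
print(p_min)          # ~0.4286
print(feasible(0.5))  # True
print(feasible(0.3))  # False
```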
Value Iteration for Long-run Average Reward in Markov Decision Processes
Markov decision processes (MDPs) are standard models for probabilistic
systems with non-deterministic behaviours. Long-run average rewards provide a
mathematically elegant formalism for expressing long term performance. Value
iteration (VI) is one of the simplest and most efficient algorithmic approaches
to MDPs with other properties, such as reachability objectives. Unfortunately,
a naive extension of VI does not work for MDPs with long-run average rewards,
as there is no known stopping criterion. In this work our contributions are
threefold. (1) We refute a conjecture related to stopping criteria for MDPs
with long-run average rewards. (2) We present two practical algorithms for MDPs
with long-run average rewards based on VI. First, we show that a combination of
applying VI locally for each maximal end-component (MEC) and VI for
reachability objectives can provide approximation guarantees. Second, extending
the above approach with a simulation-guided on-demand variant of VI, we present
an anytime algorithm that is able to deal with very large models. (3) Finally,
we present experimental results showing that our methods significantly
outperform the standard approaches on several benchmarks.
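A minimal sketch of the naive VI the abstract starts from, on a toy MDP of our own. Note the fixed iteration count: as the abstract stresses, there is no known sound stopping criterion to replace it.

```python
# Naive value iteration for long-run average reward on a toy MDP.
# actions[s] is a list of (reward, successor-distribution) pairs;
# after n steps v[s] is roughly n*gain(s) + bias(s), so v[s]/n
# estimates the gain.  Nothing here tells us when to stop, which
# is exactly the problem the abstract addresses.
def gain_vi(states, actions, n=200):
    v = {s: 0.0 for s in states}
    for _ in range(n):
        v = {s: max(r + sum(p * v[t] for t, p in dist.items())
                    for r, dist in actions[s])
             for s in states}
    return {s: v[s] / n for s in states}

mdp = {'s0': [(0.5, {'s0': 1.0}),    # keep collecting 0.5, or
              (0.0, {'s1': 1.0})],   # move to the better state
       's1': [(1.0, {'s1': 1.0})]}
gains = gain_vi(['s0', 's1'], mdp)
print(gains)  # gain close to 1.0 in both states (O(1/n) bias)
```

The abstract's MEC decomposition replaces this unbounded iteration with VI runs that do come with approximation guarantees.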
Zero-Reachability in Probabilistic Multi-Counter Automata
We study the qualitative and quantitative zero-reachability problem in
probabilistic multi-counter systems. We identify the undecidable variants of
the problems, and then we concentrate on the remaining two cases. In the first
case, when we are interested in the probability of all runs that visit zero in
some counter, we show that the qualitative zero-reachability is decidable in
time which is polynomial in the size of a given pMC and doubly exponential in
the number of counters. Further, we show that the probability of all
zero-reaching runs can be effectively approximated up to an arbitrarily small
given error epsilon > 0 in time which is polynomial in log(epsilon),
exponential in the size of a given pMC, and doubly exponential in the number of
counters. In the second case, we are interested in the probability of all runs
that visit zero in some counter different from the last counter. Here we show
that the qualitative zero-reachability is decidable and SquareRootSum-hard, and
the probability of all zero-reaching runs can be effectively approximated up to
an arbitrarily small given error epsilon > 0 (these results apply to pMCs
satisfying a suitable technical condition that can be verified in polynomial
time). The proof techniques invented in the second case allow us to construct
counterexamples to some classical results about ergodicity in stochastic Petri
nets.
Comment: 20 pages.
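For intuition only, the quantitative question can be mimicked by Monte Carlo simulation on a drastically simplified model: a single-counter random walk with made-up probabilities (our own example, not the paper's multi-counter algorithm), where the zero-hitting probability is also known in closed form.

```python
import random

# Toy probabilistic one-counter walk: from a positive counter
# value, the counter moves down w.p. 0.4 and up w.p. 0.6.  The
# zero-hitting probability x from value 1 is the minimal solution
# of x = 0.4 + 0.6*x**2, i.e. x = 2/3.
def hit_zero_prob(p_down=0.4, start=1, trials=5000, cutoff=1000, seed=7):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        c = start
        for _ in range(cutoff):
            c += -1 if rng.random() < p_down else 1
            if c == 0:
                hits += 1
                break
    return hits / trials

est = hit_zero_prob()
print(est)  # close to the analytic value 2/3
```

The paper's point is that such probabilities can be approximated effectively, with stated complexity bounds, rather than merely estimated by sampling.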
Optimizing Performance of Continuous-Time Stochastic Systems using Timeout Synthesis
We consider a parametric version of fixed-delay continuous-time Markov chains
(or equivalently deterministic and stochastic Petri nets, DSPN) where
fixed-delay transitions are specified by parameters, rather than concrete
values. Our goal is to synthesize values of these parameters that, for a given
cost function, minimise expected total cost incurred before reaching a given
set of target states. We show that under mild assumptions, optimal values of
parameters can be effectively approximated using translation to a Markov
decision process (MDP) whose actions correspond to discretized values of these
parameters.
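The discretization idea can be played through on a self-contained toy model (our own, not the paper's fdCTMC semantics): a job whose completion time is a fast/slow exponential mixture, a running-cost rate, and a restart cost paid whenever the timeout parameter d fires first.

```python
import math

# Toy restart model: completion time T is a 50/50 mixture of
# Exp(10) ("fast") and Exp(0.1) ("slow"); we pay cost rate c
# while waiting and restart cost R when the timeout d fires.
# The renewal equation E(d) = c*E[min(T,d)] + P(T>d)*(R + E(d))
# gives the closed form below.
def expected_cost(d, q=0.5, lf=10.0, ls=0.1, c=1.0, R=1.0):
    p_fail = q * math.exp(-lf * d) + (1 - q) * math.exp(-ls * d)
    wait = (q * (1 - math.exp(-lf * d)) / lf
            + (1 - q) * (1 - math.exp(-ls * d)) / ls)
    return (c * wait + R * p_fail) / (1 - p_fail)

# Discretize the timeout parameter, as in the MDP translation:
# every grid point becomes one action and we take the cheapest.
grid = [0.05 * k for k in range(1, 201)]   # d in (0, 10]
best = min(grid, key=expected_cost)
print(best, expected_cost(best))  # interior optimum: restart
                                  # early if the slow mode hit
```

Each grid point plays the role of one action of the discretized MDP; picking the cheapest is the one-state analogue of solving that MDP.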
Extension of PRISM by Synthesis of Optimal Timeouts in Fixed-Delay CTMC
We present a practically appealing extension of the probabilistic model
checker PRISM, enabling it to handle fixed-delay continuous-time Markov chains
(fdCTMCs) with rewards, a formalism equivalent to deterministic and
stochastic Petri nets (DSPNs). fdCTMCs allow transitions with fixed delays (or
timeouts) on top of the traditional transitions with exponential rates. Our
extension supports an evaluation of expected reward until reaching a given set
of target states. The main contribution is that, considering the fixed-delays
as parameters, we implemented a synthesis algorithm that computes the
epsilon-optimal values of the fixed-delays minimizing the expected reward. We
provide a performance evaluation of the synthesis on practical examples.
Mean-Payoff Optimization in Continuous-Time Markov Chains with Parametric Alarms
Continuous-time Markov chains with alarms (ACTMCs) allow for alarm events
that can be non-exponentially distributed. Within parametric ACTMCs, the
parameters of alarm-event distributions are not given explicitly and can be
the subject of parameter synthesis. An algorithm solving the epsilon-optimal
parameter synthesis problem for parametric ACTMCs with long-run average
optimization objectives is presented. Our approach is based on reduction of the
problem to finding long-run average optimal strategies in semi-Markov decision
processes (semi-MDPs) and sufficient discretization of parameter (i.e., action)
space. Since the set of actions in the discretized semi-MDP can be very large,
a straightforward approach based on explicit action-space construction fails to
solve even simple instances of the problem. The presented algorithm uses an
enhanced policy iteration on symbolic representations of the action space. The
soundness of the algorithm is established for parametric ACTMCs with
alarm-event distributions satisfying four mild assumptions that are shown to
hold for uniform, Dirac and Weibull distributions in particular, but are
satisfied by many other distributions as well. An experimental implementation
shows that the symbolic technique substantially improves the efficiency of the
synthesis algorithm and makes it possible to solve instances of realistic size.
Comment: This article is a full version of a paper accepted to the Conference
on Quantitative Evaluation of SysTems (QEST) 2017.
Weak MSO+U with Path Quantifiers over Infinite Trees
This paper shows that over infinite trees, satisfiability is decidable for
weak monadic second-order logic extended by the unbounding quantifier U and
quantification over infinite paths. The proof is by reduction to emptiness for
a certain automaton model, while emptiness for the automaton model is decided
using profinite trees.
Comment: Version of an ICALP 2014 paper with appendices.
Long-term variability of drought indices in the Czech Lands and effects of external forcings and large-scale climate variability modes
While a considerable number of records document the temporal variability of
droughts for central Europe, the understanding of its underlying causes remains
limited. In this contribution, time series of three drought indices (Standardized Precipitation Index –
SPI; Standardized Precipitation Evapotranspiration Index – SPEI; Palmer Drought Severity Index – PDSI) are analyzed with regard to mid- to long-term drought variability
in the Czech Lands and its potential links to external forcings and internal
climate variability modes over the 1501–2006 period. Employing instrumental
and proxy-based data characterizing the external climate forcings (solar and
volcanic activity, greenhouse gases) in parallel with series representing the
activity of selected climate variability modes (El Niño–Southern
Oscillation – ENSO; Atlantic Multidecadal Oscillation – AMO; Pacific
Decadal Oscillation – PDO; North Atlantic Oscillation – NAO), regression
and wavelet analyses were deployed to identify and quantify the temporal
variability patterns of drought indices and similarity between individual
signals. Aside from a strong connection to the NAO, temperatures in the AMO
and (particularly) PDO regions were identified as possible drivers
of inter-decadal variability in the Czech drought regime. Colder and wetter
episodes were found to coincide with increased volcanic activity, especially
in summer, while no clear signature of solar activity was found. In addition
to identification of the links themselves, their temporal stability and
structure of their shared periodicities were investigated. The oscillations
at periods of approximately 60–100 years were found to be potentially
relevant in establishing the teleconnections affecting the long-term
variability of central European droughts.
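The standardization behind indices such as the SPI can be illustrated in a crude form. A real SPI fits a gamma distribution to precipitation per calendar month and maps quantiles through the inverse normal CDF; the sketch below, with invented numbers, only z-scores a series.

```python
# Crude z-score stand-in for SPI-like standardization; the
# precipitation data are invented for illustration.
def z_index(series):
    m = sum(series) / len(series)
    s = (sum((x - m) ** 2 for x in series) / len(series)) ** 0.5
    return [(x - m) / s for x in series]

precip = [55, 60, 30, 80, 45, 20, 70, 65, 40, 75, 35, 50]
index = z_index(precip)
# A common convention flags z <= -1 as a marked deficit:
drought_months = [i for i, z in enumerate(index) if z <= -1.0]
print(drought_months)
```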