Transient Reward Approximation for Continuous-Time Markov Chains
We are interested in the analysis of very large continuous-time Markov chains
(CTMCs) with many distinct rates. Such models arise naturally in the context of
reliability analysis, e.g., the performability analysis of computer networks,
power grids, and computer virus vulnerability, and in the study of crowd
dynamics. We use abstraction techniques together with novel algorithms for the
computation of bounds on the expected final and accumulated rewards in
continuous-time Markov decision processes (CTMDPs). These ingredients are
combined in a partly symbolic and partly explicit (symblicit) analysis
approach. In particular, we circumvent the use of multi-terminal decision
diagrams, because the latter do not work well if facing a large number of
different rates. We demonstrate the practical applicability and efficiency of
the approach on two case studies.

Comment: Accepted for publication in IEEE Transactions on Reliability.
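The transient quantities this abstract targets (expected final rewards of a CTMC) are classically computed via uniformisation, which reduces transient analysis to a Poisson-weighted sum over a discrete-time chain. The sketch below is that standard textbook construction, not the paper's symblicit algorithm; the two-state chain and all names are invented for illustration:

```python
import math

def uniformization(Q, pi0, t, eps=1e-10):
    """Transient distribution pi(t) of a CTMC via uniformisation.

    Q is the generator matrix (rows sum to 0), pi0 the initial
    distribution.  The Poisson series is truncated once the
    remaining tail mass drops below eps.
    """
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n))          # uniformisation rate
    # Embedded DTMC matrix P = I + Q / lam
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    w = math.exp(-lam * t)                          # Poisson weight for k = 0
    acc = w                                         # accumulated Poisson mass
    v = list(pi0)                                   # current term pi0 * P^k
    out = [w * x for x in v]
    k = 0
    while 1.0 - acc > eps:
        k += 1
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        w *= lam * t / k
        acc += w
        out = [o + w * x for o, x in zip(out, v)]
    return out

# Two-state chain: state 0 -> 1 at rate 2, state 1 -> 0 at rate 1.
Q = [[-2.0, 2.0], [1.0, -1.0]]
pi_t = uniformization(Q, [1.0, 0.0], t=1.0)
# Expected final reward for reward vector r = [0, 1] is just pi_t[1];
# the closed form here is (2/3) * (1 - e^{-3}).
```

The symblicit approach in the paper replaces the explicit matrices used here with a mixed symbolic/explicit representation precisely because, at scale, structures like multi-terminal decision diagrams degrade when many distinct rates appear.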
Maximal Cost-Bounded Reachability Probability on Continuous-Time Markov Decision Processes
In this paper, we consider multi-dimensional maximal cost-bounded
reachability probability over continuous-time Markov decision processes
(CTMDPs). Our major contributions are as follows. Firstly, we derive an
integral characterization which states that the maximal cost-bounded
reachability probability function is the least fixed point of a system of
integral equations. Secondly, we prove that the maximal cost-bounded
reachability probability can be attained by a measurable deterministic
cost-positional scheduler. Thirdly, we provide a numerical approximation
algorithm for maximal cost-bounded reachability probability. We present these
results under both the early and late scheduler settings.
Efficient approximation of optimal control for continuous-time Markov games
We study the time-bounded reachability problem for continuous-time Markov decision processes (CTMDPs) and games (CTMGs). Existing techniques for this problem use discretisation to partition time into discrete intervals of size ε, and optimal control is approximated for each interval separately. Current techniques provide an accuracy of O(ε^2) on each interval, which leads to an infeasibly large number of intervals. We propose a sequence of approximations that achieve accuracies of O(ε^3), O(ε^4), and O(ε^5), which allow us to drastically reduce the number of intervals that are considered. For CTMDPs, the performance of the resulting algorithms is comparable to the heuristic approach given by Buchholz and Schulz, while also being theoretically justified. All of our results generalise to CTMGs, where they yield the first practically implementable algorithms for this problem. We also provide memoryless strategies for both players that achieve similar error bounds.
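The baseline discretisation the abstract improves upon is first-order backward induction: within each interval of length δ, at most one jump is accounted for, giving O(δ^2) error per interval. A minimal sketch of that baseline (not the paper's higher-order schemes; the model, names, and step count are illustrative assumptions):

```python
import math

def max_reach(states, goal, rates, trans, T, n_steps):
    """First-order discretised value iteration for maximal time-bounded
    reachability in a CTMDP.

    rates[s][a]  -- exit rate of action a in state s
    trans[s][a]  -- dict mapping successor state -> probability
    """
    delta = T / n_steps
    f = {s: (1.0 if s in goal else 0.0) for s in states}
    for _ in range(n_steps):
        g = {}
        for s in states:
            if s in goal:
                g[s] = 1.0          # goal states stay at probability 1
                continue
            best = 0.0
            for a in rates[s]:
                lam = rates[s][a]
                jump = sum(p * f[t] for t, p in trans[s][a].items())
                # One jump with prob. ~ lam * delta, otherwise stay put.
                val = lam * delta * jump + (1.0 - lam * delta) * f[s]
                best = max(best, val)
            g[s] = best
        f = g
    return f

# One non-goal state with a fast (rate 3) and a slow (rate 1) action,
# both leading to the absorbing goal state 'g'.
rates = {'s': {'fast': 3.0, 'slow': 1.0}}
trans = {'s': {'fast': {'g': 1.0}, 'slow': {'g': 1.0}}}
probs = max_reach({'s', 'g'}, {'g'}, rates, trans, T=1.0, n_steps=10000)
# The optimal scheduler always picks 'fast'; the exact value is 1 - e^{-3}.
```

Because each interval contributes O(δ^2) error, hitting a global accuracy ε forces roughly T^2/ε intervals, which is exactly the blow-up the paper's O(δ^3) to O(δ^5) approximations avoid.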
Bisimulations and Logical Characterizations on Continuous-time Markov Decision Processes
In this paper we study strong and weak bisimulation equivalences for
continuous-time Markov decision processes (CTMDPs) and the logical
characterizations of these relations with respect to the continuous-time
stochastic logic (CSL). For strong bisimulation, it is well known that it is
strictly finer than CSL equivalence. In this paper we propose strong and weak
bisimulations for CTMDPs and show that for a subclass of CTMDPs, strong and
weak bisimulations are both sound and complete with respect to the equivalences
induced by CSL and by the sub-logic of CSL without the next operator, respectively. We
then consider a standard extension of CSL and show that it and its sub-logic
without the next operator can be fully characterized by strong and weak bisimulations,
respectively, over arbitrary CTMDPs.

Comment: The conference version of this paper was published at VMCAI 201
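Strong bisimulations of the kind studied here are typically computed by partition refinement: start from a partition by state labels and split blocks whose states disagree on the cumulative rate into some block. The sketch below does this in the simpler setting of a labelled CTMC rather than the paper's CTMDPs; the model and names are purely illustrative:

```python
def strong_bisimulation(states, label, rate):
    """Naive partition refinement for strong bisimulation on a
    labelled CTMC.  rate[s] maps successors to transition rates.
    """
    # Initial partition: group states by their label.
    by_label = {}
    for s in states:
        by_label.setdefault(label[s], set()).add(s)
    partition = list(by_label.values())
    changed = True
    while changed:
        changed = False
        refined = []
        for block in partition:
            # Signature of s: cumulative rate into each current block.
            sig = {}
            for s in block:
                key = tuple(sum(rate[s].get(t, 0.0) for t in b)
                            for b in partition)
                sig.setdefault(key, set()).add(s)
            if len(sig) > 1:
                changed = True      # block was split; iterate again
            refined.extend(sig.values())
        partition = refined
    return partition

# States a, b carry label 'p'; c, d carry label 'q'.
# a jumps to c at rate 2, b at rate 1, so a and b must be split.
states = ['a', 'b', 'c', 'd']
label = {'a': 'p', 'b': 'p', 'c': 'q', 'd': 'q'}
rate = {'a': {'c': 2.0}, 'b': {'c': 1.0}, 'c': {}, 'd': {}}
blocks = strong_bisimulation(states, label, rate)
```

Soundness with respect to CSL then amounts to showing that states in the same final block satisfy the same formulas, which is the direction the paper makes precise (and completes) for its subclass of CTMDPs.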
Policy learning in Continuous-Time Markov Decision Processes using Gaussian Processes
Continuous-time Markov decision processes provide a very powerful mathematical framework for policy-making problems in a wide range of applications, ranging from the control of populations to cyber-physical systems. The key problem for these models is to efficiently compute an optimal policy that controls the system so as to maximise the probability of satisfying a set of temporal logic specifications. Here we introduce a novel method based on statistical model checking and an unbiased estimate of a functional gradient in the space of possible policies. Our approach has several advantages over classical methods based on discretisation techniques: it does not require a priori knowledge of the model, which can be replaced by a black box, and it does not suffer from state-space explosion. The use of a stochastic moment-based gradient ascent algorithm to guide the search considerably improves the efficiency of learning policies, and a momentum term accelerates convergence. We demonstrate the strong performance of our approach on two examples of non-linear population models: an epidemiology model with no permanent recovery and a queuing system with non-deterministic choice.
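The core update the abstract relies on, stochastic gradient ascent with a momentum term over noisy gradient estimates, can be sketched in isolation. The objective below is a toy concave function, not the paper's policy space or statistical model checker; all names and constants are illustrative assumptions:

```python
import random

def momentum_ascent(grad_est, theta, lr=0.05, beta=0.9, steps=2000):
    """Stochastic gradient ascent with momentum.

    grad_est returns an unbiased but noisy estimate of the gradient
    at theta; the momentum term v averages these estimates, damping
    the noise and accelerating progress along consistent directions.
    """
    v = 0.0
    for _ in range(steps):
        v = beta * v + (1.0 - beta) * grad_est(theta)
        theta += lr * v
    return theta

random.seed(0)
# Toy concave objective J(theta) = -(theta - 2)^2; its true gradient
# -2 * (theta - 2) is observed through zero-mean Gaussian noise,
# mimicking a gradient estimated from sampled trajectories.
noisy_grad = lambda th: -2.0 * (th - 2.0) + random.gauss(0.0, 0.5)
theta_star = momentum_ascent(noisy_grad, theta=0.0)
```

In the paper's setting, `grad_est` would be replaced by the unbiased functional-gradient estimator built from statistical model checking runs against the black-box system, with the same momentum smoothing applied on top.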