66 research outputs found
Bisimulations and Logical Characterizations on Continuous-time Markov Decision Processes
In this paper we study strong and weak bisimulation equivalences for
continuous-time Markov decision processes (CTMDPs) and the logical
characterizations of these relations with respect to the continuous-time
stochastic logic (CSL). For strong bisimulation, it is well known that it is
strictly finer than CSL equivalence. In this paper we propose strong and weak
bisimulations for CTMDPs and show that for a subclass of CTMDPs, strong and
weak bisimulations are both sound and complete with respect to the equivalences
induced by CSL and the sub-logic of CSL without next operator respectively. We
then consider a standard extension of CSL, and show that it and its sub-logic
without X can be fully characterized by strong and weak bisimulations
respectively over arbitrary CTMDPs. Comment: The conference version of this paper was published at VMCAI 201
Transient Reward Approximation for Continuous-Time Markov Chains
We are interested in the analysis of very large continuous-time Markov chains
(CTMCs) with many distinct rates. Such models arise naturally in the context of
reliability analysis, e.g., of computer network performability analysis, of
power grids, of computer virus vulnerability, and in the study of crowd
dynamics. We use abstraction techniques together with novel algorithms for the
computation of bounds on the expected final and accumulated rewards in
continuous-time Markov decision processes (CTMDPs). These ingredients are
combined in a partly symbolic and partly explicit (symblicit) analysis
approach. In particular, we circumvent the use of multi-terminal decision
diagrams, because the latter do not work well if facing a large number of
different rates. We demonstrate the practical applicability and efficiency of
the approach on two case studies. Comment: Accepted for publication in IEEE Transactions on Reliabilit
Efficient Approximation of Optimal Control for Continuous-Time Markov Games
We study the time-bounded reachability problem for continuous-time Markov decision processes (CTMDPs) and games (CTMGs). Existing techniques for this problem use discretization to break time into discrete intervals, and optimal control is approximated for each interval separately. Current techniques provide an accuracy of O(epsilon^2) on each interval, which leads to an infeasibly large number of intervals. We propose a sequence of approximations that achieve accuracies of O(epsilon^3), O(epsilon^4), and O(epsilon^5), which allow us to drastically reduce the number of intervals that are considered. For CTMDPs, the resulting algorithms are comparable to the heuristic approach given by Buchholz and Schulz, while also being theoretically justified. All of our results generalise to CTMGs, where they yield the first practically implementable algorithms for this problem. We also provide positional strategies for both players that achieve similar error bounds.
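The interval-by-interval idea described in this abstract can be illustrated with a minimal first-order sketch. This is not the authors' algorithm: it is a plain one-jump-per-interval value iteration, and the function name `time_bounded_reachability`, the `rates` encoding, and the example model are all assumptions made for illustration. It assumes the step size times the largest exit rate is at most 1.

```python
import math


def time_bounded_reachability(rates, goal, horizon, steps):
    """Approximate the maximal probability of reaching `goal` within `horizon`.

    rates[s] is a list of actions; each action is a dict {successor: rate}.
    Hypothetical encoding, chosen only for this sketch.
    """
    eps = horizon / steps  # length of one discrete interval
    value = [1.0 if s in goal else 0.0 for s in range(len(rates))]
    for _ in range(steps):
        new_value = list(value)
        for s, actions in enumerate(rates):
            if s in goal or not actions:
                continue  # goal states stay at 1, states without actions keep their value
            # First-order step: allow at most one jump per interval and
            # pick the action with the best expected value.
            new_value[s] = max(
                value[s] + eps * sum(r * (value[t] - value[s]) for t, r in succ.items())
                for succ in actions
            )
        value = new_value
    return value


# Two actions in state 0 lead to the goal state 1 with rate 1.0 or 2.0.
example = [[{1: 1.0}, {1: 2.0}], []]
result = time_bounded_reachability(example, goal={1}, horizon=1.0, steps=1000)
exact = 1.0 - math.exp(-2.0)  # always taking the faster action is optimal here
```

With a coarse per-interval scheme like this one, many intervals are needed for a tight overall bound, which is the cost the higher-order approximations in the abstract aim to reduce.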
Maximal Cost-Bounded Reachability Probability on Continuous-Time Markov Decision Processes
In this paper, we consider multi-dimensional maximal cost-bounded
reachability probability over continuous-time Markov decision processes
(CTMDPs). Our major contributions are as follows. Firstly, we derive an
integral characterization which states that the maximal cost-bounded
reachability probability function is the least fixed point of a system of
integral equations. Secondly, we prove that the maximal cost-bounded
reachability probability can be attained by a measurable deterministic
cost-positional scheduler. Thirdly, we provide a numerical approximation
algorithm for maximal cost-bounded reachability probability. We present these
results in the setting of both early and late schedulers.
A tutorial on interactive Markov chains
Interactive Markov chains (IMCs) constitute a powerful stochastic model that extends both continuous-time Markov chains and labelled transition systems. IMCs enable a wide range of modelling and analysis techniques and serve as a semantic model for many industrial and scientific formalisms, such as AADL, GSPNs and many more. Applications cover various engineering contexts ranging from industrial system-on-chip manufacturing to satellite designs. We present a survey of the state-of-the-art in modelling and analysis of IMCs.
We cover a set of techniques that can be utilised for compositional modelling, state space generation and reduction, and model checking. The significance of the presented material and corresponding tools is highlighted through multiple case studies.
Efficient approximation of optimal control for continuous-time Markov games
We study the time-bounded reachability problem for continuous-time Markov decision processes (CTMDPs) and games (CTMGs). Existing techniques for this problem use discretisation techniques to partition time into discrete intervals of size ε, and optimal control is approximated for each interval separately. Current techniques provide an accuracy of O(ε^2) on each interval, which leads to an infeasibly large number of intervals. We propose a sequence of approximations that achieve accuracies of O(ε^3), O(ε^4), and O(ε^5), which allow us to drastically reduce the number of intervals that are considered. For CTMDPs, the performance of the resulting algorithms is comparable to the heuristic approach given by Buchholz and Schulz, while also being theoretically justified. All of our results generalise to CTMGs, where they yield the first practically implementable algorithms for this problem. We also provide memoryless strategies for both players that achieve similar error bounds.
Efficient Approximation of Optimal Control for Markov Games
We study the time-bounded reachability problem for continuous-time Markov
decision processes (CTMDPs) and games (CTMGs). Existing techniques for this
problem use discretisation techniques to break time into discrete intervals,
and optimal control is approximated for each interval separately. Current
techniques provide an accuracy of O(\epsilon^2) on each interval, which leads
to an infeasibly large number of intervals. We propose a sequence of
approximations that achieve accuracies of O(\epsilon^3), O(\epsilon^4), and
O(\epsilon^5), that allow us to drastically reduce the number of intervals that
are considered. For CTMDPs, the performance of the resulting algorithms is
comparable to the heuristic approach given by Buchholz and Schulz, while also
being theoretically justified. All of our results generalise to CTMGs, where
our results yield the first practically implementable algorithms for this
problem. We also provide positional strategies for both players that achieve
similar error bounds.
Formal Modelling for Multi-Robot Systems Under Uncertainty
Purpose of Review: To effectively synthesise and analyse multi-robot
behaviour, we require formal task-level models which accurately capture
multi-robot execution. In this paper, we review modelling formalisms for
multi-robot systems under uncertainty, and discuss how they can be used for
planning, reinforcement learning, model checking, and simulation.
Recent Findings: Recent work has investigated models which more accurately
capture multi-robot execution by considering different forms of uncertainty,
such as temporal uncertainty and partial observability, and modelling the
effects of robot interactions on action execution. Other strands of work have
presented approaches for reducing the size of multi-robot models to admit more
efficient solution methods. This can be achieved by decoupling the robots under
independence assumptions, or reasoning over higher level macro actions.
Summary: Existing multi-robot models demonstrate a trade-off between
accurately capturing robot dependencies and uncertainty, and being small enough
to tractably solve real-world problems. Therefore, future research should
exploit realistic assumptions over multi-robot behaviour to develop smaller
models which retain accurate representations of uncertainty and robot
interactions; and exploit the structure of multi-robot problems, such as
factored state spaces, to develop scalable solution methods. Comment: 23 pages, 0 figures, 2 tables. Current Robotics Reports (2023). This
version of the article has been accepted for publication, after peer review
(when applicable) but is not the Version of Record and does not reflect
post-acceptance improvements, or any corrections. The Version of Record is
available online at: https://dx.doi.org/10.1007/s43154-023-00104-
Policy learning in Continuous-Time Markov Decision Processes using Gaussian Processes
Continuous-time Markov decision processes provide a very powerful mathematical framework to solve policy-making problems in a wide range of applications, ranging from the control of populations to cyber–physical systems. The key problem to solve for these models is to efficiently compute an optimal policy to control the system in order to maximise the probability of satisfying a set of temporal logic specifications. Here we introduce a novel method based on statistical model checking and an unbiased estimation of a functional gradient in the space of possible policies. Our approach presents several advantages over the classical methods based on discretisation techniques, as it does not assume a priori knowledge of the model, which can be replaced by a black box, and does not suffer from state-space explosion. The use of a stochastic moment-based gradient ascent algorithm to guide our search considerably improves the efficiency of learning policies and accelerates the convergence using the momentum term. We demonstrate the strong performance of our approach on two examples of non-linear population models: an epidemiology model with no permanent recovery and a queuing system with non-deterministic choice.