66 research outputs found

    Bisimulations and Logical Characterizations on Continuous-time Markov Decision Processes

    We study strong and weak bisimulation equivalences for continuous-time Markov decision processes (CTMDPs) and the logical characterizations of these relations with respect to the continuous-time stochastic logic (CSL). Strong bisimulation is well known to be strictly finer than CSL equivalence. We propose strong and weak bisimulations for CTMDPs and show that, for a subclass of CTMDPs, they are sound and complete with respect to the equivalences induced by CSL and by the sub-logic of CSL without the next operator, respectively. We then consider a standard extension of CSL and show that it and its sub-logic without the next operator (X) can be fully characterized by strong and weak bisimulations, respectively, over arbitrary CTMDPs. Comment: The conference version of this paper was published at VMCAI 201

    Transient Reward Approximation for Continuous-Time Markov Chains

    We are interested in the analysis of very large continuous-time Markov chains (CTMCs) with many distinct rates. Such models arise naturally in reliability analysis, e.g., in computer network performability analysis, power grids, and computer virus vulnerability, and in the study of crowd dynamics. We use abstraction techniques together with novel algorithms for the computation of bounds on the expected final and accumulated rewards in continuous-time Markov decision processes (CTMDPs). These ingredients are combined in a partly symbolic and partly explicit (symblicit) analysis approach. In particular, we circumvent the use of multi-terminal decision diagrams, because they do not work well when facing a large number of different rates. We demonstrate the practical applicability and efficiency of the approach on two case studies. Comment: Accepted for publication in IEEE Transactions on Reliabilit
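For orientation, the quantity being bounded in the abstract above (the expected final reward of a CTMC at time t) can be computed exactly on small models by uniformisation. The sketch below is a standard textbook technique and not the symblicit algorithm of the paper. The function name `expected_final_reward`, the dense rate-matrix input, and the truncation tolerance are all illustrative assumptions.

```python
import math

def expected_final_reward(rates, reward, init, t, tol=1e-12, max_terms=100_000):
    """Expected reward of the state occupied at time t, via uniformisation.

    rates[i][j] is the transition rate from state i to state j (assumed input
    format), reward[i] the reward of state i, init the initial distribution.
    """
    n = len(rates)
    exit_rates = [sum(row) for row in rates]
    lam = max(exit_rates) or 1.0  # uniformisation rate; 1.0 if every state is absorbing
    # One step of the uniformised discrete-time chain: P = I + Q / lam.
    step = [[rates[i][j] / lam + ((1.0 - exit_rates[i] / lam) if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    dist = list(init)
    weight = math.exp(-lam * t)  # Poisson probability of 0 jumps
    mass = 0.0                   # Poisson mass accumulated so far
    total = 0.0
    for k in range(1, max_terms + 1):
        total += weight * sum(d * r for d, r in zip(dist, reward))
        mass += weight
        if mass >= 1.0 - tol:    # remaining Poisson tail is below tol
            break
        dist = [sum(dist[i] * step[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * t / k    # Poisson probability of k jumps
    return total
```

For a two-state chain that moves from state 0 to an absorbing state 1 at rate 2, `expected_final_reward([[0.0, 2.0], [0.0, 0.0]], [0.0, 1.0], [1.0, 0.0], 1.0)` approximates 1 - e^(-2). This naive version underflows for large `lam * t` and does not scale to the very large models the abstract targets.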

    Efficient Approximation of Optimal Control for Continuous-Time Markov Games

    We study the time-bounded reachability problem for continuous-time Markov decision processes (CTMDPs) and games (CTMGs). Existing techniques for this problem use discretization to break time into discrete intervals, and optimal control is approximated for each interval separately. Current techniques provide an accuracy of O(ε^2) on each interval, which leads to an infeasibly large number of intervals. We propose a sequence of approximations that achieve accuracies of O(ε^3), O(ε^4), and O(ε^5), which allow us to drastically reduce the number of intervals that are considered. For CTMDPs, the resulting algorithms are comparable to the heuristic approach given by Buchholz and Schulz, while also being theoretically justified. All of our results generalise to CTMGs, where they yield the first practically implementable algorithms for this problem. We also provide positional strategies for both players that achieve similar error bounds.
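To make the baseline in the abstract above concrete, here is a minimal sketch of a first-order discretisation for maximal time-bounded reachability in a CTMDP. It is not the paper's higher-order scheme: it allows at most one jump per interval of length ε, which is the source of the O(ε^2) per-interval error. The model encoding (a dict mapping each state to its actions, each action to an exit rate and a successor distribution) and the function name are illustrative assumptions.

```python
import math

def time_bounded_reachability(ctmdp, goal, horizon, steps):
    """Approximate the maximal probability of reaching `goal` within `horizon`.

    ctmdp: {state: {action: (exit_rate, {successor: probability})}} (assumed format);
    every successor must itself be a key of ctmdp. Goal states are absorbing.
    """
    eps = horizon / steps
    value = {s: (1.0 if s in goal else 0.0) for s in ctmdp}
    for _ in range(steps):  # backward induction over the time intervals
        new_value = {}
        for s, actions in ctmdp.items():
            if s in goal or not actions:
                new_value[s] = value[s]
                continue
            best = 0.0
            for rate, successors in actions.values():
                p_jump = 1.0 - math.exp(-rate * eps)  # a jump occurs within eps
                after_jump = sum(p * value[t] for t, p in successors.items())
                # At most one jump per interval: either stay in s or jump once.
                best = max(best, (1.0 - p_jump) * value[s] + p_jump * after_jump)
            new_value[s] = best
        value = new_value
    return value
```

With a single non-goal state offering a "slow" action (rate 1) and a "fast" action (rate 3), both leading to the goal, the fast action is chosen in every interval and the result for horizon 1 is 1 - e^(-3). On general models the result depends on the number of steps, which is why higher-order accuracy per interval reduces the number of intervals needed.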

    Maximal Cost-Bounded Reachability Probability on Continuous-Time Markov Decision Processes

    In this paper, we consider multi-dimensional maximal cost-bounded reachability probability over continuous-time Markov decision processes (CTMDPs). Our major contributions are as follows. Firstly, we derive an integral characterization which states that the maximal cost-bounded reachability probability function is the least fixed point of a system of integral equations. Secondly, we prove that the maximal cost-bounded reachability probability can be attained by a measurable deterministic cost-positional scheduler. Thirdly, we provide a numerical approximation algorithm for maximal cost-bounded reachability probability. We present these results in the setting of both early and late schedulers.

    A tutorial on interactive Markov chains

    Interactive Markov chains (IMCs) constitute a powerful stochastic model that extends both continuous-time Markov chains and labelled transition systems. IMCs enable a wide range of modelling and analysis techniques and serve as a semantic model for many industrial and scientific formalisms, such as AADL, GSPNs and many more. Applications cover various engineering contexts ranging from industrial system-on-chip manufacturing to satellite designs. We present a survey of the state-of-the-art in modelling and analysis of IMCs. We cover a set of techniques that can be utilised for compositional modelling, state space generation and reduction, and model checking. The significance of the presented material and corresponding tools is highlighted through multiple case studies.

    Efficient approximation of optimal control for continuous-time Markov games

    We study the time-bounded reachability problem for continuous-time Markov decision processes (CTMDPs) and games (CTMGs). Existing techniques for this problem use discretisation to partition time into discrete intervals of size ε, and optimal control is approximated for each interval separately. Current techniques provide an accuracy of O(ε^2) on each interval, which leads to an infeasibly large number of intervals. We propose a sequence of approximations that achieve accuracies of O(ε^3), O(ε^4), and O(ε^5), which allow us to drastically reduce the number of intervals that are considered. For CTMDPs, the performance of the resulting algorithms is comparable to the heuristic approach given by Buchholz and Schulz, while also being theoretically justified. All of our results generalise to CTMGs, where they yield the first practically implementable algorithms for this problem. We also provide memoryless strategies for both players that achieve similar error bounds.

    Efficient Approximation of Optimal Control for Markov Games

    We study the time-bounded reachability problem for continuous-time Markov decision processes (CTMDPs) and games (CTMGs). Existing techniques for this problem use discretisation to break time into discrete intervals, and optimal control is approximated for each interval separately. Current techniques provide an accuracy of O(ε^2) on each interval, which leads to an infeasibly large number of intervals. We propose a sequence of approximations that achieve accuracies of O(ε^3), O(ε^4), and O(ε^5), which allow us to drastically reduce the number of intervals that are considered. For CTMDPs, the performance of the resulting algorithms is comparable to the heuristic approach given by Buchholz and Schulz, while also being theoretically justified. All of our results generalise to CTMGs, where they yield the first practically implementable algorithms for this problem. We also provide positional strategies for both players that achieve similar error bounds.

    Formal Modelling for Multi-Robot Systems Under Uncertainty

    Purpose of Review: To effectively synthesise and analyse multi-robot behaviour, we require formal task-level models which accurately capture multi-robot execution. In this paper, we review modelling formalisms for multi-robot systems under uncertainty, and discuss how they can be used for planning, reinforcement learning, model checking, and simulation. Recent Findings: Recent work has investigated models which more accurately capture multi-robot execution by considering different forms of uncertainty, such as temporal uncertainty and partial observability, and by modelling the effects of robot interactions on action execution. Other strands of work have presented approaches for reducing the size of multi-robot models to admit more efficient solution methods. This can be achieved by decoupling the robots under independence assumptions, or by reasoning over higher-level macro actions. Summary: Existing multi-robot models demonstrate a trade-off between accurately capturing robot dependencies and uncertainty, and being small enough to tractably solve real-world problems. Therefore, future research should exploit realistic assumptions over multi-robot behaviour to develop smaller models which retain accurate representations of uncertainty and robot interactions, and exploit the structure of multi-robot problems, such as factored state spaces, to develop scalable solution methods. Comment: 23 pages, 0 figures, 2 tables. Current Robotics Reports (2023). This version of the article has been accepted for publication, after peer review (when applicable) but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://dx.doi.org/10.1007/s43154-023-00104-

    Policy learning in Continuous-Time Markov Decision Processes using Gaussian Processes

    Continuous-time Markov decision processes provide a very powerful mathematical framework to solve policy-making problems in a wide range of applications, ranging from the control of populations to cyber–physical systems. The key problem to solve for these models is to efficiently compute an optimal policy to control the system in order to maximise the probability of satisfying a set of temporal logic specifications. Here we introduce a novel method based on statistical model checking and an unbiased estimation of a functional gradient in the space of possible policies. Our approach presents several advantages over the classical methods based on discretisation techniques, as it does not assume a priori knowledge of a model, which can be replaced by a black box, and does not suffer from state-space explosion. The use of a stochastic moment-based gradient ascent algorithm to guide our search considerably improves the efficiency of learning policies and accelerates the convergence using the momentum term. We demonstrate the strong performance of our approach on two examples of non-linear population models: an epidemiology model with no permanent recovery and a queuing system with non-deterministic choice.