
    Learning Markov Decision Processes for Model Checking

    Constructing an accurate system model for formal model verification can be both resource-demanding and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically learning system models based on observed system behaviors. In this paper we extend algorithms for learning probabilistic automata to reactive systems, where the observed system behavior takes the form of alternating sequences of inputs and outputs. We propose an algorithm for automatically learning a deterministic labeled Markov decision process model from the observed behavior of a reactive system. The learning algorithm is adapted from algorithms for learning deterministic probabilistic finite automata, and extended to include both probabilistic and nondeterministic transitions. The algorithm is empirically analyzed and evaluated by learning system models of slot machines. The evaluation is performed by analyzing the probabilistic linear temporal logic properties of the system, as well as by analyzing the schedulers, in particular the optimal schedulers, induced by the learned models.
    Comment: In Proceedings QFM 2012, arXiv:1212.345
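
    As a concrete illustration, the Python sketch below shows the frequency-counting step that this style of model learning builds on: alternating input/output traces are folded into a deterministic labeled MDP whose output distributions are estimated from relative frequencies. This is a minimal sketch under assumed data structures, with illustrative names, not the paper's algorithm; in particular it identifies states with observation histories and omits the Alergia-style merging of compatible states that yields a compact model.

        from collections import defaultdict

        def estimate_mdp(traces):
            """Estimate a deterministic labeled MDP from observed traces.

            Each trace is a list of (input, output) pairs. In a deterministic
            labeled MDP, a state and an input determine a distribution over
            outputs, and (state, input, output) determines the successor state
            uniquely. Here states are identified with observation histories,
            i.e. no state merging is performed; the paper's algorithm
            additionally merges compatible states, in the style of Alergia
            for probabilistic automata, to obtain a compact model.
            """
            counts = defaultdict(lambda: defaultdict(int))
            for trace in traces:
                state = ()                     # initial state: the empty history
                for inp, out in trace:
                    counts[(state, inp)][out] += 1
                    state += ((inp, out),)     # deterministic successor state
            # Normalize output counts into estimated probabilities.
            return {
                (state, inp): {out: n / sum(outs.values()) for out, n in outs.items()}
                for (state, inp), outs in counts.items()
            }

        # Two observed runs of a slot machine: the estimated output
        # distribution after "pull" in the initial state is 50/50.
        traces = [[("pull", "win"), ("pull", "lose")],
                  [("pull", "lose"), ("pull", "lose")]]
        print(estimate_mdp(traces)[((), "pull")])   # {'win': 0.5, 'lose': 0.5}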

    Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints

    We consider the synthesis of control policies that maximize the probability of satisfying given temporal logic specifications in unknown, stochastic environments. We model the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities. The solution we develop builds on the so-called model-based probably approximately correct Markov decision process (PAC-MDP) methodology. The algorithm attains an ε-approximately optimal policy with probability 1 − δ using samples (i.e., observations), time and space that grow polynomially with the size of the MDP, the size of the automaton expressing the temporal logic specification, 1/ε, 1/δ and a finite time horizon. In this approach, the system maintains a model of the initially unknown MDP and constructs a product MDP based on its learned model and the specification automaton that expresses the temporal logic constraints. During execution, the policy is iteratively updated using observations of the transitions taken by the system. The iteration terminates in finitely many steps. With high probability, the resulting policy is such that, for any state, the difference between the probability of satisfying the specification under this policy and under the optimal one is within a predefined bound.
    Comment: 9 pages, 5 figures, accepted by 2014 Robotics: Science and Systems (RSS)
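
    The central construction named in the abstract, the product of the learned MDP with the specification automaton, can be sketched as follows. This is a minimal illustration under assumed encodings (dictionaries for the estimated transitions and for a total automaton transition function, with the automaton reading the label of the state being entered), not the paper's implementation.

        def product_mdp(transitions, labels, delta, automaton_states):
            """Product of a learned MDP and a deterministic specification automaton.

            transitions:      (s, a) -> {s2: p}   estimated transition probabilities
            labels:           s -> label          atomic propositions holding in s
            delta:            (q, label) -> q2    automaton transition function (total)
            automaton_states: iterable of automaton states q

            A product state (s, q) tracks the system state together with the
            progress made toward satisfying the specification; a policy that
            maximizes the probability of reaching accepting automaton states
            in the product maximizes the probability of satisfying the
            specification in the original MDP.
            """
            product = {}
            for (s, a), successors in transitions.items():
                for q in automaton_states:
                    product[((s, q), a)] = {
                        (s2, delta[(q, labels[s2])]): p
                        for s2, p in successors.items()
                    }
            return product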

    Should We Learn Probabilistic Models for Model Checking? A New Approach and An Empirical Study

    Many automated system analysis techniques (e.g., model checking, model-based testing) rely on first obtaining a model of the system under analysis. System modeling is often done manually, which is widely considered a hindrance to adopting model-based system analysis and development techniques. To overcome this problem, researchers have proposed to automatically "learn" models from sample system executions, and have shown that the learned models can sometimes be useful. There are, however, many questions to be answered. For instance, how much should we generalize from the observed samples, and how fast does learning converge? Would the analysis result based on the learned model be more accurate than an estimate obtained by sampling many system executions within the same amount of time? In this work, we investigate existing algorithms for learning probabilistic models for model checking, propose an evolution-based approach for better controlling the degree of generalization, and conduct an empirical study to answer these questions. One of our findings is that the effectiveness of learning may sometimes be limited.
    Comment: 15 pages, plus 2 reference pages, accepted by FASE 2017 in ETAPS
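
    To make the "degree of generalization" concrete, the sketch below pairs a BIC-style fitness function, where a coefficient mu trades data fit against model size, with a generic evolutionary loop over candidate models. Both are illustrative assumptions, not the paper's exact fitness function or genetic operators.

        import math
        import random

        def fitness(log_likelihood, num_states, num_obs, mu=0.5):
            """Trade off data fit against model size.

            Illustrative BIC-style objective, not the paper's exact fitness
            function: a larger mu prefers smaller, more general models, while
            a smaller mu prefers models that track the sample closely.
            """
            return log_likelihood - mu * num_states * math.log(num_obs)

        def evolve(population, evaluate, mutate, generations=50, keep=10):
            """Generic evolutionary loop over candidate models.

            population: list of candidate models (e.g. state partitions)
            evaluate:   model -> fitness value (e.g. fitness() with a chosen mu)
            mutate:     model -> perturbed model (e.g. split or merge a state)
            """
            for _ in range(generations):
                population.sort(key=evaluate, reverse=True)
                survivors = population[:keep]
                # Refill the population with mutated copies of survivors.
                population = survivors + [mutate(random.choice(survivors))
                                          for _ in range(len(population) - keep)]
            return max(population, key=evaluate)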

    Active Learning of Markov Decision Processes for System Verification
