
    An expectation transformer approach to predicate abstraction and data independence for probabilistic programs

    In this paper we revisit the well-known technique of predicate abstraction to characterise performance attributes of system models incorporating probability. We recast the theory using expectation transformers, and identify transformer properties which correspond to abstractions that nevertheless yield exact bounds on the performance of infinite-state probabilistic systems. In addition, we extend the developed technique to the special case of "data independent" programs incorporating probability. Finally, we demonstrate the subtlety of the extended technique by using the PRISM model checking tool to analyse an infinite-state protocol, obtaining exact bounds on its performance.
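
    As a rough, self-contained illustration of the expectation-transformer idea (not the paper's abstraction or data-independence theory), the weakest pre-expectation semantics of a tiny probabilistic language can be written out directly; the constructs, example program and post-expectation below are invented for illustration:

    # Statements are modelled as expectation transformers: each takes a
    # post-expectation (a function from a state dict to a number) and returns
    # the corresponding pre-expectation.

    def assign(var, expr):
        # wp(x := e, post) = post with x replaced by e
        return lambda post: (lambda s: post({**s, var: expr(s)}))

    def pchoice(p, left, right):
        # probabilistic choice: run `left` with probability p, `right` with 1 - p
        return lambda post: (lambda s: p * left(post)(s) + (1 - p) * right(post)(s))

    def seq(first, second):
        # sequential composition: wp(first; second, post) = wp(first, wp(second, post))
        return lambda post: first(second(post))

    # Illustrative program: x := x + 1 with probability 1/2, otherwise x := 0.
    prog = pchoice(0.5, assign("x", lambda s: s["x"] + 1), assign("x", lambda s: 0))
    post = lambda s: float(s["x"])      # post-expectation: the value of x
    print(prog(post)({"x": 3}))         # pre-expectation at x = 3  ->  2.0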

    Probabilistic Guarantees for Safe Deep Reinforcement Learning

    Deep reinforcement learning has been successfully applied to many control tasks, but the application of such agents in safety-critical scenarios has been limited due to safety concerns. Rigorous testing of these controllers is challenging, particularly when they operate in probabilistic environments due to, for example, hardware faults or noisy sensors. We propose MOSAIC, an algorithm for measuring the safety of deep reinforcement learning agents in stochastic settings. Our approach is based on the iterative construction of a formal abstraction of a controller's execution in an environment, and leverages probabilistic model checking of Markov decision processes to produce probabilistic guarantees on safe behaviour over a finite time horizon. It produces bounds on the probability of safe operation of the controller for different initial configurations and identifies regions where correct behaviour can be guaranteed. We implement and evaluate our approach on agents trained for several benchmark control problems.
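
    The final model-checking step can be illustrated independently of the tool: once the controller-environment abstraction is available as a finite MDP, a bound on the probability of reaching an unsafe abstract state within a finite horizon follows by backward induction (step-bounded reachability). The sketch below is generic; its toy MDP is invented, and MOSAIC itself constructs the abstraction from the trained agent and relies on a probabilistic model checker for the analysis:

    def max_unsafe_prob(states, actions, T, unsafe, horizon):
        # T[s][a] is a list of (next_state, probability) pairs.
        # val[s] = max over policies of the probability of reaching `unsafe`
        # from s within the remaining number of steps.
        val = {s: 1.0 if s in unsafe else 0.0 for s in states}
        for _ in range(horizon):
            new = {}
            for s in states:
                if s in unsafe:
                    new[s] = 1.0
                else:
                    new[s] = max(sum(p * val[t] for t, p in T[s][a]) for a in actions)
            val = new
        return val

    # Invented 3-state abstraction with two abstract actions.
    states = ["safe0", "safe1", "bad"]
    actions = ["left", "right"]
    T = {
        "safe0": {"left":  [("safe0", 0.9), ("bad", 0.1)],
                  "right": [("safe1", 1.0)]},
        "safe1": {"left":  [("safe0", 1.0)],
                  "right": [("safe1", 0.8), ("bad", 0.2)]},
        "bad":   {"left":  [("bad", 1.0)], "right": [("bad", 1.0)]},
    }
    print(max_unsafe_prob(states, actions, T, {"bad"}, horizon=5))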

    Metric Semantics and Full Abstractness for Action Refinement and Probabilistic Choice

    This paper provides a case study in the field of metric semantics for probabilistic programming. Both an operational and a denotational semantics are presented for an abstract process language L_pr, which features action refinement and probabilistic choice. The two models are constructed in the setting of complete ultrametric spaces, here based on probability measures of compact support over sequences of actions. It is shown that the standard toolkit for metric semantics works well in the probabilistic context of L_pr, e.g., in establishing the correctness of the denotational semantics with respect to the operational one. In addition, it is shown how the method of proving full abstraction, as proposed recently by the authors for a nondeterministic language with action refinement, can be adapted to deal with the probabilistic language L_pr as well.
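
    For readers unfamiliar with the metric setting, the standard Baire-style ultrametric on action sequences (a simplification of the measure-based distances used in the paper) already shows the key feature: the distance between two behaviours reflects how long they agree, and the strong triangle inequality is what makes Banach's fixed-point theorem available for contractive semantic maps. A small sketch, with invented example sequences:

    def baire_distance(u, v):
        # d(u, v) = 2 ** (-n), where n is the length of the longest common prefix;
        # identical sequences are at distance 0.
        n = 0
        for a, b in zip(u, v):
            if a != b:
                break
            n += 1
        if n == len(u) == len(v):
            return 0.0
        return 2.0 ** (-n)

    # Strong triangle inequality: d(u, w) <= max(d(u, v), d(v, w)).
    print(baire_distance("abc", "abd"))   # 0.25: the sequences share a prefix of length 2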

    Transient Reward Approximation for Continuous-Time Markov Chains

    We are interested in the analysis of very large continuous-time Markov chains (CTMCs) with many distinct rates. Such models arise naturally in the context of reliability analysis, e.g., of computer network performability, of power grids, of computer virus vulnerability, and in the study of crowd dynamics. We use abstraction techniques together with novel algorithms for the computation of bounds on the expected final and accumulated rewards in continuous-time Markov decision processes (CTMDPs). These ingredients are combined in a partly symbolic and partly explicit (symblicit) analysis approach. In particular, we circumvent the use of multi-terminal decision diagrams, because the latter do not work well if facing a large number of different rates. We demonstrate the practical applicability and efficiency of the approach on two case studies. Comment: Accepted for publication in IEEE Transactions on Reliability.
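
    The two reward quantities the abstract mentions, expected final and expected accumulated reward, can be illustrated on a tiny explicit CTMC with the standard uniformisation technique. This is a generic sketch, not the paper's symblicit algorithm: the generator Q, reward rates r, initial distribution p0, time bound t and truncation depth K below are all invented.

    import numpy as np

    Q = np.array([[-3.0, 2.0, 1.0],
                  [ 1.0,-1.0, 0.0],
                  [ 0.0, 0.0, 0.0]])      # CTMC generator (illustrative)
    r = np.array([0.0, 1.0, 5.0])         # state reward rates (illustrative)
    p0 = np.array([1.0, 0.0, 0.0])        # initial distribution

    def uniformized(Q):
        lam = max(-np.diag(Q)) * 1.02     # uniformisation rate >= maximal exit rate
        P = np.eye(len(Q)) + Q / lam      # DTMC of the uniformised chain
        return lam, P

    def final_and_accumulated_reward(Q, p0, r, t, K=200):
        lam, P = uniformized(Q)
        pk = p0.copy()                    # p0 * P^k, updated iteratively
        pois = np.exp(-lam * t)           # Poisson(k; lam*t) probability, k = 0
        cum = pois
        final = pois * pk                 # builds the transient distribution at time t
        acc = (1.0 - cum) / lam * pk      # builds the integral of the distribution over [0, t]
        for k in range(1, K + 1):
            pk = pk @ P
            pois *= lam * t / k
            cum += pois
            final = final + pois * pk
            acc = acc + (1.0 - cum) / lam * pk
        return final @ r, acc @ r         # expected final reward, expected accumulated reward

    print(final_and_accumulated_reward(Q, p0, r, t=2.0))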