
    Probabilistic Guarantees for Safe Deep Reinforcement Learning

    Deep reinforcement learning has been successfully applied to many control tasks, but the application of such agents in safety-critical scenarios has been limited due to safety concerns. Rigorous testing of these controllers is challenging, particularly when they operate in probabilistic environments due to, for example, hardware faults or noisy sensors. We propose MOSAIC, an algorithm for measuring the safety of deep reinforcement learning agents in stochastic settings. Our approach is based on the iterative construction of a formal abstraction of a controller's execution in an environment, and leverages probabilistic model checking of Markov decision processes to produce probabilistic guarantees on safe behaviour over a finite time horizon. It produces bounds on the probability of safe operation of the controller for different initial configurations and identifies regions where correct behaviour can be guaranteed. We implement and evaluate our approach on agents trained for several benchmark control problems.

    Quantitative Approximation of the Probability Distribution of a Markov Process by Formal Abstractions

    The goal of this work is to formally abstract a Markov process evolving in discrete time over a general state space as a finite-state Markov chain, with the objective of precisely approximating its state probability distribution in time, which allows for its approximate, faster computation by that of the Markov chain. The approach is based on formal abstractions and employs an arbitrary finite partition of the state space of the Markov process, and the computation of average transition probabilities between partition sets. The abstraction technique is formal, in that it comes with guarantees on the introduced approximation that depend on the diameters of the partitions: as such, they can be tuned at will. Further, in the case of Markov processes with unbounded state spaces, a procedure for precisely truncating the state space within a compact set is provided, together with an error bound that depends on the asymptotic properties of the transition kernel of the original process. The overall abstraction algorithm, which practically hinges on piecewise constant approximations of the density functions of the Markov process, is extended to higher-order function approximations: these can lead to improved error bounds and associated lower computational requirements. The approach is practically tested to compute probabilistic invariance of the Markov process under study, and is compared to a known alternative approach from the literature. Comment: 29 pages, Journal of Logical Methods in Computer Science
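
The partition-based abstraction described in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a hypothetical scalar process x' = a*x + w with Gaussian noise w, truncates the state space to an interval [lo, hi], uses a uniform partition, and evaluates transition probabilities from each cell's centre point rather than averaging over the cell. The function names (`build_abstraction`, `invariance_probability`) and all parameter values are illustrative assumptions.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian N(mean, std^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def build_abstraction(a, sigma, lo, hi, n_cells):
    """Abstract x' = a*x + w, w ~ N(0, sigma^2), on [lo, hi] as a finite chain.

    Assumption: transition probabilities are evaluated at cell centres
    (a piecewise-constant simplification), not averaged over each cell.
    Rows may sum to less than 1; the missing mass leaves [lo, hi].
    """
    width = (hi - lo) / n_cells
    edges = [lo + i * width for i in range(n_cells + 1)]
    centers = [lo + (i + 0.5) * width for i in range(n_cells)]
    matrix = []
    for c in centers:
        mean = a * c
        row = [
            normal_cdf(edges[j + 1], mean, sigma) - normal_cdf(edges[j], mean, sigma)
            for j in range(n_cells)
        ]
        matrix.append(row)
    return centers, matrix


def invariance_probability(matrix, horizon):
    """Probability of staying inside the partitioned set for `horizon` steps.

    Backward recursion v_{k+1} = P v_k with v_0 = 1 on every cell.
    """
    n = len(matrix)
    v = [1.0] * n
    for _ in range(horizon):
        v = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return v


if __name__ == "__main__":
    centers, P = build_abstraction(a=0.9, sigma=0.5, lo=-2.0, hi=2.0, n_cells=40)
    v = invariance_probability(P, horizon=10)
    for c, p in list(zip(centers, v))[::8]:
        print(f"x0 = {c:+.2f}: approx. P(stay in [-2, 2] for 10 steps) = {p:.4f}")
```

Refining the partition (a larger `n_cells`) shrinks the cell diameters, which is the quantity the abstract says the error guarantees depend on; the sketch itself computes no error bound.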

    Aggregation and Control of Populations of Thermostatically Controlled Loads by Formal Abstractions

    This work discusses a two-step procedure, based on formal abstractions, to generate a finite-space stochastic dynamical model as an aggregation of the continuous temperature dynamics of a homogeneous population of Thermostatically Controlled Loads (TCL). The temperature of a single TCL is described by a stochastic difference equation and the TCL status (ON, OFF) by a deterministic switching mechanism. The procedure is formal as it allows the exact quantification of the error introduced by the abstraction -- as such it builds and improves on a known, earlier approximation technique in the literature. Further, the contribution discusses the extension to the case of a heterogeneous population of TCL by means of two approaches resulting in the notion of approximate abstractions. It moreover investigates the problem of global (population-level) regulation and load balancing for the case of TCL that are dependent on a control input. The procedure is tested on a case study and benchmarked against the mentioned alternative approach in the literature. Comment: 40 pages, 21 figures; the paper generalizes the result of conference publication: S. Esmaeil Zadeh Soudjani and A. Abate, "Aggregation of Thermostatically Controlled Loads by Formal Abstractions," Proceedings of the European Control Conference 2013, pp. 4232-4237. version 2: added references for section

    On the connections between PCTL and Dynamic Programming

    Probabilistic Computation Tree Logic (PCTL) is a well-known modal logic which has become a standard for expressing temporal properties of finite-state Markov chains in the context of automated model checking. In this paper, we give a definition of PCTL for noncountable-space Markov chains, and we show that there is a substantial affinity between certain of its operators and problems of Dynamic Programming. After proving some uniqueness properties of the solutions to the latter, we conclude the paper with two examples to show that some recovery strategies in practical applications, which are naturally stated as reach-avoid problems, can actually be viewed as particular cases of PCTL formulas. Comment: Submitte
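
The link between reach-avoid problems and dynamic programming mentioned in this abstract can be sketched on a finite-state chain. The paper treats noncountable state spaces; the finite setting, the function name `bounded_reach_avoid`, and the random-walk example below are illustrative assumptions. The recursion computes the probability of the bounded-until property "stay in `safe` until reaching `target` within `horizon` steps".

```python
def bounded_reach_avoid(P, safe, target, horizon):
    """Probability of reaching `target` within `horizon` steps while
    remaining in `safe` beforehand, for each initial state.

    Dynamic programming recursion:
        V_0(x)     = 1 if x in target else 0
        V_{k+1}(x) = 1                        if x in target
                   = sum_y P[x][y] * V_k(y)   if x in safe (and not in target)
                   = 0                        otherwise
    """
    n = len(P)
    v = [1.0 if x in target else 0.0 for x in range(n)]
    for _ in range(horizon):
        v = [
            1.0 if x in target
            else (sum(P[x][y] * v[y] for y in range(n)) if x in safe else 0.0)
            for x in range(n)
        ]
    return v


def symmetric_walk(n_states):
    """Hypothetical example: symmetric random walk on 0..n_states-1,
    with self-loops at both end states."""
    P = [[0.0] * n_states for _ in range(n_states)]
    P[0][0] = 1.0
    P[n_states - 1][n_states - 1] = 1.0
    for i in range(1, n_states - 1):
        P[i][i - 1] = 0.5
        P[i][i + 1] = 0.5
    return P


if __name__ == "__main__":
    P = symmetric_walk(5)
    # State 0 is unsafe, state 4 is the target, states 1..3 are safe.
    for horizon in (1, 2, 10, 200):
        print(horizon, bounded_reach_avoid(P, safe={1, 2, 3}, target={4}, horizon=horizon))
```

As the horizon grows, the values for this walk approach the classical gambler's-ruin probabilities, which gives a simple sanity check on the recursion.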

    On the Performance of Short Block Codes over Finite-State Channels in the Rare-Transition Regime

    As the mobile application landscape expands, wireless networks are tasked with supporting different connection profiles, including real-time traffic and delay-sensitive communications. Among many ensuing engineering challenges is the need to better understand the fundamental limits of forward error correction in non-asymptotic regimes. This article characterizes the performance of random block codes over finite-state channels and evaluates their queueing performance under maximum-likelihood decoding. In particular, classical results from information theory are revisited in the context of channels with rare transitions, and bounds on the probabilities of decoding failure are derived for random codes. This creates an analysis framework where channel dependencies within and across codewords are preserved. Such results are subsequently integrated into a queueing problem formulation. For instance, it is shown that, for random coding on the Gilbert-Elliott channel, the performance analysis based on upper bounds on error probability provides very good estimates of system performance and optimum code parameters. Overall, this study offers new insights about the impact of channel correlation on the performance of delay-aware, point-to-point communication links. It also provides novel guidelines on how to select code rates and block lengths for real-time traffic over wireless communication infrastructures.
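
For readers unfamiliar with the Gilbert-Elliott channel named in this abstract, the following sketch models it as a two-state (good/bad) Markov chain with a state-dependent bit-error probability and computes the exact distribution of the number of bit errors in a block. It does not reproduce the paper's coding bounds or queueing analysis. The function names and the parameter values (`p_gb`, `p_bg`, `eps_g`, `eps_b`) are illustrative assumptions, chosen so that state transitions are rare.

```python
def stationary(p_gb, p_bg):
    """Stationary distribution (good, bad) of the two-state channel chain.

    p_gb: probability of moving good -> bad; p_bg: bad -> good.
    """
    total = p_gb + p_bg
    return (p_bg / total, p_gb / total)


def error_count_distribution(n, p_gb, p_bg, eps_g, eps_b):
    """Distribution of the number of bit errors in a block of n symbols.

    Forward recursion over (channel state, errors so far), with the channel
    started in its stationary distribution. Returns a list `dist` where
    dist[k] is the probability of exactly k errors.
    """
    trans = [[1.0 - p_gb, p_gb], [p_bg, 1.0 - p_bg]]
    eps = [eps_g, eps_b]
    alpha = [[0.0] * (n + 1) for _ in range(2)]
    alpha[0][0], alpha[1][0] = stationary(p_gb, p_bg)
    for _ in range(n):
        nxt = [[0.0] * (n + 1) for _ in range(2)]
        for s in range(2):
            for k in range(n + 1):
                mass = alpha[s][k]
                if mass == 0.0:
                    continue
                for t in range(2):
                    # Symbol received correctly in state s, then move to state t.
                    nxt[t][k] += mass * (1.0 - eps[s]) * trans[s][t]
                    # Symbol in error in state s, then move to state t.
                    if k < n:
                        nxt[t][k + 1] += mass * eps[s] * trans[s][t]
        alpha = nxt
    return [alpha[0][k] + alpha[1][k] for k in range(n + 1)]


if __name__ == "__main__":
    dist = error_count_distribution(n=50, p_gb=0.01, p_bg=0.05, eps_g=0.001, eps_b=0.2)
    tail = sum(dist[6:])
    print(f"P(more than 5 errors in a 50-bit block) = {tail:.4f}")
```

Because transitions are rare, errors cluster in blocks that visit the bad state, so the tail of this distribution is heavier than that of a memoryless channel with the same average error rate.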