Probabilistic Guarantees for Safe Deep Reinforcement Learning
Deep reinforcement learning has been successfully applied to many control
tasks, but the application of such agents in safety-critical scenarios has been
limited due to safety concerns. Rigorous testing of these controllers is
challenging, particularly when they operate in probabilistic environments due
to, for example, hardware faults or noisy sensors. We propose MOSAIC, an
algorithm for measuring the safety of deep reinforcement learning agents in
stochastic settings. Our approach is based on the iterative construction of a
formal abstraction of a controller's execution in an environment, and leverages
probabilistic model checking of Markov decision processes to produce
probabilistic guarantees on safe behaviour over a finite time horizon. It
produces bounds on the probability of safe operation of the controller for
different initial configurations and identifies regions where correct behaviour
can be guaranteed. We implement and evaluate our approach on agents trained for
several benchmark control problems.
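
To make the idea of a finite-horizon safety bound concrete, the following is a minimal sketch of backward recursion on a small, hand-written Markov decision process. It is not the MOSAIC algorithm: the three-state model, the state and action names, and the choice to take the minimum over actions (treating the abstraction's nondeterminism as adversarial) are all assumptions made for illustration.

```python
def min_safe_probability(transitions, unsafe, horizon):
    """Worst-case probability of avoiding `unsafe` states for `horizon` steps.

    transitions[s][a] is a list of (next_state, probability) pairs.
    """
    # With zero steps left, a state is safe with probability 1 unless it is unsafe.
    value = {s: 0.0 if s in unsafe else 1.0 for s in transitions}
    for _ in range(horizon):
        new_value = {}
        for s, actions in transitions.items():
            if s in unsafe:
                new_value[s] = 0.0
            else:
                # Assumption: nondeterminism is resolved adversarially (minimum).
                new_value[s] = min(
                    sum(p * value[t] for t, p in successors)
                    for successors in actions.values()
                )
        value = new_value
    return value


# Hypothetical abstraction with states "ok", "risky" and "fail".
toy_mdp = {
    "ok": {"a": [("ok", 0.9), ("risky", 0.1)]},
    "risky": {
        "a": [("ok", 0.5), ("fail", 0.5)],
        "b": [("risky", 0.8), ("fail", 0.2)],
    },
    "fail": {"a": [("fail", 1.0)]},
}
bounds = min_safe_probability(toy_mdp, unsafe={"fail"}, horizon=2)
```

Each entry of `bounds` is a lower bound on the probability of safe operation from the corresponding initial state, which mirrors the per-configuration bounds described in the abstract.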
Quantitative Approximation of the Probability Distribution of a Markov Process by Formal Abstractions
The goal of this work is to formally abstract a Markov process evolving in
discrete time over a general state space as a finite-state Markov chain, with
the objective of precisely approximating its state probability distribution in
time, which allows for its approximate, faster computation by that of the
Markov chain. The approach is based on formal abstractions and employs an
arbitrary finite partition of the state space of the Markov process, and the
computation of average transition probabilities between partition sets. The
abstraction technique is formal, in that it comes with guarantees on the
introduced approximation that depend on the diameters of the partitions: as
such, they can be tuned at will. Further, in the case of Markov processes with
unbounded state spaces, a procedure for precisely truncating the state space
within a compact set is provided, together with an error bound that depends on
the asymptotic properties of the transition kernel of the original process. The
overall abstraction algorithm, which practically hinges on piecewise constant
approximations of the density functions of the Markov process, is extended to
higher-order function approximations: these can lead to improved error bounds
and associated lower computational requirements. The approach is practically
tested to compute probabilistic invariance of the Markov process under study,
and is compared to a known alternative approach from the literature.
Comment: 29 pages, Journal of Logical Methods in Computer Scienc
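
As a rough illustration of abstracting a continuous-state Markov process into a finite chain, the sketch below partitions a truncated interval into equal cells and computes cell-to-cell transition probabilities. Several things here are assumptions rather than the paper's construction: the scalar linear-Gaussian dynamics `x' = a*x + noise`, the function and parameter names, and the use of cell centres as representative points in place of the average transition probabilities mentioned in the abstract. Probability mass that leaves the truncated interval is sent to an extra absorbing state.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def abstract_chain(a, sigma, bound, n_cells):
    """Finite abstraction of x' = a*x + N(0, sigma^2) on [-bound, bound].

    Returns an (n_cells + 1) x (n_cells + 1) row-stochastic matrix; the last
    state is absorbing and collects mass that leaves the interval.
    """
    width = 2.0 * bound / n_cells
    edges = [-bound + i * width for i in range(n_cells + 1)]
    centers = [(edges[i] + edges[i + 1]) / 2.0 for i in range(n_cells)]
    matrix = []
    for c in centers:
        mean = a * c  # next-state mean when starting from the cell centre
        row = [
            normal_cdf((edges[j + 1] - mean) / sigma)
            - normal_cdf((edges[j] - mean) / sigma)
            for j in range(n_cells)
        ]
        row.append(max(0.0, 1.0 - sum(row)))  # mass outside the interval
        matrix.append(row)
    matrix.append([0.0] * n_cells + [1.0])  # absorbing "outside" state
    return matrix


P = abstract_chain(a=0.8, sigma=0.5, bound=3.0, n_cells=12)
```

Refining the partition (a larger `n_cells`) shrinks the cell diameters, which is the tuning knob the abstract refers to when it says the error can be tuned at will.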
Aggregation and Control of Populations of Thermostatically Controlled Loads by Formal Abstractions
This work discusses a two-step procedure, based on formal abstractions, to
generate a finite-space stochastic dynamical model as an aggregation of the
continuous temperature dynamics of a homogeneous population of Thermostatically
Controlled Loads (TCL). The temperature of a single TCL is described by a
stochastic difference equation and the TCL status (ON, OFF) by a deterministic
switching mechanism. The procedure is formal as it allows the exact
quantification of the error introduced by the abstraction -- as such it builds
and improves on a known, earlier approximation technique in the literature.
Further, the contribution discusses the extension to the case of a
heterogeneous population of TCL by means of two approaches resulting in the
notion of approximate abstractions. It moreover investigates the problem of
global (population-level) regulation and load balancing for the case of TCL
that are dependent on a control input. The procedure is tested on a case study
and benchmarked against the mentioned alternative approach in the literature.
Comment: 40 pages, 21 figures; the paper generalizes the result of conference
publication: S. Esmaeil Zadeh Soudjani and A. Abate, "Aggregation of
Thermostatically Controlled Loads by Formal Abstractions," Proceedings of the
European Control Conference 2013, pp. 4232-4237. version 2: added references
for section
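
The single-TCL model described above (a stochastic difference equation for the temperature plus a deterministic ON/OFF switch) can be sketched as follows. The abstract does not give the equations, so this uses a common first-order cooling model with a deadband thermostat as an assumption; the function name, all parameter names, and all numerical values are hypothetical.

```python
import random


def simulate_tcl(steps, theta0, status0, a, ambient, cooling_gain,
                 setpoint, deadband, noise_std, rng):
    """Simulate one cooling TCL; returns lists of temperatures and statuses."""
    theta, status = theta0, status0
    temperatures, statuses = [theta], [status]
    for _ in range(steps):
        # Assumed stochastic difference equation: relax towards the ambient
        # temperature, lowered by cooling_gain while the unit is ON.
        target = ambient - status * cooling_gain
        theta = a * theta + (1.0 - a) * target + rng.gauss(0.0, noise_std)
        # Deterministic switching with a deadband around the setpoint.
        if theta > setpoint + deadband / 2.0:
            status = 1  # too warm: switch ON
        elif theta < setpoint - deadband / 2.0:
            status = 0  # too cold: switch OFF
        temperatures.append(theta)
        statuses.append(status)
    return temperatures, statuses


# Hypothetical parameters; noise is disabled so the run is deterministic.
temps, stats = simulate_tcl(
    steps=200, theta0=20.0, status0=0, a=0.9, ambient=32.0,
    cooling_gain=20.0, setpoint=20.0, deadband=1.0, noise_std=0.0,
    rng=random.Random(0),
)
```

An aggregate model of a population would then track how many units occupy each temperature bin and status, which is where a finite abstraction of this single-unit dynamics comes in.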
On the connections between PCTL and Dynamic Programming
Probabilistic Computation Tree Logic (PCTL) is a well-known modal logic which
has become a standard for expressing temporal properties of finite-state Markov
chains in the context of automated model checking. In this paper, we give a
definition of PCTL for noncountable-space Markov chains, and we show that there
is a substantial affinity between certain of its operators and problems of
Dynamic Programming. After proving some uniqueness properties of the solutions
to the latter, we conclude the paper with two examples to show that some
recovery strategies in practical applications, which are naturally stated as
reach-avoid problems, can be actually viewed as particular cases of PCTL
formulas.
Comment: Submitte
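
The link between reach-avoid problems and Dynamic Programming can be illustrated with the standard finite-horizon recursion below. The notation is an assumption and not taken from the paper: $\mathcal{X}$ is the state space, $T(dy \mid x)$ the transition kernel, $K$ the target set, $A \supseteq K$ the safe set, and $N$ the horizon.

```latex
V_N(x) = \mathbf{1}_K(x), \qquad
V_k(x) = \mathbf{1}_K(x)
       + \mathbf{1}_{A \setminus K}(x) \int_{\mathcal{X}} V_{k+1}(y)\, T(dy \mid x),
\quad k = N-1, \dots, 0.
```

Under these assumptions, $V_0(x)$ is the probability that a trajectory started at $x$ reaches $K$ within $N$ steps while staying in $A$ until then, which corresponds to a bounded-until property of the kind PCTL expresses.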
On the Performance of Short Block Codes over Finite-State Channels in the Rare-Transition Regime
As the mobile application landscape expands, wireless networks are tasked
with supporting different connection profiles, including real-time traffic and
delay-sensitive communications. Among many ensuing engineering challenges is
the need to better understand the fundamental limits of forward error
correction in non-asymptotic regimes. This article characterizes the
performance of random block codes over finite-state channels and evaluates
their queueing performance under maximum-likelihood decoding. In particular,
classical results from information theory are revisited in the context of
channels with rare transitions, and bounds on the probabilities of decoding
failure are derived for random codes. This creates an analysis framework where
channel dependencies within and across codewords are preserved. Such results
are subsequently integrated into a queueing problem formulation. For instance,
it is shown that, for random coding on the Gilbert-Elliott channel, the
performance analysis based on upper bounds on error probability provides very
good estimates of system performance and optimum code parameters. Overall, this
study offers new insights about the impact of channel correlation on the
performance of delay-aware, point-to-point communication links. It also
provides novel guidelines on how to select code rates and block lengths for
real-time traffic over wireless communication infrastructures.
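
For readers unfamiliar with the Gilbert-Elliott channel mentioned above, the sketch below simulates its bit-error process: a two-state Markov chain (good and bad) with a different error probability in each state. The function name, parameter names, and numerical values are hypothetical; this only generates error patterns and does not reproduce the paper's coding or queueing analysis.

```python
import random


def gilbert_elliott_errors(n_bits, p_good_to_bad, p_bad_to_good,
                           err_good, err_bad, rng, start_bad=False):
    """Return a list of booleans, True where a transmitted bit is in error."""
    bad = start_bad
    errors = []
    for _ in range(n_bits):
        # The error probability depends on the current channel state.
        err_prob = err_bad if bad else err_good
        errors.append(rng.random() < err_prob)
        # The channel state then evolves as a two-state Markov chain.
        switch_prob = p_bad_to_good if bad else p_good_to_bad
        if rng.random() < switch_prob:
            bad = not bad
    return errors


# Hypothetical rare-transition setting: the state changes slowly relative
# to the block length, so errors arrive in bursts.
pattern = gilbert_elliott_errors(
    n_bits=1000, p_good_to_bad=0.01, p_bad_to_good=0.05,
    err_good=0.001, err_bad=0.2, rng=random.Random(1),
)
```

Small transition probabilities keep the channel in one state for many consecutive bits, which is the dependence within and across codewords that the abstract says the analysis preserves.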