5,589 research outputs found
Probabilistic Guarantees for Safe Deep Reinforcement Learning
Deep reinforcement learning has been successfully applied to many control
tasks, but the application of such agents in safety-critical scenarios has been
limited due to safety concerns. Rigorous testing of these controllers is
challenging, particularly when they operate in probabilistic environments due
to, for example, hardware faults or noisy sensors. We propose MOSAIC, an
algorithm for measuring the safety of deep reinforcement learning agents in
stochastic settings. Our approach is based on the iterative construction of a
formal abstraction of a controller's execution in an environment, and leverages
probabilistic model checking of Markov decision processes to produce
probabilistic guarantees on safe behaviour over a finite time horizon. It
produces bounds on the probability of safe operation of the controller for
different initial configurations and identifies regions where correct behaviour
can be guaranteed. We implement and evaluate our approach on agents trained for
several benchmark control problems
Model checking embedded system designs
We survey the basic principles behind the application of model checking to controller verification and synthesis. A promising development is the area of guided model checking, in which the state space search strategy of the model checking algorithm can be influenced to visit more interesting sets of states first. In particular, we discuss how model checking can be combined with heuristic cost functions to guide search strategies. Finally, we list a number of current research developments, especially in the area of reachability analysis for optimal control and related issues
Communicating Processes with Data for Supervisory Coordination
We employ supervisory controllers to safely coordinate high-level
discrete(-event) behavior of distributed components of complex systems.
Supervisory controllers observe discrete-event system behavior, make a decision
on allowed activities, and communicate the control signals to the involved
parties. Models of the supervisory controllers can be automatically synthesized
based on formal models of the system components and a formalization of the safe
coordination (control) requirements. Based on the obtained models, code
generation can be used to implement the supervisory controllers in software, on
a PLC, or an embedded (micro)processor. In this article, we develop a process
theory with data that supports a model-based systems engineering framework for
supervisory coordination. We employ communication to distinguish between the
different flows of information, i.e., observation and supervision, whereas we
employ data to specify the coordination requirements more compactly, and to
increase the expressivity of the framework. To illustrate the framework, we
remodel an industrial case study involving coordination of maintenance
procedures of a printing process of a high-tech Oce printer.Comment: In Proceedings FOCLASA 2012, arXiv:1208.432
Recommended from our members
Thunderstriking constraints with JUPITER
We present JUPITER, a tool for analysing multi-constrained systems. JUPITER was built to explore three basic ideas. First, how to use controller synthesis so as to find the exact conditions under which a particular constraint will be satisfied. Second, how to successively refine the models used for the controller synthesis so as to obtain a series of more easily understandable and more robust controllers. Last but not least, how to structure & explain the synthesised controllers and provide hints to designers for further optimisations through the use of machine learning techniques. Thus, JUPITER can help in the design and analysis of multi-constraint systems through the automatic synthesis of control logic for certain of the constraints and the aid it provides to designers for discovering further optimisations. The controllers it synthesises can be easily implemented on top of a standard real-time OS
Petri Games: Synthesis of Distributed Systems with Causal Memory
We present a new multiplayer game model for the interaction and the flow of
information in a distributed system. The players are tokens on a Petri net. As
long as the players move in independent parts of the net, they do not know of
each other; when they synchronize at a joint transition, each player gets
informed of the causal history of the other player. We show that for Petri
games with a single environment player and an arbitrary bounded number of
system players, deciding the existence of a safety strategy for the system
players is EXPTIME-complete.Comment: In Proceedings GandALF 2014, arXiv:1408.556
- ā¦