292 research outputs found
Model-based Dynamic Shielding for Safe and Efficient Multi-Agent Reinforcement Learning
Multi-Agent Reinforcement Learning (MARL) discovers policies that maximize
reward but do not have safety guarantees during the learning and deployment
phases. Although shielding with Linear Temporal Logic (LTL) is a promising
formal method to ensure safety in single-agent Reinforcement Learning (RL), it
results in conservative behaviors when scaling to multi-agent scenarios.
Additionally, it poses computational challenges for synthesizing shields in
complex multi-agent environments. This work introduces Model-based Dynamic
Shielding (MBDS) to support MARL algorithm design. Our algorithm synthesizes
distributive shields, which are reactive systems running in parallel with each
MARL agent, to monitor and rectify unsafe behaviors. The shields can
dynamically split, merge, and recompute based on agents' states. This design
enables efficient synthesis of shields to monitor agents in complex
environments without coordination overheads. We also propose an algorithm to
synthesize shields without prior knowledge of the dynamics model. The proposed
algorithm obtains an approximate world model by interacting with the
environment during the early stage of exploration, making our MBDS enjoy formal
safety guarantees with high probability. We demonstrate in simulations that our
framework can surpass existing baselines in terms of safety guarantees and
learning performance.Comment: Accepted in AAMAS 202
Elastic Business Process Management: State of the Art and Open Challenges for BPM in the Cloud
With the advent of cloud computing, organizations are nowadays able to react
rapidly to changing demands for computational resources. Not only individual
applications can be hosted on virtual cloud infrastructures, but also complete
business processes. This allows the realization of so-called elastic processes,
i.e., processes which are carried out using elastic cloud resources. Despite
the manifold benefits of elastic processes, there is still a lack of solutions
supporting them.
In this paper, we identify the state of the art of elastic Business Process
Management with a focus on infrastructural challenges. We conceptualize an
architecture for an elastic Business Process Management System and discuss
existing work on scheduling, resource allocation, monitoring, decentralized
coordination, and state management for elastic processes. Furthermore, we
present two representative elastic Business Process Management Systems which
are intended to counter these challenges. Based on our findings, we identify
open issues and outline possible research directions for the realization of
elastic processes and elastic Business Process Management.Comment: Please cite as: S. Schulte, C. Janiesch, S. Venugopal, I. Weber, and
P. Hoenisch (2015). Elastic Business Process Management: State of the Art and
Open Challenges for BPM in the Cloud. Future Generation Computer Systems,
Volume NN, Number N, NN-NN., http://dx.doi.org/10.1016/j.future.2014.09.00
GLAS: Global-to-Local Safe Autonomy Synthesis for Multi-Robot Motion Planning with End-to-End Learning
We present GLAS: Global-to- Local Autonomy Synthesis, a provably-safe, automated distributed policy generation for multi-robot motion planning. Our approach combines the advantage of centralized planning of avoiding local minima with the advantage of decentralized controllers of scalability and distributed computation. In particular, our synthesized policies only require relative state information of nearby neighbors and obstacles, and compute a provably-safe action. Our approach has three major components: i) we generate demonstration trajectories using a global planner and extract local observations from them, ii) we use deep imitation learning to learn a decentralized policy that can run efficiently online, and iii) we introduce a novel differentiable safety module to ensure collision-free operation, thereby allowing for end-to-end policy training. Our numerical experiments demonstrate that our policies have a 20% higher success rate than optimal reciprocal collision avoidance, ORCA, across a wide range of robot and obstacle densities. We demonstrate our method on an aerial swarm, executing the policy on low-end microcontrollers in real-time
Searching for Optimal Runtime Assurance via Reachability and Reinforcement Learning
A runtime assurance system (RTA) for a given plant enables the exercise of an
untrusted or experimental controller while assuring safety with a backup (or
safety) controller. The relevant computational design problem is to create a
logic that assures safety by switching to the safety controller as needed,
while maximizing some performance criteria, such as the utilization of the
untrusted controller. Existing RTA design strategies are well-known to be
overly conservative and, in principle, can lead to safety violations. In this
paper, we formulate the optimal RTA design problem and present a new approach
for solving it. Our approach relies on reward shaping and reinforcement
learning. It can guarantee safety and leverage machine learning technologies
for scalability. We have implemented this algorithm and present experimental
results comparing our approach with state-of-the-art reachability and
simulation-based RTA approaches in a number of scenarios using aircraft models
in 3D space with complex safety requirements. Our approach can guarantee safety
while increasing utilization of the experimental controller over existing
approaches
Formal Methods for Autonomous Systems
Formal methods refer to rigorous, mathematical approaches to system
development and have played a key role in establishing the correctness of
safety-critical systems. The main building blocks of formal methods are models
and specifications, which are analogous to behaviors and requirements in system
design and give us the means to verify and synthesize system behaviors with
formal guarantees.
This monograph provides a survey of the current state of the art on
applications of formal methods in the autonomous systems domain. We consider
correct-by-construction synthesis under various formulations, including closed
systems, reactive, and probabilistic settings. Beyond synthesizing systems in
known environments, we address the concept of uncertainty and bound the
behavior of systems that employ learning using formal methods. Further, we
examine the synthesis of systems with monitoring, a mitigation technique for
ensuring that once a system deviates from expected behavior, it knows a way of
returning to normalcy. We also show how to overcome some limitations of formal
methods themselves with learning. We conclude with future directions for formal
methods in reinforcement learning, uncertainty, privacy, explainability of
formal methods, and regulation and certification
Computer Aided Verification
This open access two-volume set LNCS 11561 and 11562 constitutes the refereed proceedings of the 31st International Conference on Computer Aided Verification, CAV 2019, held in New York City, USA, in July 2019. The 52 full papers presented together with 13 tool papers and 2 case studies, were carefully reviewed and selected from 258 submissions. The papers were organized in the following topical sections: Part I: automata and timed systems; security and hyperproperties; synthesis; model checking; cyber-physical systems and machine learning; probabilistic systems, runtime techniques; dynamical, hybrid, and reactive systems; Part II: logics, decision procedures; and solvers; numerical programs; verification; distributed systems and networks; verification and invariants; and concurrency
Computer Aided Verification
This open access two-volume set LNCS 11561 and 11562 constitutes the refereed proceedings of the 31st International Conference on Computer Aided Verification, CAV 2019, held in New York City, USA, in July 2019. The 52 full papers presented together with 13 tool papers and 2 case studies, were carefully reviewed and selected from 258 submissions. The papers were organized in the following topical sections: Part I: automata and timed systems; security and hyperproperties; synthesis; model checking; cyber-physical systems and machine learning; probabilistic systems, runtime techniques; dynamical, hybrid, and reactive systems; Part II: logics, decision procedures; and solvers; numerical programs; verification; distributed systems and networks; verification and invariants; and concurrency
- …