Probabilistic Guarantees for Safe Deep Reinforcement Learning
Deep reinforcement learning has been successfully applied to many control
tasks, but the application of such agents in safety-critical scenarios has been
limited due to safety concerns. Rigorous testing of these controllers is
challenging, particularly when they operate in probabilistic environments due
to, for example, hardware faults or noisy sensors. We propose MOSAIC, an
algorithm for measuring the safety of deep reinforcement learning agents in
stochastic settings. Our approach is based on the iterative construction of a
formal abstraction of a controller's execution in an environment, and leverages
probabilistic model checking of Markov decision processes to produce
probabilistic guarantees on safe behaviour over a finite time horizon. It
produces bounds on the probability of safe operation of the controller for
different initial configurations and identifies regions where correct behaviour
can be guaranteed. We implement and evaluate our approach on agents trained for
several benchmark control problems.
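As a rough illustration of the kind of finite-horizon guarantee the abstract describes, the sketch below computes lower and upper bounds on the probability of staying safe for a fixed number of steps in a small, hand-written Markov decision process. It is not the MOSAIC algorithm: the state names, the transition table, and the function `safety_bounds` are all hypothetical, and the nondeterministic choices merely stand in for the uncertainty an abstraction would introduce.

```python
from typing import Dict, List, Set, Tuple

# transitions[state] is a list of nondeterministic choices; each choice is a
# probability distribution given as (probability, next_state) pairs.
Transitions = Dict[str, List[List[Tuple[float, str]]]]


def safety_bounds(
    transitions: Transitions, unsafe: Set[str], horizon: int
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Bound the probability of avoiding `unsafe` for `horizon` steps.

    Returns (lower, upper): the minimum and maximum safety probability over
    all ways of resolving the nondeterministic choices, for each start state.
    """
    # With zero steps left, a state is safe exactly when it is not unsafe.
    lower = {s: 0.0 if s in unsafe else 1.0 for s in transitions}
    upper = dict(lower)
    for _ in range(horizon):
        new_lower, new_upper = {}, {}
        for state, choices in transitions.items():
            if state in unsafe:
                new_lower[state] = new_upper[state] = 0.0
                continue
            # Worst case and best case over the nondeterministic choices.
            new_lower[state] = min(
                sum(p * lower[t] for p, t in choice) for choice in choices
            )
            new_upper[state] = max(
                sum(p * upper[t] for p, t in choice) for choice in choices
            )
        lower, upper = new_lower, new_upper
    return lower, upper


# Hypothetical three-state model: "ok" has two possible behaviours, "risky"
# fails half the time, and "fail" is absorbing and unsafe.
EXAMPLE: Transitions = {
    "ok": [[(0.9, "ok"), (0.1, "risky")], [(1.0, "ok")]],
    "risky": [[(0.5, "ok"), (0.5, "fail")]],
    "fail": [[(1.0, "fail")]],
}

if __name__ == "__main__":
    lo, hi = safety_bounds(EXAMPLE, unsafe={"fail"}, horizon=2)
    for state in EXAMPLE:
        print(f"{state}: safe with probability in [{lo[state]:.3f}, {hi[state]:.3f}]")
```

A start state whose lower bound equals 1 is one where safe behaviour over the horizon holds under every resolution of the choices, which loosely mirrors the "regions where correct behaviour can be guaranteed" mentioned above.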
Probabilistic Program Abstractions
Abstraction is a fundamental tool for reasoning about complex systems.
Program abstraction has been utilized to great effect for analyzing
deterministic programs. At the heart of program abstraction is the relationship
between a concrete program, which is difficult to analyze, and an abstract
program, which is more tractable. Program abstractions, however, are typically
not probabilistic. We generalize non-deterministic program abstractions to
probabilistic program abstractions by explicitly quantifying the
non-deterministic choices. Our framework upgrades key definitions and
properties of abstractions to the probabilistic context. We also discuss
preliminary ideas for performing inference on probabilistic abstractions and
general probabilistic programs.
Tools and Algorithms for the Construction and Analysis of Systems
This open access two-volume set constitutes the proceedings of the 27th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2021, which was held during March 27 – April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The 41 full papers presented in the proceedings were carefully reviewed and selected from 141 submissions. The volume also contains 7 tool papers, 6 tool demo papers, and 9 SV-COMP competition papers. The papers are organized in topical sections as follows: Part I: Game Theory; SMT Verification; Probabilities; Timed Systems; Neural Networks; Analysis of Network Communication. Part II: Verification Techniques (not SMT); Case Studies; Proof Generation/Validation; Tool Papers; Tool Demo Papers; SV-COMP Tool Competition Papers.
Reasoning with Latent Diffusion in Offline Reinforcement Learning
Offline reinforcement learning (RL) holds promise as a means to learn
high-reward policies from a static dataset, without the need for further
environment interactions. However, a key challenge in offline RL lies in
effectively stitching portions of suboptimal trajectories from the static
dataset while avoiding extrapolation errors arising due to a lack of support in
the dataset. Existing approaches use conservative methods that are tricky to
tune and struggle with multi-modal data (as we show) or rely on noisy Monte
Carlo return-to-go samples for reward conditioning. In this work, we propose a
novel approach that leverages the expressiveness of latent diffusion to model
in-support trajectory sequences as compressed latent skills. This facilitates
learning a Q-function while avoiding extrapolation error via
batch-constraining. The latent space is also expressive and gracefully copes
with multi-modal data. We show that the learned temporally-abstract latent
space encodes richer task-specific information for offline RL tasks as compared
to raw state-actions. This improves credit assignment and facilitates faster
reward propagation during Q-learning. Our method demonstrates state-of-the-art
performance on the D4RL benchmarks, particularly excelling in long-horizon,
sparse-reward tasks.
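The batch-constraining idea mentioned in this abstract can be illustrated in a much simpler setting than the paper's. The sketch below is a tabular stand-in, not the latent-diffusion method: it runs Q-learning over a fixed dataset and, when bootstrapping, takes the maximum only over actions that actually appear in the dataset for the next state, so values of unseen state-action pairs are never queried. The function name `batch_constrained_q`, the tuple layout, and the toy dataset are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

# One logged transition: (state, action, reward, next_state, done).
Transition = Tuple[Hashable, Hashable, float, Hashable, bool]


def batch_constrained_q(
    dataset: List[Transition],
    gamma: float = 0.9,
    lr: float = 0.5,
    sweeps: int = 200,
) -> Dict[Tuple[Hashable, Hashable], float]:
    """Tabular Q-learning restricted to state-action pairs in the dataset."""
    # Record which actions the dataset supports in each state.
    seen = defaultdict(set)
    for state, action, _, _, _ in dataset:
        seen[state].add(action)

    q = defaultdict(float)
    for _ in range(sweeps):
        for state, action, reward, next_state, done in dataset:
            if done or not seen[next_state]:
                # Terminal step, or no supported action to bootstrap from.
                target = reward
            else:
                # Maximise only over in-support actions (the batch constraint).
                target = reward + gamma * max(
                    q[(next_state, b)] for b in seen[next_state]
                )
            q[(state, action)] += lr * (target - q[(state, action)])
    return dict(q)


# Hypothetical two-step chain: moving right twice reaches a rewarding goal.
TOY_DATA: List[Transition] = [
    (0, "right", 0.0, 1, False),
    (1, "right", 1.0, 2, True),
]

if __name__ == "__main__":
    for key, value in sorted(batch_constrained_q(TOY_DATA).items()):
        print(key, round(value, 3))
```

Because the action "left" never occurs in the toy data, no value is ever estimated for it, which is the sense in which extrapolation to unsupported actions is avoided here.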
Computer Aided Verification
This open access two-volume set LNCS 10980 and 10981 constitutes the refereed proceedings of the 30th International Conference on Computer Aided Verification, CAV 2018, held in Oxford, UK, in July 2018. The 52 full and 13 tool papers presented together with 3 invited papers and 2 tutorials were carefully reviewed and selected from 215 submissions. The papers cover a wide range of topics and techniques, from algorithmic and logical foundations of verification to practical applications in distributed, networked, cyber-physical, and autonomous systems. They are organized in topical sections on model checking, program analysis using polyhedra, synthesis, learning, runtime verification, hybrid and timed systems, tools, probabilistic systems, static analysis, theory and security, SAT, SMT and decision procedures, concurrency, and CPS, hardware, industrial applications.