Verifiable Reinforcement Learning via Policy Extraction
While deep reinforcement learning has successfully solved many challenging
control tasks, its real-world applicability has been limited by the inability
to ensure the safety of learned policies. We propose an approach to verifiable
reinforcement learning by training decision tree policies, which can represent
complex policies (since they are nonparametric), yet can be efficiently
verified using existing techniques (since they are highly structured). The
challenge is that decision tree policies are difficult to train. We propose
VIPER, an algorithm that combines ideas from model compression and imitation
learning to learn decision tree policies guided by a DNN policy (called the
oracle) and its Q-function, and show that it substantially outperforms two
baselines. We use VIPER to (i) learn a provably robust decision tree policy for
a variant of Atari Pong with a symbolic state space, (ii) learn a decision tree
policy for a toy game based on Pong that provably never loses, and (iii) learn
a provably stable decision tree policy for cart-pole. In each case, the
decision tree policy achieves performance equal to that of the original DNN
policy.
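The training loop described above can be sketched roughly as follows. This is a simplified, illustrative rendering of the VIPER idea (DAgger-style imitation learning with visited states reweighted by the spread of the oracle's Q-values), not the paper's actual implementation; names such as `env_rollout`, `oracle_action`, and `oracle_q` are hypothetical stand-ins.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def viper_train(env_rollout, oracle_action, oracle_q, n_iters=10):
    """Imitation learning with Q-weighted resampling (DAgger-style sketch).

    env_rollout(act)  -> list of visited states when acting with `act`
    oracle_action(s)  -> DNN oracle's action for state s
    oracle_q(s)       -> oracle's Q-values over all actions for state s
    """
    dataset = []  # aggregated (state, oracle action, weight) triples
    policy = None
    for _ in range(n_iters):
        # Roll out the current tree policy (or the oracle on iteration 0).
        act = oracle_action if policy is None else (
            lambda s: policy.predict(s.reshape(1, -1))[0])
        for s in env_rollout(act):
            q = oracle_q(s)
            # States where the choice of action matters most get more weight.
            dataset.append((s, oracle_action(s), q.max() - q.min()))
        X = np.array([s for s, _, _ in dataset])
        y = np.array([a for _, a, _ in dataset])
        w = np.array([w for _, _, w in dataset])
        policy = DecisionTreeClassifier(max_depth=8).fit(X, y, sample_weight=w)
    return policy
```

The returned tree is a single, highly structured classifier, which is what makes the downstream verification step tractable.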
Improving WCET Evaluation using Linear Relation Analysis
The precision of a worst-case execution time (WCET) evaluation tool on a given program depends heavily on how well the tool detects and discards semantically infeasible executions of the program. In this paper, we propose to use the classical abstract-interpretation-based method of linear relation analysis to discover and exploit relations between execution paths. For this purpose, we add auxiliary variables (counters) to the program to trace its execution paths. The results are easily incorporated into the classical workflow of a WCET evaluator when the evaluator is based on the popular implicit path enumeration technique. We use existing tools (a WCET evaluator and a linear relation analyzer) to build and experiment with a prototype implementation of this idea. This work is supported by the French research foundation (ANR) as part of the W-SEPT project (ANR-12-INSE-0001).
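The counter instrumentation can be illustrated with a toy example (written in Python for readability, though WCET analysis targets compiled code). The branch structure and the linear relations shown in the comments are invented for illustration; they stand in for the relations a linear relation analyzer would discover automatically.

```python
# One auxiliary counter per conditional branch traces which paths the
# program actually takes; a linear relation among the counters then lets
# an IPET-based WCET evaluator rule out infeasible path combinations
# (e.g. "the expensive branch on every iteration").

def instrumented(n):
    c_then = c_else = 0        # auxiliary path counters
    for i in range(n):
        if i % 4 == 0:         # "expensive" branch, taken every 4th iteration
            c_then += 1
        else:
            c_else += 1
    # Relations the analysis could discover and export as constraints:
    #   c_then + c_else == n          (every iteration takes one branch)
    #   4 * c_then <= n + 3           (the costly branch runs at most every 4th time)
    return c_then, c_else
```

Feeding the second relation to the evaluator tightens the bound: without it, the implicit path enumeration must assume the costly branch can fire on all n iterations.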
Computer Aided Verification
This open access two-volume set LNCS 10980 and 10981 constitutes the refereed proceedings of the 30th International Conference on Computer Aided Verification, CAV 2018, held in Oxford, UK, in July 2018. The 52 full and 13 tool papers presented together with 3 invited papers and 2 tutorials were carefully reviewed and selected from 215 submissions. The papers cover a wide range of topics and techniques, from algorithmic and logical foundations of verification to practical applications in distributed, networked, cyber-physical, and autonomous systems. They are organized in topical sections on model checking, program analysis using polyhedra, synthesis, learning, runtime verification, hybrid and timed systems, tools, probabilistic systems, static analysis, theory and security, SAT, SMT and decision procedures, concurrency, and CPS, hardware, and industrial applications.
Formal verification of deep reinforcement learning agents
Deep reinforcement learning has been successfully applied to many control tasks, but the application of such controllers in safety-critical scenarios has been limited due to safety concerns. Rigorous testing of these controllers is challenging, particularly when they operate in uncertain environments. In this thesis we develop novel verification techniques to give the user stronger guarantees over the performance of the trained agents than they would be able to obtain by testing, under different degrees and sources of uncertainty.
In particular, we tackle three different sources of uncertainty to the agent and offer different algorithms to provide strong guarantees to the user. The first is input noise: sensors in the real world always provide imperfect data. The second source of uncertainty comes from the actuators: once an agent decides to take a specific action, faulty actuators or hardware problems could still prevent the agent from acting upon the decisions given by the controller. The last source of uncertainty is the policy: the set of decisions the controller takes when operating in the environment. Agents may act probabilistically for a number of reasons, such as dealing with adversaries in a competitive environment or addressing partial observability of the environment.
In this thesis, we develop formal models of controllers executing under uncertainty, and propose new verification techniques based on abstract interpretation for their analysis. We cover different horizon lengths, i.e., the number of steps into the future that we analyse, and present methods for both finite-horizon and infinite-horizon verification. We perform both probabilistic and non-probabilistic analysis of the models constructed, depending on the methodology adopted. We implement and evaluate our methods on controllers trained for several benchmark control problems.
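One common abstract-interpretation scheme for analyses of this kind is interval (box) propagation. The following toy sketch bounds the outputs a small ReLU controller can produce over a box of noisy inputs; the weights, layer shapes, and the choice of intervals as the abstract domain are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

def interval_affine(lo, hi, W, b):
    """Tightest interval bounds for x @ W + b given lo <= x <= hi."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    out_lo = lo @ W_pos + hi @ W_neg + b   # lower bound uses lo on positive weights
    out_hi = hi @ W_pos + lo @ W_neg + b   # upper bound uses hi on positive weights
    return out_lo, out_hi

def interval_relu(lo, hi):
    """ReLU is monotone, so it maps interval endpoints directly."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

def verify_box(lo, hi, layers):
    """Propagate an input box through affine + ReLU layers.

    `layers` is a list of (W, b) pairs; for simplicity every layer here
    is followed by ReLU. Returns sound output bounds: every concrete
    output for an input in the box lies inside [lo, hi].
    """
    for W, b in layers:
        lo, hi = interval_affine(lo, hi, W, b)
        lo, hi = interval_relu(lo, hi)
    return lo, hi
```

If the resulting output box certifies a property (e.g. the action never exceeds a safety threshold) for every state in the box, the property holds for all inputs the box models, which is the kind of guarantee testing alone cannot provide.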