Formal Methods for Autonomous Systems
Formal methods refer to rigorous, mathematical approaches to system
development and have played a key role in establishing the correctness of
safety-critical systems. The main building blocks of formal methods are models
and specifications, which are analogous to behaviors and requirements in system
design and give us the means to verify and synthesize system behaviors with
formal guarantees.
This monograph provides a survey of the current state of the art on
applications of formal methods in the autonomous systems domain. We consider
correct-by-construction synthesis under various formulations, including closed
systems, reactive, and probabilistic settings. Beyond synthesizing systems in
known environments, we address the concept of uncertainty and bound the
behavior of systems that employ learning using formal methods. Further, we
examine the synthesis of systems with monitoring, a mitigation technique for
ensuring that once a system deviates from expected behavior, it knows a way of
returning to normalcy. We also show how to overcome some limitations of formal
methods themselves with learning. We conclude with future directions for formal
methods in reinforcement learning, uncertainty, privacy, explainability of
formal methods, and regulation and certification.
Probabilistic Guarantees for Safe Deep Reinforcement Learning
Deep reinforcement learning has been successfully applied to many control
tasks, but the application of such agents in safety-critical scenarios has been
limited due to safety concerns. Rigorous testing of these controllers is
challenging, particularly when they operate in probabilistic environments due
to, for example, hardware faults or noisy sensors. We propose MOSAIC, an
algorithm for measuring the safety of deep reinforcement learning agents in
stochastic settings. Our approach is based on the iterative construction of a
formal abstraction of a controller's execution in an environment, and leverages
probabilistic model checking of Markov decision processes to produce
probabilistic guarantees on safe behaviour over a finite time horizon. It
produces bounds on the probability of safe operation of the controller for
different initial configurations and identifies regions where correct behaviour
can be guaranteed. We implement and evaluate our approach on agents trained for
several benchmark control problems.
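To make the idea of bounding the probability of safe operation over a finite horizon concrete, the following is a minimal sketch on a hand-built Markov decision process. It is not the MOSAIC algorithm and does not construct any abstraction of a neural controller; the state names, the transition table and the function `safety_bounds` are hypothetical and chosen only for illustration. The lower bound takes the worst action in every state and the upper bound the best one, which is how nondeterminism in an abstract model can translate into an interval on the safety probability.

```python
def safety_bounds(transitions, unsafe, horizon):
    """Bound the probability of avoiding `unsafe` for `horizon` steps.

    transitions[s][a] is a list of (next_state, probability) pairs.
    Returns two dicts: worst-case and best-case probability per start state.
    """
    states = list(transitions)
    # With zero steps remaining, a state is safe exactly when it is not unsafe.
    lo = {s: 0.0 if s in unsafe else 1.0 for s in states}
    hi = dict(lo)
    for _ in range(horizon):
        new_lo, new_hi = {}, {}
        for s in states:
            if s in unsafe:
                new_lo[s] = new_hi[s] = 0.0
                continue
            # Expected safety value of each available action.
            lo_vals = [sum(p * lo[t] for t, p in succ)
                       for succ in transitions[s].values()]
            hi_vals = [sum(p * hi[t] for t, p in succ)
                       for succ in transitions[s].values()]
            new_lo[s] = min(lo_vals)  # adversarial resolution of choices
            new_hi[s] = max(hi_vals)  # optimistic resolution of choices
        lo, hi = new_lo, new_hi
    return lo, hi


# Hypothetical three-state model: "crash" is the only unsafe state.
example = {
    "ok": {"go": [("ok", 0.9), ("risky", 0.1)]},
    "risky": {"recover": [("ok", 0.5), ("crash", 0.5)],
              "hold": [("risky", 1.0)]},
    "crash": {"stay": [("crash", 1.0)]},
}

lower, upper = safety_bounds(example, unsafe={"crash"}, horizon=2)
for state in example:
    print(state, lower[state], upper[state])
```

States whose lower bound equals 1 are those from which safe behaviour holds for the whole horizon regardless of how the choices are resolved.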
Stochastic Finite State Control of POMDPs with LTL Specifications
Partially observable Markov decision processes (POMDPs) provide a modeling framework for autonomous decision making under uncertainty and imperfect sensing, e.g., robot manipulation and self-driving cars. However, optimal control of POMDPs is notoriously intractable. This paper considers the quantitative problem of synthesizing sub-optimal stochastic finite state controllers (sFSCs) for POMDPs such that the probability of satisfying a set of high-level specifications in terms of linear temporal logic (LTL) formulae is maximized. We begin by casting the latter problem into an optimization and use relaxations based on the Poisson equation and McCormick envelopes. Then, we propose a stochastic bounded policy iteration algorithm, leading to a controlled growth in sFSC size and an anytime algorithm: the performance of the controller improves with successive iterations, and the user can stop the algorithm based on time or memory considerations. We illustrate the proposed method by a robot navigation case study.
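As a brief illustration of the McCormick envelopes mentioned in this abstract, the sketch below evaluates the standard linear under- and over-estimators of a bilinear term w = x * y for variables with known box bounds. It shows the relaxation in isolation only; how the paper applies it to the sFSC optimization is not reproduced here, and the function name `mccormick_bounds` and the sample values are assumptions made for the example.

```python
def mccormick_bounds(x, y, x_lo, x_hi, y_lo, y_hi):
    """Return (lower, upper) McCormick envelope values for w = x * y.

    Assumes x_lo <= x <= x_hi and y_lo <= y <= y_hi.
    """
    # Two linear under-estimators of x * y; the tighter one is the maximum.
    lower = max(x_lo * y + x * y_lo - x_lo * y_lo,
                x_hi * y + x * y_hi - x_hi * y_hi)
    # Two linear over-estimators of x * y; the tighter one is the minimum.
    upper = min(x_hi * y + x * y_lo - x_hi * y_lo,
                x_lo * y + x * y_hi - x_lo * y_hi)
    return lower, upper


# Both variables range over [0, 1], as probabilities do.
lower, upper = mccormick_bounds(0.5, 0.5, 0.0, 1.0, 0.0, 1.0)
print(lower, 0.5 * 0.5, upper)
```

Replacing the nonconvex equality w = x * y with these four linear inequalities yields a convex relaxation that is exact at the corners of the box and loosest in its interior.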
Decision-Making Under Uncertainty: Beyond Probabilities
This position paper reflects on the state-of-the-art in decision-making under
uncertainty. A classical assumption is that probabilities can sufficiently
capture all uncertainty in a system. In this paper, the focus is on the
uncertainty that goes beyond this classical interpretation, particularly by
employing a clear distinction between aleatoric and epistemic uncertainty. The
paper features an overview of Markov decision processes (MDPs) and extensions
to account for partial observability and adversarial behavior. These models
sufficiently capture aleatoric uncertainty but fail to account for epistemic
uncertainty robustly. Consequently, we present a thorough overview of so-called
uncertainty models that exhibit uncertainty in a more robust interpretation. We
show several solution techniques for both discrete and continuous models,
ranging from formal verification, through control-based abstractions, to
reinforcement learning. As an integral part of this paper, we list and discuss
several key challenges that arise when dealing with rich types of uncertainty
in a model-based fashion.
Strengthening Deterministic Policies for POMDPs
The synthesis problem for partially observable Markov decision processes
(POMDPs) is to compute a policy that satisfies a given specification. Such
policies have to take the full execution history of a POMDP into account,
rendering the problem undecidable in general. A common approach is to use a
limited amount of memory and randomize over potential choices. Yet, this
problem is still NP-hard and often computationally intractable in practice. A
restricted problem is to use neither history nor randomization, yielding
policies that are called stationary and deterministic. Previous approaches to
compute such policies employ mixed-integer linear programming (MILP). We
provide a novel MILP encoding that supports sophisticated specifications in the
form of temporal logic constraints. It is able to handle an arbitrary number of
such specifications. Yet, randomization and memory are often mandatory to
achieve satisfactory policies. First, we extend our encoding to deliver a
restricted class of randomized policies. Second, based on the results of the
original MILP, we employ a preprocessing of the POMDP to encompass memory-based
decisions. The advantages of our approach over state-of-the-art POMDP solvers
lie (1) in the flexibility to strengthen simple deterministic policies without
losing computational tractability and (2) in the ability to enforce the
provable satisfaction of arbitrarily many specifications. The latter point
allows taking trade-offs between performance and safety aspects of typical
POMDP examples into account. We show the effectiveness of our method on a broad
range of benchmarks.
A Review of Symbolic, Subsymbolic and Hybrid Methods for Sequential Decision Making
The field of Sequential Decision Making (SDM) provides tools for solving
Sequential Decision Processes (SDPs), where an agent must make a series of
decisions in order to complete a task or achieve a goal. Historically, two
competing SDM paradigms have vied for supremacy. Automated Planning (AP)
proposes to solve SDPs by performing a reasoning process over a model of the
world, often represented symbolically. Conversely, Reinforcement Learning (RL)
proposes to learn the solution of the SDP from data, without a world model, and
represent the learned knowledge subsymbolically. In the spirit of
reconciliation, we provide a review of symbolic, subsymbolic and hybrid methods
for SDM. We cover both methods for solving SDPs (e.g., AP, RL and techniques
that learn to plan) and for learning aspects of their structure (e.g., world
models, state invariants and landmarks). To the best of our knowledge, no other
review in the field provides the same scope. As an additional contribution, we
discuss what properties an ideal method for SDM should exhibit and argue that
neurosymbolic AI is the current approach which most closely resembles this
ideal method. Finally, we outline several proposals to advance the field of SDM
via the integration of symbolic and subsymbolic AI.