Scalable Verification of Quantized Neural Networks (Technical Report)
Formal verification of neural networks is an active topic of research, and
recent advances have significantly increased the size of the networks that
verification tools can handle. However, most methods are designed for
verification of an idealized model of the actual network, one that works over
real arithmetic and ignores rounding imprecision. This idealization is in stark
contrast to network quantization, which is a technique that trades numerical
precision for computational efficiency and is, therefore, often applied in
practice. Neglecting rounding errors of such low-bit quantized neural networks
has been shown to lead to wrong conclusions about the network's correctness.
Thus, the desired approach for verifying quantized neural networks would be one
that takes these rounding errors into account. In this paper, we show that
verifying the bit-exact implementation of quantized neural networks with
bit-vector specifications is PSPACE-hard, even though verifying idealized
real-valued networks and satisfiability of bit-vector specifications alone are
each in NP. Furthermore, we explore several practical heuristics toward closing
the complexity gap between idealized and bit-exact verification. In particular,
we propose three techniques for making SMT-based verification of quantized
neural networks more scalable. Our experiments demonstrate that our proposed
methods achieve speedups of up to three orders of magnitude over existing
approaches.
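To make the contrast between idealized and bit-exact verification concrete, here is a minimal sketch that exhaustively checks an output bound for a single quantized neuron and compares it with the idealized real-valued bound. The weights, fixed-point scales, and property are all assumptions chosen for illustration; the paper's actual SMT-based bit-vector encoding is not reproduced here.

```python
# Hedged sketch: bit-exact check of one int8-quantized neuron vs. its
# idealized real-valued model. All constants below are illustrative
# assumptions, not taken from the paper.
SCALE_IN, SCALE_W, SCALE_OUT = 0.1, 0.05, 0.1   # fixed-point scales (assumed)
W_Q = [7, -3]                                   # int8 weights (assumed)

def quantized_neuron(xq):
    """Bit-exact int8 neuron: integer accumulate, requantize, ReLU, saturate."""
    acc = sum(w * x for w, x in zip(W_Q, xq))           # integer accumulator
    yq = round(acc * (SCALE_IN * SCALE_W / SCALE_OUT))  # round to output grid
    return max(0, min(127, yq))                         # ReLU + int8 saturation

# bit-exact verification of an output bound by exhaustive enumeration
worst = max(quantized_neuron([a, b]) for a in range(128) for b in range(128))

# idealized real-valued bound (in output-grid units) for comparison
ideal = max(W_Q[0] * SCALE_W * (a * SCALE_IN) / SCALE_OUT for a in range(128))
print(worst, ideal)
```

The two bounds differ because of rounding in the requantization step, which is exactly the gap the abstract warns about; for realistic network sizes exhaustive enumeration is infeasible, which is why the paper encodes the bit-exact semantics as bit-vector constraints for an SMT solver instead.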
Learning Control Policies for Stochastic Systems with Reach-avoid Guarantees
We study the problem of learning controllers for discrete-time non-linear
stochastic dynamical systems with formal reach-avoid guarantees. This work
presents the first method for providing formal reach-avoid guarantees, which
combine and generalize stability and safety guarantees, with a tolerable
probability threshold over the infinite time horizon. Our method
leverages advances in the machine learning literature and represents formal
certificates as neural networks. In particular, we learn a certificate in the
form of a reach-avoid supermartingale (RASM), a novel notion that we introduce
in this work. Our RASMs provide reachability and avoidance guarantees by
imposing constraints on what can be viewed as a stochastic extension of level
sets of Lyapunov functions for deterministic systems. Our approach solves
several important problems: it can be used to learn a control policy from
scratch, to verify a reach-avoid specification for a fixed control policy, or
to fine-tune a pre-trained policy if it does not satisfy the reach-avoid
specification. We validate our approach on stochastic non-linear
reinforcement learning tasks.
Comment: Accepted at AAAI 2023
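As a toy numeric illustration of the RASM conditions described above, the following sketch checks a hand-picked candidate certificate on a 1-D linear stochastic system by grid sampling. The system, the certificate V, the sets, and the threshold p are all assumptions for illustration; the paper's neural learner-verifier loop with formal Lipschitz-based checks is not reproduced.

```python
# Hedged toy check of reach-avoid supermartingale (RASM) conditions on the
# assumed system x' = 0.9*x + w with bounded noise w. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
V = lambda x: x * x                  # candidate certificate (assumed)
EPS = 0.01                           # required expected decrease per step

def expected_next_V(x, n=4000):
    w = rng.uniform(-0.05, 0.05, size=n)      # bounded stochastic disturbance
    return np.mean(V(0.9 * x + w))            # Monte Carlo estimate of E[V(x')]

ok = True
for x in np.linspace(-1.0, 1.0, 201):
    if abs(x) < 0.3:                 # inside the target set: no decrease needed
        continue
    ok = ok and expected_next_V(x) <= V(x) - EPS

# avoidance side: on the assumed unsafe set |x| >= 10, V >= 1/(1 - p) with
# p = 0.99, so a supermartingale (Ville-style) bound caps the probability of
# ever reaching the unsafe set at 1 - p
ok = ok and V(10.0) >= 1.0 / (1.0 - 0.99)
print(ok)
```

The two checks mirror the abstract's description: an expected-decrease condition driving the process toward the target (reachability) and a level-set barrier on the unsafe region (avoidance), here validated only on a grid of sample points rather than formally.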
Reachability Poorman Discrete-Bidding Games
We consider {\em bidding games}, a class of two-player zero-sum {\em graph
games}. The game proceeds as follows. Both players have bounded budgets. A
token is placed on a vertex of a graph; in each turn, the players simultaneously
submit bids, and the higher bidder moves the token, with bidding ties broken
in favor of Player 1. Player 1 wins the game iff the token visits a designated
target vertex. We consider, for the first time, {\em poorman discrete-bidding}
in which the granularity of the bids is restricted and the higher bid is paid
to the bank. Previous work either did not impose granularity restrictions or
considered {\em Richman} bidding (bids are paid to the opponent). While Richman
bidding is technically more accessible, poorman bidding is more appealing
from a practical standpoint. Our study focuses on {\em threshold budgets},
which is the necessary and sufficient initial budget required for Player 1 to
ensure winning against a given Player 2 budget. We first show existence of
thresholds. In DAGs, we show that threshold budgets can be approximated with
error bounds by thresholds under continuous-bidding and that they exhibit a
periodic behavior. We identify closed-form solutions in special cases. We
implement and experiment with an algorithm to find threshold budgets.
Comment: The full version of a paper published at ECAI 2023
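The threshold-budget notion above can be illustrated by brute force on a tiny example. The sketch below solves reachability poorman discrete-bidding on an assumed four-vertex DAG with integer bids, ties favoring Player 1, and the winning bid paid to the bank; it is a naive recursion for illustration, not the paper's algorithm.

```python
# Hedged brute-force solver for reachability poorman discrete-bidding on an
# assumed tiny DAG. Integer-granularity bids; only the auction winner pays
# (to the bank); ties are broken in favor of Player 1.
from functools import lru_cache

succ = {"s": ["m"], "m": ["t", "dead"], "t": [], "dead": []}
TARGET = "t"

@lru_cache(maxsize=None)
def p1_wins(v, b1, b2):
    if v == TARGET:
        return True
    if not succ[v]:
        return False
    for a in range(b1 + 1):                    # Player 1's bid
        good = True
        for c in range(b2 + 1):                # Player 2's bid
            if c <= a:                         # tie or lower: Player 1 wins
                # Player 1 pays a to the bank and picks the best successor
                if not any(p1_wins(u, b1 - a, b2) for u in succ[v]):
                    good = False
                    break
            else:
                # Player 2 pays c to the bank and picks the worst successor
                if not all(p1_wins(u, b1, b2 - c) for u in succ[v]):
                    good = False
                    break
        if good:
            return True
    return False

# threshold budget for Player 1 against a Player 2 budget of 2
threshold = next(b1 for b1 in range(20) if p1_wins("s", b1, 2))
print(threshold)
```

Note the poorman effect visible even here: when Player 2 wins an auction, their payment goes to the bank and is simply lost, which is why Player 1 can sometimes profit from deliberately losing auctions at vertices with a single successor.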
Learning Provably Stabilizing Neural Controllers for Discrete-Time Stochastic Systems
We consider the problem of learning control policies in discrete-time
stochastic systems which guarantee that the system stabilizes within some
specified stabilization region with probability 1. Our approach is based on
the novel notion of stabilizing ranking supermartingales (sRSMs) that we
introduce in this work. Our sRSMs overcome the limitation of methods proposed
in previous works whose applicability is restricted to systems in which the
stabilizing region cannot be left once entered under any control policy. We
present a learning procedure that learns a control policy together with an sRSM
that formally certifies probability 1 stability, both learned as neural
networks. We show that this procedure can also be adapted to formally verifying
that, under a given Lipschitz continuous control policy, the stochastic system
stabilizes within some stabilizing region with probability 1. Our
experimental evaluation shows that our learning procedure can successfully
learn provably stabilizing policies in practice.
Comment: Accepted at ATVA 2023. Follow-up work of arXiv:2112.0949
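The verification mode mentioned above (checking a given Lipschitz continuous policy) can be sketched numerically: fix a linear policy and a candidate certificate, and test the ranking (expected-decrease) condition outside the stabilization region on a grid. The dynamics, policy, and certificate are assumptions for illustration; the paper's neural sRSM and its formal check are not reproduced.

```python
# Hedged sketch: grid-and-sample check of the expected-decrease (ranking)
# condition for a candidate certificate V under the assumed fixed policy
# u = -0.5*x on the assumed system x' = x + u + w. Illustrative only.
import numpy as np

rng = np.random.default_rng(1)
V = lambda x: abs(x)                        # candidate certificate (assumed)
policy = lambda x: -0.5 * x                 # fixed Lipschitz control policy

def step(x, w):
    return x + policy(x) + w                # closed-loop dynamics

EPS = 0.05                                  # required expected decrease
violations = 0
for x in np.linspace(-2.0, 2.0, 161):
    if abs(x) < 0.4:                        # inside the stabilization region
        continue
    w = rng.uniform(-0.1, 0.1, size=5000)   # bounded stochastic disturbance
    if np.mean(V(step(x, w))) > V(x) - EPS:
        violations += 1
print("violations:", violations)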
Bidding mechanisms in graph games
A graph game proceeds as follows: two players move a token through a graph to produce a finite or infinite path, which determines the payoff of the game. We study bidding games in which, in each turn, an auction determines which player moves the token. Bidding games were largely studied in combination with two variants of first-price auctions called “Richman” and “poorman” bidding. We study taxman bidding, which spans the spectrum between the two. The game is parameterized by a constant τ ∈ [0, 1]: portion τ of the winning bid is paid to the other player, and portion 1 − τ to the bank. While finite-duration (reachability) taxman games have been studied before, we present, for the first time, results on infinite-duration taxman games: we unify, generalize, and simplify previous equivalences between bidding games and a class of stochastic games called random-turn games.
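Random-turn games, the stochastic games mentioned in the equivalence above, are easy to illustrate: each turn, a fair coin decides which player moves the token. The sketch below computes the value of a uniform random-turn reachability game on an assumed small DAG by backward induction; the exact correspondence to taxman thresholds is in the paper and not reproduced here.

```python
# Hedged sketch: value of a uniform random-turn reachability game on an
# assumed DAG. Each turn, the mover is chosen by a fair coin; Player 1
# maximizes, Player 2 minimizes the probability of reaching target "t".
succ = {"s": ["a", "b"], "a": ["t", "d"], "b": ["t"], "t": [], "d": []}
value = {"t": 1.0, "d": 0.0}                 # target wins, dead end loses

def val(v):
    if v in value:
        return value[v]
    best = max(val(u) for u in succ[v])      # Player 1 moves (prob 1/2)
    worst = min(val(u) for u in succ[v])     # Player 2 moves (prob 1/2)
    value[v] = 0.5 * best + 0.5 * worst
    return value[v]

print(val("s"))
```

For Richman bidding, threshold budget ratios are known to coincide with such random-turn values; the taxman result generalizes this correspondence across the whole τ spectrum.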
MDPs as Distribution Transformers: Affine Invariant Synthesis for Safety Objectives
Markov decision processes can be viewed as transformers of probability distributions. While this view is useful from a practical standpoint to reason about trajectories of distributions, basic reachability and safety problems are known to be computationally intractable (i.e., Skolem-hard) to solve in such models. Further, we show that even for simple examples of MDPs, strategies for safety objectives over distributions can require infinite memory and randomization. In light of this, we present a novel overapproximation approach to synthesize strategies in an MDP, such that a safety objective over the distributions is met. More precisely, we develop a new framework for template-based synthesis of certificates as affine distributional and inductive invariants for safety objectives in MDPs. We provide two algorithms within this framework. One can only synthesize memoryless strategies, but has relative completeness guarantees, while the other can synthesize general strategies. The runtime complexity of both algorithms is in PSPACE. We implement these algorithms and show that they can solve several non-trivial examples.
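To make the affine distributional invariant idea concrete, here is a sketch on a Markov chain (an MDP under a fixed memoryless strategy), where the distribution evolves as mu' = mu @ P. The chain, the unsafe state, and the candidate invariant a·mu <= b are all assumptions for illustration; checking inductiveness and the implied safety bound each reduce to a small linear program over the probability simplex, solved here with SciPy rather than the paper's template-based synthesis.

```python
# Hedged sketch: checking a candidate affine distributional invariant
# a·mu <= b for safety of the distribution trajectory of an assumed
# 3-state Markov chain. State 2 plays the role of the unsafe state.
import numpy as np
from scipy.optimize import linprog

P = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 1.0, 0.0]])              # assumed transition matrix
mu0 = np.array([1.0, 0.0, 0.0])              # initial distribution

a, b = np.array([0.0, 1.0, 3.0]), 1.2        # candidate invariant: a·mu <= b

def max_over_invariant(c):
    # maximize c·mu over {mu in the simplex : a·mu <= b} via an LP
    res = linprog(-c, A_ub=[a], b_ub=[b],
                  A_eq=[np.ones(3)], b_eq=[1.0], bounds=[(0, None)] * 3)
    return -res.fun

initial_ok = a @ mu0 <= b                          # invariant holds initially
# inductiveness: mu' = mu @ P, so max of (P @ a)·mu over the invariant <= b
inductive = max_over_invariant(P @ a) <= b + 1e-9
# safety bound implied by the invariant: worst-case mass on the unsafe state
unsafe_mass = max_over_invariant(np.array([0.0, 0.0, 1.0]))
print(initial_ok, inductive, unsafe_mass)
```

If all three checks pass, the invariant certifies that the unsafe-state mass never exceeds the computed bound at any time step; the paper's contribution is synthesizing such certificates (together with strategies) automatically from affine templates.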