RoMA: a Method for Neural Network Robustness Measurement and Assessment
Neural network models have become the leading solution for a large variety of
tasks, such as classification, language processing, protein folding, and
others. However, their reliability is heavily plagued by adversarial inputs:
small input perturbations that cause the model to produce erroneous outputs.
Adversarial inputs can occur naturally when the system's environment behaves
randomly, even in the absence of a malicious adversary, and are a severe cause
for concern when attempting to deploy neural networks within critical systems.
In this paper, we present a new statistical method, called Robustness
Measurement and Assessment (RoMA), which can measure the expected robustness of
a neural network model. Specifically, RoMA determines the probability that a
random input perturbation might cause misclassification. The method allows us
to provide formal guarantees regarding the expected frequency of errors that a
trained model will encounter after deployment. Our approach can be applied to
large-scale, black-box neural networks, which is a significant advantage
compared to recently proposed verification methods. We apply our approach in
two ways: comparing the robustness of different models, and measuring how a
model's robustness is affected by the magnitude of input perturbation. One
interesting insight obtained through this work is that, in a classification
network, different output labels can exhibit very different robustness levels.
We term this phenomenon categorial robustness. Our ability to perform risk and
robustness assessments on a categorial basis opens the door to risk mitigation,
which may prove to be a significant step towards neural network certification
in safety-critical applications.
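As a rough illustration of the quantity RoMA estimates, and not the authors' statistical procedure, the following Python sketch uses plain Monte Carlo sampling to approximate the probability that a uniformly random perturbation of magnitude epsilon flips a black-box classifier's label; the model, input, and perturbation scale below are placeholders.

# A minimal sketch (not the RoMA procedure itself): Monte Carlo estimation of
# the probability that a random input perturbation changes a black-box model's
# predicted label. The model, input, and epsilon are placeholder assumptions.
import numpy as np

def perturbation_misclassification_rate(predict, x, epsilon, n_samples=10_000, rng=None):
    """Estimate P[argmax predict(x + delta) != argmax predict(x)] for
    uniform perturbations delta in [-epsilon, epsilon]^d."""
    rng = np.random.default_rng(rng)
    base_label = np.argmax(predict(x))
    flips = 0
    for _ in range(n_samples):
        delta = rng.uniform(-epsilon, epsilon, size=x.shape)
        if np.argmax(predict(x + delta)) != base_label:
            flips += 1
    return flips / n_samples

if __name__ == "__main__":
    # Toy linear "network" standing in for a black-box classifier with 2 labels.
    W = np.array([[1.0, -0.5], [-0.2, 0.8]])
    predict = lambda x: W @ x
    x0 = np.array([0.3, 0.25])
    print(perturbation_misclassification_rate(predict, x0, epsilon=0.1, n_samples=5000, rng=0))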
Statistical Guarantees for the Robustness of Bayesian Neural Networks
We introduce a probabilistic robustness measure for Bayesian Neural Networks
(BNNs), defined as the probability that, given a test point, there exists a
point within a bounded set such that the BNN prediction differs between the
two. Such a measure can be used, for instance, to quantify the probability of
the existence of adversarial examples. Building on statistical verification
techniques for probabilistic models, we develop a framework that allows us to
estimate probabilistic robustness for a BNN with statistical guarantees, i.e.,
with a priori error and confidence bounds. We provide experimental comparison
for several approximate BNN inference techniques on image classification tasks
associated with MNIST and a two-class subset of the GTSRB dataset. Our results
enable quantification of uncertainty of BNN predictions in adversarial
settings.
Comment: 9 pages, 6 figures
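A minimal sketch of the kind of sample-size argument such statistical guarantees typically rest on, and not the paper's exact framework: choose the number of posterior samples from the standard Hoeffding/Chernoff bound so that the estimated probability of an adversarial example carries a priori error and confidence bounds. The posterior sampler, the random inner search over the bounded set, and all parameters below are placeholder assumptions.

# Hoeffding bound: n >= ln(2/delta) / (2 * err^2) samples give an estimate within
# err of the true probability with confidence 1 - delta.
import math
import numpy as np

def required_samples(err, delta):
    """Smallest n satisfying the two-sided Hoeffding bound for accuracy err, confidence 1-delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * err ** 2))

def prob_adversarial_exists(sample_network, x, epsilon, err=0.05, delta=0.01,
                            n_attack_points=256, rng=None):
    """Estimate, over posterior samples, the probability that some point in the
    L_inf ball of radius epsilon around x changes the sampled network's prediction."""
    rng = np.random.default_rng(rng)
    n = required_samples(err, delta)
    hits = 0
    for _ in range(n):
        predict = sample_network(rng)          # one network drawn from the (approximate) posterior
        base = np.argmax(predict(x))
        # Crude stand-in for an adversarial search: random points in the epsilon-ball.
        deltas = rng.uniform(-epsilon, epsilon, size=(n_attack_points, x.size))
        if any(np.argmax(predict(x + d)) != base for d in deltas):
            hits += 1
    return hits / n

if __name__ == "__main__":
    # Toy "posterior": random 2-class linear networks with Gaussian weights.
    def sample_network(rng):
        W = rng.normal(0.0, 1.0, size=(2, 2))
        return lambda z: W @ z
    print(prob_adversarial_exists(sample_network, np.array([0.5, 0.1]), epsilon=0.05, rng=0))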
Data-Driven Assessment of Deep Neural Networks with Random Input Uncertainty
When using deep neural networks to operate safety-critical systems, assessing
the sensitivity of the network outputs when subject to uncertain inputs is of
paramount importance. Such assessment is commonly done using reachability
analysis or robustness certification. However, certification techniques
typically ignore localization information, while reachable set methods can fail
to issue robustness guarantees. Furthermore, many advanced methods are either
computationally intractable in practice or restricted to very specific models.
In this paper, we develop a data-driven optimization-based method capable of
simultaneously certifying the safety of network outputs and localizing them.
The proposed method provides a unified assessment framework, as it subsumes
state-of-the-art reachability analysis and robustness certification. The method
applies to deep neural networks of all sizes and structures, and to random
input uncertainty with a general distribution. We develop sufficient conditions
for the convexity of the underlying optimization, and for the number of data
samples to certify and localize the outputs with overwhelming probability. We
experimentally demonstrate the efficacy and tractability of the method on a
deep ReLU network.
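A purely illustrative sketch of data-driven output localization under random input uncertainty, not the paper's optimization-based method: propagate input samples through a small ReLU network, take the axis-aligned box of the resulting outputs as a localized set, and check its coverage on held-out samples. The network, input distribution, and sample counts below are placeholder assumptions; in the paper, the number of samples needed for high-probability guarantees comes from the underlying optimization theory rather than an empirical check.

import numpy as np

def relu_net(x, weights, biases):
    """Forward pass of a fully connected network with ReLU on hidden layers only."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)
    return weights[-1] @ h + biases[-1]

def output_box(sample_input, forward, n_samples=2000, rng=None):
    """Axis-aligned bounding box of the outputs over n_samples random inputs."""
    rng = np.random.default_rng(rng)
    ys = np.array([forward(sample_input(rng)) for _ in range(n_samples)])
    return ys.min(axis=0), ys.max(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = [rng.normal(size=(8, 3)), rng.normal(size=(2, 8))]
    biases = [rng.normal(size=8), rng.normal(size=2)]
    forward = lambda x: relu_net(x, weights, biases)
    sample_input = lambda r: r.normal(loc=0.0, scale=0.1, size=3)  # random input uncertainty
    lo, hi = output_box(sample_input, forward, rng=1)
    # Empirical coverage check on fresh samples.
    rng2 = np.random.default_rng(2)
    fresh = np.array([forward(sample_input(rng2)) for _ in range(1000)])
    coverage = np.all((fresh >= lo) & (fresh <= hi), axis=1).mean()
    print("output box:", lo, hi, "held-out coverage:", coverage)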
Probabilistic Safety for Bayesian Neural Networks
We study probabilistic safety for Bayesian Neural Networks (BNNs) under
adversarial input perturbations. Given a compact set of input points, T, we study
the probability w.r.t. the BNN posterior that all the points in T are mapped to
the same region in the output space. In
particular, this can be used to evaluate the probability that a network sampled
from the BNN is vulnerable to adversarial attacks. We rely on relaxation
techniques from non-convex optimization to develop a method for computing a
lower bound on probabilistic safety for BNNs, deriving explicit procedures for
the case of interval and linear function propagation techniques. We apply our
methods to BNNs trained on a regression task, airborne collision avoidance, and
MNIST, empirically showing that our approach allows one to certify
probabilistic safety of BNNs with millions of parameters.
Comment: UAI 2020; 13 pages, 5 figures, 1 table
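A minimal sketch of the general idea, assuming a placeholder mean-field Gaussian posterior over a tiny ReLU network, and not the paper's certified lower-bound procedure: draw weight samples from the posterior, run interval bound propagation (IBP) over the input box, and count how often the prediction provably cannot change anywhere inside the box.

import numpy as np

def ibp_forward(lo, hi, weights, biases):
    """Propagate the interval [lo, hi] through a ReLU network with interval bound propagation."""
    for i, (W, b) in enumerate(zip(weights, biases)):
        Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
        new_lo = Wp @ lo + Wn @ hi + b
        new_hi = Wp @ hi + Wn @ lo + b
        if i < len(weights) - 1:               # ReLU on hidden layers only
            new_lo, new_hi = np.maximum(new_lo, 0.0), np.maximum(new_hi, 0.0)
        lo, hi = new_lo, new_hi
    return lo, hi

def estimated_probabilistic_safety(sample_weights, x, eps, n_samples=500, rng=None):
    """Fraction of posterior samples whose prediction IBP certifies as constant on the eps-box around x."""
    rng = np.random.default_rng(rng)
    lo, hi = x - eps, x + eps
    safe = 0
    for _ in range(n_samples):
        weights, biases = sample_weights(rng)
        # Forward pass at the box centre (a degenerate interval) gives the reference class.
        c = int(np.argmax(ibp_forward(x, x, weights, biases)[0]))
        out_lo, out_hi = ibp_forward(lo, hi, weights, biases)
        if all(out_lo[c] > out_hi[j] for j in range(len(out_lo)) if j != c):
            safe += 1
    return safe / n_samples

if __name__ == "__main__":
    def sample_weights(rng):
        # Placeholder mean-field Gaussian posterior over a 2-4-2 ReLU network.
        ws = [rng.normal(0.0, 0.3, size=(4, 2)), rng.normal(0.0, 0.3, size=(2, 4))]
        bs = [rng.normal(0.0, 0.1, size=4), rng.normal(0.0, 0.1, size=2)]
        return ws, bs
    print(estimated_probabilistic_safety(sample_weights, np.array([0.5, -0.2]), eps=0.01, rng=0))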