Adversarial Robustness Guarantees for Random Deep Neural Networks
The reliability of deep learning algorithms is fundamentally challenged by
the existence of adversarial examples, which are incorrectly classified inputs
that are extremely close to a correctly classified input. We explore the
properties of adversarial examples for deep neural networks with random weights
and biases, and prove that, for any p ≥ 1, the ℓ^p distance of any given
input from the classification boundary scales as one over the square root of
the dimension of the input times the ℓ^p norm of the input. The results
are based on the recently proved equivalence between Gaussian processes and
deep neural networks in the limit of infinite width of the hidden layers, and
are validated with experiments on both random deep neural networks and deep
neural networks trained on the MNIST and CIFAR10 datasets. The results
constitute a fundamental advance in the theoretical understanding of
adversarial examples, and open the way to a thorough theoretical
characterization of the relation between network architecture and robustness to
adversarial perturbations.
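To make the claimed scaling concrete, here is a small numerical sketch (not the paper's proof technique): it probes the distance to the decision boundary of a randomly initialized ReLU network along random directions and compares it with ||x|| / sqrt(n). The architecture, number of probe directions, and bisection tolerance are illustrative assumptions, not values taken from the paper.

    # Hedged sketch: empirically probe how the distance to the decision boundary of a
    # randomly initialized ReLU network compares with ||x|| / sqrt(n) as the input
    # dimension grows. Widths, depth and probe counts are illustrative choices.
    import numpy as np

    def random_relu_net(dims, rng):
        """Return a function computing a random ReLU network with two output logits."""
        weights = [rng.standard_normal((m, n)) / np.sqrt(n) for n, m in zip(dims[:-1], dims[1:])]
        biases = [0.1 * rng.standard_normal(m) for m in dims[1:]]
        def f(x):
            for W, b in zip(weights[:-1], biases[:-1]):
                x = np.maximum(W @ x + b, 0.0)
            return weights[-1] @ x + biases[-1]
        return f

    def boundary_distance(f, x, direction, t_max=10.0, iters=40):
        """Bisect for the smallest step t at which the label flips along x + t * direction."""
        label = np.argmax(f(x))
        if np.argmax(f(x + t_max * direction)) == label:
            return np.inf                      # no flip found along this direction
        lo, hi = 0.0, t_max
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if np.argmax(f(x + mid * direction)) == label:
                lo = mid
            else:
                hi = mid
        return hi

    rng = np.random.default_rng(0)
    for n in (64, 256, 1024):
        f = random_relu_net([n, 512, 512, 2], rng)
        x = rng.standard_normal(n)
        d = min(boundary_distance(f, x, v / np.linalg.norm(v))
                for v in rng.standard_normal((20, n)))
        print(f"n={n:5d}  ||x||/sqrt(n)={np.linalg.norm(x) / np.sqrt(n):.3f}  distance to boundary ≈ {d:.3f}")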
Probabilistic Safety for Bayesian Neural Networks
We study probabilistic safety for Bayesian Neural Networks (BNNs) under
adversarial input perturbations. Given a compact set of input points T ⊆ ℝ^m, we study the probability w.r.t. the BNN posterior that
all the points in T are mapped to the same region S in the output space. In
particular, this can be used to evaluate the probability that a network sampled
from the BNN is vulnerable to adversarial attacks. We rely on relaxation
techniques from non-convex optimization to develop a method for computing a
lower bound on probabilistic safety for BNNs, deriving explicit procedures for
the case of interval and linear function propagation techniques. We apply our
methods to BNNs trained on a regression task, airborne collision avoidance, and
MNIST, empirically showing that our approach allows one to certify
probabilistic safety of BNNs with millions of parameters.Comment: UAI 2020; 13 pages, 5 figures, 1 tabl
PopSkipJump: Decision-Based Attack for Probabilistic Classifiers
Most current classifiers are vulnerable to adversarial examples, small input
perturbations that change the classification output. Many existing attack
algorithms cover various settings, from white-box to black-box classifiers, but
typically assume that the answers are deterministic and often fail when they
are not. We therefore propose a new adversarial decision-based attack
specifically designed for classifiers with probabilistic outputs. It is based
on the HopSkipJump attack by Chen et al. (2019, arXiv:1904.02144v5), a strong
and query-efficient decision-based attack originally designed for deterministic
classifiers. Our P(robabilisticH)opSkipJump attack adapts its number of queries
to maintain HopSkipJump's original output quality across various noise levels,
while converging to its query efficiency as the noise level decreases. We test
our attack on various noise models, including state-of-the-art off-the-shelf
randomized defenses, and show that they offer almost no extra robustness to
decision-based attacks. Code is available at
https://github.com/cjsg/PopSkipJump.
Comment: ICML'21. Code available at https://github.com/cjsg/PopSkipJump. 9 pages & 7 figures in main part, 14 pages & 10 figures in appendix
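The core idea of adapting the number of queries to the noise level can be shown with a toy sketch (this is not the PopSkipJump algorithm itself): repeat a noisy classifier's decision and majority-vote until the vote fraction is confidently separated from 0.5, so that fewer queries are spent when the noise is small. The classifier, noise model, and stopping rule below are toy assumptions.

    # Hedged sketch: adaptive repeated querying of a noisy binary classifier.
    # The decision uses a majority vote; the number of queries grows with the noise level.
    import numpy as np

    rng = np.random.default_rng(0)

    def noisy_classifier(x, noise):
        """Toy binary classifier: sign of the first coordinate, with label-flip noise."""
        label = int(x[0] > 0)
        return label if rng.random() > noise else 1 - label

    def adaptive_decision(x, noise, max_queries=500, z=2.0):
        """Query until the vote fraction is z standard errors away from 0.5 (or the budget runs out)."""
        votes = []
        while len(votes) < max_queries:
            votes.append(noisy_classifier(x, noise))
            n, p = len(votes), float(np.mean(votes))
            if n >= 5 and abs(p - 0.5) > z * np.sqrt(p * (1 - p) / n + 1e-9):
                break
        return int(np.mean(votes) > 0.5), len(votes)

    for noise in (0.0, 0.1, 0.3, 0.45):
        label, n_queries = adaptive_decision(np.array([0.7, -0.2]), noise)
        print(f"noise={noise:.2f}  decision={label}  queries used={n_queries}")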
Adversarial vulnerability bounds for Gaussian process classification
Protecting ML classifiers from adversarial examples is crucial. We propose that the main threat is an attacker perturbing a confidently classified input to produce a confident misclassification. In this paper we consider the L0 attack, in which a small number of inputs can be perturbed by the attacker at test time. To quantify the risk of this form of attack, we have devised a formal guarantee in the form of an adversarial bound (AB) for a binary Gaussian process classifier using the EQ kernel. This bound holds for the entire input domain, bounding the potential of any future adversarial attack to cause a confident misclassification. We explore how to extend the bound to other kernels and investigate how to maximise it by altering the classifier (for example by using sparse approximations). We test the bound on a variety of datasets and show that it produces relevant and practical bounds for many of them.
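To make the threat model tangible (this is an illustration only, not the paper's adversarial bound), the sketch below brute-forces the worst single-feature (L0 = 1) perturbation of a confidently classified point for a GP classifier with an EQ/RBF kernel and reports the drop in confidence. The dataset, kernel hyperparameters, and perturbation grid are illustrative assumptions.

    # Hedged sketch of an L0 = 1 attack surface on a GP classifier (not a certified bound).
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.gaussian_process import GaussianProcessClassifier
    from sklearn.gaussian_process.kernels import RBF

    X, y = load_breast_cancer(return_X_y=True)
    X = (X - X.mean(0)) / X.std(0)                               # standardize features
    gpc = GaussianProcessClassifier(kernel=RBF(length_scale=3.0)).fit(X[:300], y[:300])

    x = X[300].copy()                                            # a held-out test point
    pred = gpc.predict(x[None])[0]
    cls_idx = int(np.where(gpc.classes_ == pred)[0][0])
    base = gpc.predict_proba(x[None])[0, cls_idx]

    worst = base
    for i in range(x.shape[0]):                                  # perturb one feature at a time
        for delta in np.linspace(-3.0, 3.0, 13):
            x_adv = x.copy()
            x_adv[i] += delta
            worst = min(worst, gpc.predict_proba(x_adv[None])[0, cls_idx])
    print(f"confidence in original class: clean={base:.3f}, worst single-feature perturbation={worst:.3f}")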
How to Certify Machine Learning Based Safety-critical Systems? A Systematic Literature Review
Context: Machine Learning (ML) has been at the heart of many innovations over
the past years. However, including it in so-called 'safety-critical' systems
such as automotive or aeronautic ones has proven to be very challenging, since
the paradigm shift that ML brings completely changes traditional certification
approaches.
Objective: This paper aims to elucidate challenges related to the
certification of ML-based safety-critical systems, as well as the solutions
that are proposed in the literature to tackle them, answering the question 'How
to Certify Machine Learning Based Safety-critical Systems?'.
Method: We conduct a Systematic Literature Review (SLR) of research papers
published between 2015 and 2020, covering topics related to the certification of
ML systems. In total, we identified 217 papers covering topics considered to be
the main pillars of ML certification: Robustness, Uncertainty, Explainability,
Verification, Safe Reinforcement Learning, and Direct Certification. We
analyzed the main trends and problems of each sub-field and provided summaries
of the papers extracted.
Results: The SLR results highlighted the enthusiasm of the community for this
subject, as well as the lack of diversity in terms of datasets and types of
models. They also emphasized the need to further develop connections between
academia and industry to deepen the study of the domain. Finally, they
illustrated the necessity of building connections between the above-mentioned
main pillars, which are for now mainly studied separately.
Conclusion: We highlighted the current efforts deployed to enable the
certification of ML-based software systems and discussed some future research
directions.
Comment: 60 pages (92 pages with references and complements), submitted to a
journal (Automated Software Engineering). Changes: emphasizing the difference
between the traditional software engineering and ML approaches; adding Related
Works, Threats to Validity, and Complementary Materials; adding a table listing
paper references for each section/subsection
Robustness Guarantees for Bayesian Inference with Gaussian Processes
Bayesian inference and Gaussian processes are widely used in applications ranging from robotics and control to biological systems. Many of these applications are safety-critical and require a characterization of the uncertainty associated with the learning model and formal guarantees on its predictions. In this paper we define a robustness measure for Bayesian inference against input perturbations, given by the probability that, for a test point and a compact set in the input space containing the test point, the prediction of the learning model will remain δ-close for all the points in the set, for δ > 0. Such measures can be used to provide formal probabilistic guarantees for the absence of adversarial examples. By employing the theory of Gaussian processes, we derive upper bounds on the resulting robustness by utilising the Borell-TIS inequality, and propose algorithms for their computation. We evaluate our techniques on two examples, a GP regression problem and a fully-connected deep neural network, where we rely on weak convergence to GPs to study adversarial examples on the MNIST dataset.
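A Monte Carlo sketch of the robustness measure defined above (the paper instead derives analytic upper bounds via the Borell-TIS inequality): estimate the probability that GP posterior samples stay δ-close to their value at a test point over a small interval around it. The training data, kernel, δ, and grid discretization are illustrative assumptions.

    # Hedged sketch: sample from a GP posterior and estimate P(sup_T |f(x) - f(x*)| < δ)
    # over a discretized compact set T around the test point x*. Analytic bounds via the
    # Borell-TIS inequality (as in the paper) would replace this sampling estimate.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(0)
    X_train = rng.uniform(-2.0, 2.0, size=(30, 1))
    y_train = np.sin(2.0 * X_train[:, 0]) + 0.05 * rng.standard_normal(30)
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-3).fit(X_train, y_train)

    x_star, radius, delta = 0.3, 0.1, 0.2
    grid = np.linspace(x_star - radius, x_star + radius, 50)[:, None]   # discretized compact set T
    samples = gp.sample_y(np.vstack([[x_star], grid]), n_samples=2000, random_state=0)

    f_star, f_T = samples[0], samples[1:]                               # value at x* vs. values over T
    robust = np.mean(np.max(np.abs(f_T - f_star), axis=0) < delta)
    print(f"estimated P(sup_T |f(x) - f(x*)| < {delta}) ≈ {robust:.3f}")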