Search CORE

4,792 research outputs found

Defense against Universal Adversarial Perturbations

Author: Akhtar Naveed
Liu Jian
Mian Ajmal
Publication venue
Publication date: 28/02/2018
Field of study

Recent advances in Deep Learning show the existence of image-agnostic quasi-imperceptible perturbations that when applied to `any' image can fool a state-of-the-art network classifier to change its prediction about the image label. These `Universal Adversarial Perturbations' pose a serious threat to the success of Deep Learning in practice. We present the first dedicated framework to effectively defend the networks against such perturbations. Our approach learns a Perturbation Rectifying Network (PRN) as `pre-input' layers to a targeted model, such that the targeted model needs no modification. The PRN is learned from real and synthetic image-agnostic perturbations, where an efficient method to compute the latter is also proposed. A perturbation detector is separately trained on the Discrete Cosine Transform of the input-output difference of the PRN. A query image is first passed through the PRN and verified by the detector. If a perturbation is detected, the output of the PRN is used for label prediction instead of the actual image. A rigorous evaluation shows that our framework can defend the network classifiers against unseen adversarial perturbations in the real-world scenarios with up to 97.5% success rate. The PRN also generalizes well in the sense that training for one targeted network defends another network with a comparable success rate.Comment: Accepted in IEEE CVPR 201

arXiv.org e-Print Archive

Crossref

Procedural Noise Adversarial Examples for Black-Box Attacks on Deep Convolutional Networks

Author: Athalye Anish
Barreno Marco
Bhagoji Arjun Nitin
Carlini Nicholas
Girshick Ross
Ilyas Andrew
Khrulkov Valentin
Kurakin Alexey
Mopuri Konda
Moćkus J
Rifai Salah
Szegedy Christian
Tramér Florian
Zhou Wen
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 23/06/2019
Field of study

Deep Convolutional Networks (DCNs) have been shown to be vulnerable to adversarial examples---perturbed inputs specifically designed to produce intentional errors in the learning algorithms at test time. Existing input-agnostic adversarial perturbations exhibit interesting visual patterns that are currently unexplained. In this paper, we introduce a structured approach for generating Universal Adversarial Perturbations (UAPs) with procedural noise functions. Our approach unveils the systemic vulnerability of popular DCN models like Inception v3 and YOLO v3, with single noise patterns able to fool a model on up to 90% of the dataset. Procedural noise allows us to generate a distribution of UAPs with high universal evasion rates using only a few parameters. Additionally, we propose Bayesian optimization to efficiently learn procedural noise parameters to construct inexpensive untargeted black-box attacks. We demonstrate that it can achieve an average of less than 10 queries per successful attack, a 100-fold improvement on existing methods. We further motivate the use of input-agnostic defences to increase the stability of models to adversarial perturbations. The universality of our attacks suggests that DCN models may be sensitive to aggregations of low-level class-agnostic features. These findings give insight on the nature of some universal adversarial perturbations and how they could be generated in other applications.Comment: 16 pages, 10 figures. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security (CCS '19

arXiv.org e-Print Archive

Crossref

Spiral - Imperial College Digital Repository

Art of singular vectors and universal adversarial perturbations

Author: Khrulkov Valentin
Oseledets Ivan
Publication venue
Publication date: 19/11/2017
Field of study

Vulnerability of Deep Neural Networks (DNNs) to adversarial attacks has been attracting a lot of attention in recent studies. It has been shown that for many state of the art DNNs performing image classification there exist universal adversarial perturbations --- image-agnostic perturbations mere addition of which to natural images with high probability leads to their misclassification. In this work we propose a new algorithm for constructing such universal perturbations. Our approach is based on computing the so-called

(p, q)

-singular vectors of the Jacobian matrices of hidden layers of a network. Resulting perturbations present interesting visual patterns, and by using only 64 images we were able to construct universal perturbations with more than 60 \% fooling rate on the dataset consisting of 50000 images. We also investigate a correlation between the maximal singular value of the Jacobian matrix and the fooling rate of the corresponding singular vector, and show that the constructed perturbations generalize across networks.Comment: Submitted to CVPR 201

arXiv.org e-Print Archive

Crossref

Learning Universal Adversarial Perturbations with Generative Models

Author: Danezis George
Hayes Jamie
Publication venue
Publication date: 05/01/2018
Field of study

Neural networks are known to be vulnerable to adversarial examples, inputs that have been intentionally perturbed to remain visually similar to the source input, but cause a misclassification. It was recently shown that given a dataset and classifier, there exists so called universal adversarial perturbations, a single perturbation that causes a misclassification when applied to any input. In this work, we introduce universal adversarial networks, a generative network that is capable of fooling a target classifier when it's generated output is added to a clean sample from a dataset. We show that this technique improves on known universal adversarial attacks

arXiv.org e-Print Archive

Crossref

UCL Discovery