Attacking Binarized Neural Networks
Neural networks with low-precision weights and activations offer compelling
efficiency advantages over their full-precision equivalents. The two most
frequently discussed benefits of quantization are reduced memory consumption,
and a faster forward pass when implemented with efficient bitwise operations.
We propose a third benefit of very low-precision neural networks: improved
robustness against some adversarial attacks, and in the worst case, performance
that is on par with full-precision models. We focus on the very low-precision
case where weights and activations are both quantized to ±1, and note that
stochastically quantizing weights in just one layer can sharply reduce the
impact of iterative attacks. We observe that non-scaled binary neural networks
exhibit a similar effect to the original defensive distillation procedure that
led to gradient masking, and a false notion of security. We address this by
conducting both black-box and white-box experiments with binary models that do
not artificially mask gradients. Comment: Published as a conference paper at ICLR 2018.
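To make the mechanism concrete, below is a minimal NumPy sketch of stochastic weight binarization of the kind the abstract refers to, using the standard hard-sigmoid rule from the BNN training literature. The function name and the resample-on-every-forward-pass policy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def stochastic_binarize(w_latent, rng):
    """Stochastically binarize latent weights in [-1, 1] to {-1, +1}.

    P(w_b = +1) = clip((w + 1) / 2, 0, 1), the hard-sigmoid rule commonly
    used when training BNNs. Resampling on every forward pass means an
    iterative attack computes each gradient step against a slightly
    different binary network.
    """
    p_plus = np.clip((w_latent + 1.0) / 2.0, 0.0, 1.0)
    return np.where(rng.random(w_latent.shape) < p_plus, 1.0, -1.0)

rng = np.random.default_rng(0)
w_latent = rng.uniform(-1.0, 1.0, size=(4, 3))
print(stochastic_binarize(w_latent, rng))  # two calls on the same latent
print(stochastic_binarize(w_latent, rng))  # weights give two realizations
```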
Combinatorial Attacks on Binarized Neural Networks
Binarized Neural Networks (BNNs) have recently attracted significant interest
due to their computational efficiency. Concurrently, it has been shown that
neural networks may be overly sensitive to "attacks" - tiny adversarial changes
in the input - which may be detrimental to their use in safety-critical
domains. Designing attack algorithms that effectively fool trained models is a
key step towards learning robust neural networks. The discrete,
non-differentiable nature of BNNs, which distinguishes them from their
full-precision counterparts, poses a challenge to gradient-based attacks. In
this work, we study the problem of attacking a BNN through the lens of
combinatorial and integer optimization. We propose a Mixed Integer Linear
Programming (MILP) formulation of the problem. While exact and flexible, the
MILP quickly becomes intractable as the network and perturbation space grow. To
address this issue, we propose IProp, a decomposition-based algorithm that
solves a sequence of much smaller MILP problems. Experimentally, we evaluate
both proposed methods against the standard gradient-based attack (FGSM) on
MNIST and Fashion-MNIST, and show that IProp performs favorably compared to
FGSM, while scaling beyond the limits of the MILP.
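For reference, the gradient-based baseline the abstract compares against can be sketched in a few lines. This is a generic single-step FGSM in PyTorch, not the paper's IProp or MILP code; for a BNN the input gradient would typically come from a straight-through estimator.

```python
import torch

def fgsm_attack(model, x, y, epsilon):
    """Single-step FGSM: move x by epsilon in the direction of the sign of
    the input gradient of the loss, then clamp back to the valid pixel range."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv + epsilon * x_adv.grad.sign()
        return x_adv.clamp(0.0, 1.0).detach()
```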
Probabilistic Binary Neural Networks
Low bit-width weights and activations are an effective way of combating the
increasing need for both memory and compute power of Deep Neural Networks. In
this work, we present a probabilistic training method for a Neural Network with
both binary weights and activations, called BLRNet. By embracing stochasticity
during training, we circumvent the need to approximate the gradient of
non-differentiable functions such as sign(), while still obtaining a fully
Binary Neural Network at test time. Moreover, it allows for anytime ensemble
predictions for improved performance and uncertainty estimates by sampling from
the weight distribution. Since all operations in a layer of the BLRNet operate
on random variables, we introduce stochastic versions of Batch Normalization
and max pooling, which transfer well to a deterministic network at test time.
We evaluate the BLRNet on multiple standardized benchmarks.
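The "anytime ensemble" idea can be illustrated with a small sketch: sample several binary networks from per-weight probabilities and average their predictions. Here `forward` and `p_plus` are placeholders for a trained network's forward pass and its learned weight distribution; this is not the paper's implementation.

```python
import numpy as np

def sample_binary_weights(p_plus, rng):
    """Draw one {-1, +1} weight realization from per-weight probabilities P(w = +1)."""
    return np.where(rng.random(p_plus.shape) < p_plus, 1.0, -1.0)

def anytime_ensemble_predict(forward, p_plus, x, n_samples, rng):
    """Average class probabilities over sampled binary networks; more samples
    give a better estimate, and their spread can serve as an uncertainty cue."""
    preds = [forward(sample_binary_weights(p_plus, rng), x) for _ in range(n_samples)]
    return np.mean(preds, axis=0)
```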
Predicting Adversarial Examples with High Confidence
It has been suggested that adversarial examples cause deep learning models to
make incorrect predictions with high confidence. In this work, we take the
opposite stance: an overly confident model is more likely to be vulnerable to
adversarial examples. This work is one of the most proactive approaches taken
to date, as we link robustness with non-calibrated model confidence on noisy
images, providing a data-augmentation-free path forward. The adversarial
examples phenomenon is most easily explained by the trend of increasing
non-regularized model capacity, while the diversity and number of samples in
common datasets have remained flat. Test accuracy has incorrectly been
associated with true generalization performance, ignoring that training and
test splits are often extremely similar in terms of the overall representation
space. The transferability property of adversarial examples was previously used
as evidence against overfitting arguments, a perceived random effect, but
overfitting is not always random. Comment: Under review by the International Conference on Machine Learning (ICML 2018).
SafetyNet: Detecting and Rejecting Adversarial Examples Robustly
We describe a method to produce a network where current methods such as
DeepFool have great difficulty producing adversarial samples. Our construction
suggests some insights into how deep networks work. We provide a reasonable
analysis of why our construction is difficult to defeat, and show experimentally
that our method is hard to defeat with both Type I and Type II attacks using
several standard networks and datasets. We apply this SafetyNet architecture to
an important and novel application, SceneProof, which can reliably detect
whether an image is a picture of a real scene or not. SceneProof applies to
images captured with depth maps (RGBD images) and checks if a pair of image and
depth map is consistent. It relies on the relative difficulty of producing
naturalistic depth maps for images in post processing. We demonstrate that our
SafetyNet is robust to adversarial examples built from currently known
attacking approaches. Comment: Accepted to ICCV 2017.
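A detect-and-reject wrapper of the general shape described above might look like the sketch below. The `classify` and `detector_score` callables and the threshold are placeholders; the internals of the actual SafetyNet detector are abstracted away here.

```python
class DetectAndReject:
    """Wrap a classifier with an adversarial-input detector: inputs the
    detector flags are rejected instead of being classified."""

    def __init__(self, classify, detector_score, threshold):
        self.classify = classify              # x -> predicted label
        self.detector_score = detector_score  # x -> scalar, higher = more suspicious
        self.threshold = threshold

    def predict(self, x):
        if self.detector_score(x) > self.threshold:
            return None  # reject: input flagged as likely adversarial
        return self.classify(x)
```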
Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness
Adversarial examples are malicious inputs crafted to cause a model to
misclassify them. Their most common instantiation, "perturbation-based"
adversarial examples introduce changes to the input that leave its true label
unchanged, yet result in a different model prediction. Conversely,
"invariance-based" adversarial examples insert changes to the input that leave
the model's prediction unaffected despite the underlying input's label having
changed.
In this paper, we demonstrate that robustness to perturbation-based
adversarial examples is not only insufficient for general robustness, but
worse, it can also increase vulnerability of the model to invariance-based
adversarial examples. In addition to analytical constructions, we empirically
study vision classifiers with state-of-the-art robustness to perturbation-based
adversaries constrained by an $\ell_\infty$ norm. We mount attacks that exploit
excessive model invariance in directions relevant to the task, which are able
to find adversarial examples within the $\ell_\infty$ ball. In fact, we find that
classifiers trained to be $\ell_\infty$-norm robust are more vulnerable to
invariance-based adversarial examples than their undefended counterparts.
Excessive invariance is not limited to models trained to be robust to
perturbation-based $\ell_\infty$-norm adversaries. In fact, we argue that the term
adversarial example is used to capture a series of model limitations, some of
which may not have been discovered yet. Accordingly, we call for a set of
precise definitions that taxonomize and address each of these shortcomings in
learning. Comment: Accepted at the ICLR 2019 SafeML Workshop.
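The contrast between the two notions can be written compactly. In the sketch below, O(x) denotes the oracle (true) label, f(x) the model's prediction, and the norm and epsilon stand for whatever threat model is assumed; the symbols are illustrative rather than the paper's exact formalism.

```latex
% Perturbation-based vs. invariance-based adversarial examples
\begin{align*}
\text{perturbation-based: } & \exists\, x' : \ \|x' - x\| \le \varepsilon,\quad
    O(x') = O(x),\quad f(x') \ne f(x),\\
\text{invariance-based: }   & \exists\, x' : \ O(x') \ne O(x),\quad f(x') = f(x).
\end{align*}
```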
Defending against substitute model black box adversarial attacks with the 01 loss
Substitute model black box attacks can create adversarial examples for a
target model just by accessing its output labels. This poses a major challenge
to machine learning models in practice, particularly in security sensitive
applications. The 01 loss model is known to be more robust to outliers and
noise than convex models that are typically used in practice. Motivated by
these properties we present 01 loss linear and 01 loss dual layer neural
network models as a defense against transfer based substitute model black box
attacks. We compare the accuracy of adversarial examples from substitute model
black box attacks targeting our 01 loss models and their convex counterparts
for binary classification on popular image benchmarks. Our 01 loss dual layer
neural network has an adversarial accuracy of 66.2%, 58%, 60.5%, and 57% on
MNIST, CIFAR10, STL10, and ImageNet respectively whereas the sigmoid activated
logistic loss counterpart has accuracies of 63.5%, 19.3%, 14.9%, and 27.6%.
Except for MNIST the convex counterparts have substantially lower adversarial
accuracies. We show practical applications of our models to deter traffic sign
and facial recognition adversarial attacks. On GTSRB street sign and CelebA
facial detection our 01 loss network has 34.6% and 37.1% adversarial accuracy
respectively, whereas the convex logistic counterpart has accuracies of 24% and 1.9%.
Finally we show that our 01 loss network can attain robustness on par with
simple convolutional neural networks and much higher than its convex
counterpart even when attacked with a convolutional network substitute model.
Our work shows that 01 loss models offer a powerful defense against substitute
model black box attacks. Comment: arXiv admin note: substantial text overlap with arXiv:2006.07800; text overlap with arXiv:2008.0914
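For clarity, the two losses being compared can be written down directly. This is a generic NumPy sketch of a linear classifier's 0-1 loss and its convex logistic surrogate, not the authors' training code; optimizing the non-convex 0-1 loss requires specialized search procedures that are out of scope here.

```python
import numpy as np

def zero_one_loss(w, b, X, y):
    """Empirical 0-1 loss of the linear classifier sign(Xw + b) against labels
    y in {-1, +1}: the fraction of misclassified points. It is non-convex and
    non-differentiable, which is what makes gradients from a convex substitute
    model transfer poorly to it."""
    return np.mean(np.sign(X @ w + b) != y)

def logistic_loss(w, b, X, y):
    """Convex surrogate used by the sigmoid/logistic baseline in the abstract."""
    margins = y * (X @ w + b)
    return np.mean(np.log1p(np.exp(-margins)))
```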
ComDefend: An Efficient Image Compression Model to Defend Adversarial Examples
Deep neural networks (DNNs) have been demonstrated to be vulnerable to
adversarial examples. Specifically, adding imperceptible perturbations to clean
images can fool the well trained deep neural networks. In this paper, we
propose an end-to-end image compression model to defend adversarial examples:
ComDefend. The proposed model consists of a compression convolutional
neural network (ComCNN) and a reconstruction convolutional neural network
(ResCNN). The ComCNN is used to maintain the structural information of the
original image and purify adversarial perturbations, while the ResCNN is used to
reconstruct the original image with high quality. In other words, ComDefend can
transform the adversarial image to its clean version, which is then fed to the
trained classifier. Our method is a pre-processing module, and does not modify
the classifier's structure during the whole process. Therefore, it can be
combined with other model-specific defense models to jointly improve the
classifier's robustness. A series of experiments conducted on MNIST, CIFAR10
and ImageNet show that the proposed method outperforms the state-of-the-art
defense methods, and is consistently effective in protecting classifiers against
adversarial attacks.
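The pre-processing pipeline described above can be sketched as follows. Here `comcnn`, `rescnn`, and `classifier` are placeholder PyTorch modules, and any quantization or noise injection applied to the compact code in the actual method is omitted.

```python
import torch

def defended_predict(comcnn, rescnn, classifier, x):
    """Pre-processing defense: compress the (possibly adversarial) input,
    reconstruct a clean version, then feed it to the unmodified classifier."""
    with torch.no_grad():
        code = comcnn(x)        # compact representation; perturbations suppressed
        x_clean = rescnn(code)  # high-quality reconstruction of the input
        return classifier(x_clean)
```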
Exploiting Verified Neural Networks via Floating Point Numerical Error
We show how to construct adversarial examples for neural networks with
exactly verified robustness against $\ell_\infty$-bounded input perturbations
by exploiting floating point error. We argue that any exact verification of
real-valued neural networks must accurately model the implementation details of
any floating point arithmetic used during inference or verification.
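A toy NumPy example of the underlying issue: two mathematically identical orderings of the same float32 dot product round differently, so a verifier that reasons over exact reals can certify a margin the deployed arithmetic does not honor. The specific numbers are illustrative, not taken from the paper.

```python
import numpy as np

x = np.array([1e8, 1.0, -1e8], dtype=np.float32)
w = np.array([1.0, 1.0, 1.0], dtype=np.float32)

# Accumulate left to right in float32: 1e8 + 1.0 rounds back to 1e8,
# so the final sum is 0.0.
acc = np.float32(0.0)
for xi, wi in zip(x, w):
    acc = np.float32(acc + xi * wi)

# Mathematically equivalent reordering: (1e8 - 1e8) + 1.0 = 1.0 exactly.
reordered = np.float32(x[0] * w[0] + x[2] * w[2]) + x[1] * w[1]

print(acc, reordered)  # prints 0.0 1.0: same dot product, different results
```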
Towards a Robust Deep Neural Network in Texts: A Survey
Deep neural networks (DNNs) have achieved remarkable success in various tasks
(e.g., image classification, speech recognition, and natural language
processing). However, research has shown that DNN models are vulnerable to
adversarial examples, which cause incorrect predictions when imperceptible
perturbations are added to normal inputs. Adversarial examples in the image
domain have been well investigated, but research on texts is far less developed,
let alone a comprehensive survey of the field. In this paper, we aim to
present a comprehensive understanding of adversarial attacks and
corresponding mitigation strategies in texts. Specifically, we first give a
taxonomy of adversarial attacks and defenses in texts from the perspective of
different natural language processing (NLP) tasks, and then introduce how to
build a robust DNN model via testing and verification. Finally, we discuss the
existing challenges of adversarial attacks and defenses in texts and present
the future research directions in this emerging field.
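As a concrete illustration of the kind of text attack such a survey covers, here is a minimal greedy word-substitution sketch. The toy synonym table, the function name, and the black-box `predict_label` callable are all hypothetical; real attacks add semantic and fluency constraints to keep the perturbations imperceptible.

```python
# Toy synonym table standing in for an embedding- or thesaurus-based candidate set.
TOY_SYNONYMS = {"great": ["fine", "decent"], "terrible": ["poor", "bad"]}

def greedy_word_substitution(sentence, predict_label):
    """Greedily try synonym swaps and return the first rewrite that flips the
    black-box classifier's prediction, or None if no swap succeeds."""
    words = sentence.split()
    original = predict_label(" ".join(words))
    for i, w in enumerate(words):
        for candidate in TOY_SYNONYMS.get(w.lower(), []):
            perturbed = words[:i] + [candidate] + words[i + 1:]
            if predict_label(" ".join(perturbed)) != original:
                return " ".join(perturbed)  # adversarial rewrite found
    return None
```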