Lower bounds on the robustness to adversarial perturbations
The input-output mappings learned by state-of-the-art neural networks are significantly discontinuous. It is possible to cause a neural network used for image recognition to misclassify its input by applying very specific, hardly perceptible perturbations, called adversarial perturbations. Many hypotheses have been proposed to explain the existence of these peculiar samples, as well as several methods to mitigate them, but a proven explanation remains elusive. In this work, we take steps towards a formal characterization of adversarial perturbations by deriving lower bounds on the magnitudes of perturbations necessary to change the classification of neural networks. The proposed bounds can be computed efficiently, requiring time at most linear in the number of parameters and hyperparameters of the model for any given sample. This makes them suitable for model selection, when one wishes to find out which of several proposed classifiers is most robust to adversarial perturbations. They may also serve as a basis for techniques that increase the robustness of classifiers, since they enjoy the theoretical guarantee that no adversarial perturbation can be smaller than the quantities provided by the bounds. We experimentally verify the bounds on the MNIST and CIFAR-10 data sets and find no violations. Additionally, the experimental results suggest that very small adversarial perturbations may occur with non-zero probability on natural samples.
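The abstract's bounds are layer-wise and model-specific; as a minimal sketch of the general principle behind such certificates, the snippet below computes a Lipschitz-margin lower bound: if the logit margin g(x) = f_c(x) - max_{j!=c} f_j(x) is L-Lipschitz, any perturbation that changes the predicted class must have norm at least g(x)/L. The function name and the assumption of a known Lipschitz constant are ours, not the paper's.

```python
import numpy as np

def lipschitz_margin_bound(logits, lipschitz_const):
    """Lower bound on the norm of any class-changing perturbation,
    assuming the logit-margin function is lipschitz_const-Lipschitz.
    Illustrative only; the paper derives tighter layer-wise bounds."""
    c = int(np.argmax(logits))
    runner_up = np.max(np.delete(logits, c))
    margin = logits[c] - runner_up           # g(x) >= 0 at the prediction
    return margin / lipschitz_const          # ||delta|| >= g(x) / L

# Example: a confident prediction with a hypothetical Lipschitz constant.
print(lipschitz_margin_bound(np.array([4.0, 1.0, 0.5]), lipschitz_const=10.0))
```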
Random deep neural networks are biased towards simple functions
We prove that the binary classifiers of bit strings generated by random wide deep neural networks with ReLU activation function are biased towards simple functions. The simplicity is captured by the following two properties. For any given input bit string, the average Hamming distance of the closest input bit string with a different classification is at least $\sqrt{n/(2\pi \log n)}$, where $n$ is the length of the string. Moreover, if the bits of the initial string are flipped randomly, the average number of flips required to change the classification grows linearly with $n$. These results are confirmed by numerical experiments on deep neural networks with two hidden layers, and settle the conjecture that random deep neural networks are biased towards simple functions. This conjecture was proposed and numerically explored in [Valle Pérez et al., ICLR 2019] to explain the unreasonably good generalization properties of deep learning algorithms. The probability distribution of the functions generated by random deep neural networks is a good choice for the prior probability distribution in the PAC-Bayesian generalization bounds. Our results constitute a fundamental step forward in the characterization of this distribution, therefore contributing to the understanding of the generalization properties of deep learning algorithms.
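A minimal numerical sketch of the second property: draw a random two-hidden-layer ReLU network over ±1 bit strings (the encoding, widths and Gaussian initialization below are our illustrative choices, not the paper's exact setup) and count how many random bit flips are needed on average before the predicted class changes; the average should grow roughly linearly with n.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_relu_net(n, width=256):
    """Random two-hidden-layer ReLU binary classifier on n-bit strings."""
    W1 = rng.normal(0.0, 1.0 / np.sqrt(n), (width, n))
    W2 = rng.normal(0.0, 1.0 / np.sqrt(width), (width, width))
    w3 = rng.normal(0.0, 1.0 / np.sqrt(width), width)
    return lambda x: np.sign(w3 @ np.maximum(W2 @ np.maximum(W1 @ x, 0.0), 0.0))

def mean_flips_to_change(n, trials=200):
    """Average number of distinct random bit flips before the
    classification of a random +/-1 string changes."""
    f, counts = random_relu_net(n), []
    for _ in range(trials):
        x = rng.choice([-1.0, 1.0], size=n)
        y0 = f(x)
        for k, i in enumerate(rng.permutation(n), start=1):
            x[i] = -x[i]
            if f(x) != y0:       # classification changed after k flips
                counts.append(k)
                break
    return np.mean(counts)

for n in (16, 32, 64):
    print(n, mean_flips_to_change(n))
```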
Efficient Neural Network Robustness Certification with General Activation Functions
Finding the minimum distortion of adversarial examples, and thus certifying robustness of neural network classifiers for given data points, is known to be a challenging problem. Nevertheless, it has recently been shown to be possible to give a non-trivial certified lower bound on the minimum adversarial distortion, and some progress has been made in this direction by exploiting the piece-wise linear nature of ReLU activations. However, generic robustness certification for general activation functions remains largely unexplored. To address this issue, in this paper we introduce CROWN, a general framework to certify robustness of neural networks with general activation functions for given input data points. The novelty in our algorithm lies in bounding a given activation function with linear and quadratic functions, allowing it to tackle general activation functions including but not limited to four popular choices: ReLU, tanh, sigmoid and arctan. In addition, we facilitate the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation. Experimental results show that CROWN on ReLU networks can notably improve the certified lower bounds compared to the current state-of-the-art algorithm Fast-Lin, while having comparable computational efficiency. Furthermore, CROWN also demonstrates its effectiveness and flexibility on networks with general activation functions, including tanh, sigmoid and arctan.

Comment: Accepted by NIPS 2018. Huan Zhang and Tsui-Wei Weng contributed equally.
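A minimal sketch of the bounding idea for the ReLU case (the function name is ours; CROWN additionally handles smooth activations such as tanh and sigmoid via tangent lines and chords, and propagates these linear bounds layer by layer): given pre-activation bounds l <= x <= u for a neuron, return slopes and intercepts of sound linear lower and upper bounds on relu(x) over [l, u], with an adaptive lower-bound slope.

```python
def relu_linear_bounds(l, u):
    """Linear relaxation a_L*x + b_L <= relu(x) <= a_U*x + b_U on [l, u].
    Returns (a_L, b_L, a_U, b_U)."""
    if u <= 0.0:                   # neuron always inactive: relu(x) = 0
        return 0.0, 0.0, 0.0, 0.0
    if l >= 0.0:                   # neuron always active: relu(x) = x
        return 1.0, 0.0, 1.0, 0.0
    s = u / (u - l)                # chord through (l, 0) and (u, u)
    a_L = 1.0 if u >= -l else 0.0  # adaptive choice: smaller relaxation area
    return a_L, 0.0, s, -s * l
```

Propagating such per-neuron linear bounds backwards through the network yields a certified lower bound on the distortion needed to change the prediction.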
CNN-Cert: An Efficient Framework for Certifying Robustness of Convolutional Neural Networks
Verifying the robustness of neural network classifiers has attracted great interest and attention due to the success of deep neural networks and their unexpected vulnerability to adversarial perturbations. Although finding the minimum adversarial distortion of neural networks (with ReLU activations) has been shown to be an NP-complete problem, obtaining a non-trivial lower bound on the minimum distortion as a provable robustness guarantee is possible. However, most previous works focused only on simple fully-connected layers (multilayer perceptrons) and were limited to ReLU activations. This motivates us to propose a general and efficient framework, CNN-Cert, that is capable of certifying robustness on general convolutional neural networks. Our framework is general -- we can handle various architectures including convolutional layers, max-pooling layers, batch normalization layers and residual blocks, as well as general activation functions. Our approach is efficient -- by exploiting the special structure of convolutional layers, we achieve up to 17 and 11 times speed-up compared to the state-of-the-art certification algorithms (e.g. Fast-Lin, CROWN) and up to 366 times speed-up compared to the dual-LP approach, while our algorithm obtains similar or even better verification bounds. In addition, CNN-Cert generalizes state-of-the-art algorithms such as Fast-Lin and CROWN. We demonstrate through extensive experiments that our method outperforms state-of-the-art lower-bound-based certification algorithms in terms of both bound quality and speed.

Comment: Accepted by AAAI 2019.
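To illustrate what exploiting convolutional structure can look like, here is a simple interval bound propagation through a single-channel 2-D convolution; this is a looser baseline than CNN-Cert's linear bounds, and all names below are ours. Splitting the kernel into its positive and negative parts lets each output bound pick the worst-case input bound per weight sign.

```python
import numpy as np

def conv2d(x, W):
    """Naive 'valid' 2-D cross-correlation, single channel."""
    kh, kw = W.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * W)
    return out

def conv_interval_bounds(x_lb, x_ub, W, b=0.0):
    """Propagate elementwise input bounds [x_lb, x_ub] through a conv layer."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    lb = conv2d(x_lb, W_pos) + conv2d(x_ub, W_neg) + b
    ub = conv2d(x_ub, W_pos) + conv2d(x_lb, W_neg) + b
    return lb, ub
```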