SPLASH: Learnable Activation Functions for Improving Accuracy and Adversarial Robustness
We introduce SPLASH units, a class of learnable activation functions shown to
simultaneously improve the accuracy of deep neural networks while also
improving their robustness to adversarial attacks. SPLASH units have a
simple parameterization yet maintain the ability to approximate a wide range of
non-linear functions. SPLASH units: 1) are continuous; 2) are grounded (f(0) = 0);
3) use symmetric hinges; and 4) have hinge locations that are derived
directly from the data (i.e., no learning required). Compared to nine other
learned and fixed activation functions, including ReLU and its variants, SPLASH
units show superior performance across three datasets (MNIST, CIFAR-10, and
CIFAR-100) and four architectures (LeNet5, All-CNN, ResNet-20, and
Network-in-Network). Furthermore, we show that SPLASH units significantly
increase the robustness of deep neural networks to adversarial attacks. Our
experiments on both black-box and open-box adversarial attacks show that
commonly-used architectures, namely LeNet5, All-CNN, ResNet-20, and
Network-in-Network, can be up to 31% more robust to adversarial attacks by
simply using SPLASH units instead of ReLUs.
SPLASH: Learnable Activation Functions for Improving Accuracy and Adversarial Robustness
We introduce SPLASH units, a class of learnable activation functions shown to simultaneously improve the accuracy of deep neural networks while also improving their robustness to adversarial attacks. SPLASH units have a simple parameterization yet maintain the ability to approximate a wide range of non-linear functions. SPLASH units: (1) are continuous; (2) are grounded (f(0) = 0); (3) use symmetric hinges; and (4) have their hinges placed at fixed locations that are derived from the data (i.e., no learning required). Compared to nine other learned and fixed activation functions, including ReLU and its variants, SPLASH units show superior performance across three datasets (MNIST, CIFAR-10, and CIFAR-100) and four architectures (LeNet5, All-CNN, ResNet-20, and Network-in-Network). Furthermore, we show that SPLASH units significantly increase the robustness of deep neural networks to adversarial attacks. Our experiments on both black-box and white-box adversarial attacks show that commonly-used architectures, namely LeNet5, All-CNN, Network-in-Network, and ResNet-20, can be up to 31% more robust to adversarial attacks by simply using SPLASH units instead of ReLUs. Finally, we show the benefits of using SPLASH activation functions in bigger architectures designed for non-trivial datasets such as ImageNet.
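The four properties listed in the abstract (continuous, grounded, symmetric hinges, fixed hinge locations) can be illustrated with a small piecewise-linear function. The sketch below is only one way to satisfy those properties and is not taken from the paper: the name `splash_like`, the argument names, and the choice of a sum of shifted ReLU-style terms are all assumptions made for illustration. Each offset `b` in `hinges` places one hinge at `+b` and one at `-b`; the slopes `a_pos` and `a_neg` stand in for the learnable parameters.

```python
def splash_like(x, hinges, a_pos, a_neg):
    """Piecewise-linear activation with hinges at +b and -b for each b in hinges.

    Hypothetical illustration: hinges are fixed, non-negative offsets;
    a_pos and a_neg are the slopes that would be learned.
    """
    if any(b < 0 for b in hinges):
        raise ValueError("hinge offsets must be non-negative")
    total = 0.0
    for b, ap, an in zip(hinges, a_pos, a_neg):
        # Each term is zero at x = 0 because b >= 0, so the sum is grounded.
        total += ap * max(0.0, x - b) + an * max(0.0, -x - b)
    return total


# With one hinge at 0, a_pos = 1 and a_neg = 0, the function reduces to ReLU.
relu_3 = splash_like(3.0, [0.0], [1.0], [0.0])
relu_neg = splash_like(-2.0, [0.0], [1.0], [0.0])
```

Because every term is a continuous function that vanishes at the origin, any choice of slopes keeps the function continuous with f(0) = 0, which is the "grounded" property the abstract describes.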
(Compress and Restore)^N: a Robust Defense Against Adversarial Attacks on Image Classification
Modern image classification approaches often rely on deep neural networks, which have shown pronounced weakness to
adversarial examples: images corrupted with specifically designed yet imperceptible noise that causes the network to misclassify.
In this paper, we propose a conceptually simple yet robust solution to tackle adversarial attacks on image classification. Our
defense works by first applying a JPEG compression with a random quality factor; compression artifacts are subsequently
removed by means of a generative model (AR-GAN). The process can be iterated ensuring the image is not degraded and hence
the classification not compromised. We train different AR-GANs for different compression factors, so that we can change the AR-GAN
parameters dynamically at each iteration depending on the current compression, making the gradient approximation difficult.
We evaluate our defense against three white-box and two black-box attacks, with a particular focus on the state-of-the-art
BPDA attack. Our method does not require any adversarial training, and is independent of both the classifier and the attack.
Experiments demonstrate that dynamically changing the AR-GAN parameters is of fundamental importance to obtain significant
robustness.
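The control flow described above (compress with a random quality factor, restore with the model trained for that factor, repeat) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `compress_and_restore`, `toy_compress`, and `identity_restore` are hypothetical names, the toy quantizer merely stands in for JPEG compression, and the identity function stands in for a trained AR-GAN.

```python
import random


def compress_and_restore(image, compress, restorers, n_iters, rng=None):
    """Apply N rounds of (random-quality compression, matching restoration).

    restorers maps each quality factor to the restoration model trained
    for it, so the model in use changes from round to round.
    """
    rng = rng or random.Random()
    qualities = sorted(restorers)
    for _ in range(n_iters):
        q = rng.choice(qualities)           # random quality factor
        image = restorers[q](compress(image, q))
    return image


def toy_compress(pixels, quality):
    # Stand-in for JPEG (assumption): coarser quantization at lower quality.
    step = max(1, (100 - quality) // 10)
    return [(p // step) * step for p in pixels]


def identity_restore(pixels):
    # Stand-in for an AR-GAN (assumption): returns its input unchanged.
    return pixels


restored = compress_and_restore(
    [0, 17, 255],
    toy_compress,
    {20: identity_restore, 50: identity_restore, 90: identity_restore},
    n_iters=3,
    rng=random.Random(0),
)
```

Keeping one restorer per quality factor in a dictionary mirrors the abstract's point that the restoration parameters change with the current compression level at each iteration.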
Adversarial Defense by Restricting the Hidden Space of Deep Neural Networks
Deep neural networks are vulnerable to adversarial attacks, which can fool
them by adding minuscule perturbations to the input images. The robustness of
existing defenses suffers greatly under white-box attack settings, where an
adversary has full knowledge about the network and can iterate several times to
find strong perturbations. We observe that the main reason for the existence of
such perturbations is the close proximity of different class samples in the
learned feature space. This allows model decisions to be totally changed by
adding an imperceptible perturbation in the inputs. To counter this, we propose
to class-wise disentangle the intermediate feature representations of deep
networks. Specifically, we force the features for each class to lie inside a
convex polytope that is maximally separated from the polytopes of other
classes. In this manner, the network is forced to learn distinct and distant
decision regions for each class. We observe that this simple constraint on the
features greatly enhances the robustness of learned models, even against the
strongest white-box attacks, without degrading the classification performance
on clean images. We report extensive evaluations in both black-box and
white-box attack scenarios and show significant gains in comparison to
state-of-the-art defenses.
Comment: Accepted at ICCV 201
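The abstract does not state the training objective, so the following is only a rough sketch of the general idea of pulling each feature toward its own class region while pushing it away from the others. It assumes each class is summarised by a single center vector, which is a simplification of the convex-polytope constraint; `sq_dist`, `separation_loss`, and the example centers are hypothetical and are not taken from the paper.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def separation_loss(feature, label, centers):
    """Lower when `feature` is near its own class center and far from others.

    Assumption: one center per class; requires at least two classes.
    """
    pull = sq_dist(feature, centers[label])
    others = [sq_dist(feature, c) for k, c in enumerate(centers) if k != label]
    push = sum(others) / len(others)
    return pull - push


# Two toy class centers in a 2-D feature space.
centers = [[0.0, 0.0], [10.0, 0.0]]
well_placed = separation_loss([0.0, 0.0], 0, centers)
misplaced = separation_loss([9.0, 0.0], 0, centers)
```

A feature sitting on its own center scores lower than one that has drifted toward another class, which is the behaviour a class-wise separation constraint is meant to reward.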