Geometric robustness of deep networks: analysis and improvement
Deep convolutional neural networks have been shown to be vulnerable to
arbitrary geometric transformations. However, there is no systematic method to
measure the invariance properties of deep networks to such transformations. We
propose ManiFool as a simple yet scalable algorithm to measure the invariance
of deep networks. In particular, our algorithm measures the robustness of deep
networks to geometric transformations in a worst-case regime as they can be
problematic for sensitive applications. Our extensive experimental results show
that ManiFool can be used to measure the invariance of fairly complex networks
on high-dimensional datasets, and that these measurements can help analyze the
reasons behind a network's invariance. Furthermore, we build on ManiFool to
propose a new adversarial training scheme, and we show its effectiveness in
improving the invariance properties of deep neural networks.
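As an illustration of the worst-case idea, the toy sketch below measures the smallest planar rotation that flips a simple classifier's decision, so that smaller values indicate weaker invariance at that input. It is a hypothetical stand-in, not the ManiFool algorithm itself (which operates on transformation manifolds of deep networks); `classify`, `rotate`, and `min_rotation_to_flip` are invented names.

```python
import numpy as np

def classify(point):
    # Toy linear classifier on 2-D points: label by the sign of the x-coordinate.
    return int(point[0] >= 0.0)

def rotate(point, theta):
    # Apply a planar rotation, a simple one-parameter geometric transformation.
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * point[0] - s * point[1],
                     s * point[0] + c * point[1]])

def min_rotation_to_flip(point, step=0.01, max_angle=np.pi):
    # Worst-case-style measurement: grid-search for the smallest rotation
    # angle (in either direction) that changes the classifier's prediction.
    label = classify(point)
    theta = step
    while theta <= max_angle:
        for signed in (theta, -theta):
            if classify(rotate(point, signed)) != label:
                return signed
        theta += step
    return None  # no label change found within the search range

# For the point (1, 0), the decision flips once the x-coordinate turns
# negative, i.e. just past a quarter turn.
score = min_rotation_to_flip(np.array([1.0, 0.0]))
```

The grid search stands in for the manifold optimization of the real algorithm; the key output is the same kind of quantity, a worst-case transformation magnitude per input.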
How to choose your best allies for a transferable attack?
The transferability of adversarial examples is a key issue in the security of
deep neural networks. The possibility of an adversarial example crafted for a
source model fooling another targeted model makes the threat of adversarial
attacks more realistic. Measuring transferability is a crucial problem, but the
Attack Success Rate alone does not provide a sound evaluation. This paper
proposes a new methodology for evaluating transferability by putting distortion
in a central position. This new tool shows that transferable attacks may
perform far worse than a black-box attack if the attacker randomly picks the
source model. To address this issue, we propose a new selection mechanism,
called FiT, which aims at choosing the best source model with only a few
preliminary queries to the target. Our experimental results show that FiT is
highly effective at selecting the best source model for multiple scenarios such
as single-model attacks, ensemble-model attacks and multiple attacks (code
available at: https://github.com/t-maho/transferability_measure_fit).
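The abstract does not spell out FiT's selection criterion, but the general idea of choosing a source model with only a few preliminary queries can be sketched with a toy heuristic: probe the target on a handful of inputs and pick the candidate whose predictions agree most often. Everything below (the linear "models" and `select_source`) is an invented illustration, not the FiT method.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_linear_model(w):
    # Each "model" is a tiny linear classifier over 5-D inputs.
    return lambda x: int(x @ w >= 0.0)

# Hypothetical target and candidate source models; "close" has weights
# nearly aligned with the target's, "far" nearly opposite.
target = make_linear_model(np.array([1.0, 0.5, -0.2, 0.1, 0.0]))
candidates = {
    "close":  make_linear_model(np.array([0.9, 0.6, -0.1, 0.1, 0.05])),
    "far":    make_linear_model(np.array([-1.0, 0.2, 0.8, -0.5, 0.3])),
    "medium": make_linear_model(np.array([0.5, -0.5, 0.5, 0.5, -0.5])),
}

def select_source(target_model, candidate_models, n_queries=200):
    # Few-query selection heuristic: query the target on random probes and
    # rank candidates by how often their predictions agree with the target's.
    probes = rng.standard_normal((n_queries, 5))
    target_labels = [target_model(x) for x in probes]
    def agreement(model):
        return sum(model(x) == y for x, y in zip(probes, target_labels))
    return max(candidate_models, key=lambda name: agreement(candidate_models[name]))

best = select_source(target, candidates)
```

The sketch captures only the query-budget trade-off the abstract highlights: a small number of probes suffices to separate a well-aligned surrogate from poorly aligned ones.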
The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training
Although current deep learning techniques have yielded superior performance
on various computer vision tasks, they remain vulnerable to adversarial
examples. Adversarial training and its variants have been shown to be the most
effective approaches to defend against adversarial examples. These methods
usually regularize the difference between output probabilities for an
adversarial example and its corresponding natural example. However, this
regularization may have a negative impact if the model misclassifies a natural
example. To circumvent
this issue, we propose a novel adversarial training scheme that encourages the
model to produce similar outputs for an adversarial example and its ``inverse
adversarial'' counterpart. These samples are generated to maximize the
likelihood in the neighborhood of natural examples. Extensive experiments on
various vision datasets and architectures demonstrate that our training method
achieves state-of-the-art robustness as well as natural accuracy. Furthermore,
using a universal version of inverse adversarial examples, we improve the
performance of single-step adversarial training techniques at a low
computational cost.
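A minimal sketch of the "inverse adversarial" idea, under the assumption that such a sample is produced by a signed-gradient step that *increases* the likelihood of the true label (the opposite direction of an FGSM attack) within a small neighborhood of the natural example. The logistic model and the `inverse_adversarial` helper are hypothetical, not the paper's construction.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy logistic-regression "network" with fixed weights.
w = np.array([1.0, -2.0, 0.5])

def likelihood(x, y):
    # Probability the model assigns to the true label y in {0, 1}.
    p = sigmoid(x @ w)
    return p if y == 1 else 1.0 - p

def inverse_adversarial(x, y, eps=0.1):
    # Opposite of an FGSM attack: step in the signed-gradient direction that
    # raises the likelihood of the true label, staying in an eps-ball (L-inf)
    # around x. For a linear logit, the gradient of log p(y|x) w.r.t. x is
    # proportional to +w when y = 1 and -w when y = 0.
    grad = w if y == 1 else -w
    return x + eps * np.sign(grad)

x = np.array([0.2, 0.3, -0.1])
y = 1
x_inv = inverse_adversarial(x, y)
```

By construction, the inverse adversary stays within the `eps` neighborhood while the model's confidence in the true label goes up, which is the property the training scheme then regularizes the true adversarial example against.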