Stylized Adversarial Defense
Deep Convolutional Neural Networks (CNNs) can easily be fooled by subtle,
imperceptible changes to the input images. To address this vulnerability,
adversarial training creates perturbation patterns and includes them in the
training set to robustify the model. In contrast to existing adversarial
training methods that only use class-boundary information (e.g., using a cross
entropy loss), we propose to exploit additional information from the feature
space to craft stronger adversaries that are in turn used to learn a robust
model. Specifically, we use the style and content information of the target
sample from another class, alongside its class boundary information to create
adversarial perturbations. We apply our proposed multi-task objective in a
deeply supervised manner, extracting multi-scale feature knowledge to create
maximally separating adversaries. Subsequently, we propose a max-margin
adversarial training approach that minimizes the distance between the source image and its adversary and maximizes the distance between the adversary and the target image. Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses, generalizes well to naturally occurring corruptions and data distribution shifts, and retains the model's accuracy on clean examples. Comment: Code is available at https://github.com/Muzammal-Naseer/SA
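For readers who want the gist of the max-margin objective in code, here is a minimal PyTorch sketch: pull the adversary's features toward its source image and push them away from the target-class image. The function name `max_margin_loss`, the Euclidean pairwise distance, and the margin value are illustrative assumptions; the paper's full multi-task, deeply supervised objective is richer than this.

```python
import torch
import torch.nn.functional as F

def max_margin_loss(feat_src, feat_adv, feat_tgt, margin=1.0):
    # feat_*: feature embeddings of the source image, its adversary,
    # and the target-class image, extracted from the same network.
    d_src = F.pairwise_distance(feat_adv, feat_src)  # adversary <-> source
    d_tgt = F.pairwise_distance(feat_adv, feat_tgt)  # adversary <-> target
    # Triplet-style hinge: minimized when the adversary stays close to its
    # source and at least `margin` farther away from the target image.
    return F.relu(d_src - d_tgt + margin).mean()
```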
Interpreting Adversarially Trained Convolutional Neural Networks
We attempt to interpret how adversarially trained convolutional neural
networks (AT-CNNs) recognize objects. We design systematic approaches to
interpret AT-CNNs in both qualitative and quantitative ways and compare them
with normally trained models. Surprisingly, we find that adversarial training
alleviates the texture bias of standard CNNs when trained on object recognition
tasks, and helps CNNs learn a more shape-biased representation. We validate our
hypothesis from two aspects. First, we compare the salience maps of AT-CNNs and
standard CNNs on clean images and images under different transformations. The
comparison visually shows that the predictions of the two types of CNNs are
sensitive to dramatically different types of features. Second, to achieve
quantitative verification, we construct additional test datasets that destroy
either textures or shapes, such as style-transferred versions of clean data,
saturated images and patch-shuffled ones, and then evaluate the classification
accuracy of AT-CNNs and normal CNNs on these datasets. Our findings shed some
light on why AT-CNNs are more robust than normally trained ones and
contribute to a better understanding of adversarial training over CNNs from an
interpretation perspective. Comment: To appear in ICML 2019
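Of the shape-destroying transformations listed above, patch shuffling is the easiest to reproduce. A minimal PyTorch sketch, assuming square patches on a hypothetical k x k grid (the paper's exact patch configuration may differ):

```python
import torch

def patch_shuffle(img, k=4):
    # Split a CxHxW image into a k x k grid of patches and randomly permute
    # them: global shape is destroyed while local texture statistics survive.
    c, h, w = img.shape
    ph, pw = h // k, w // k
    img = img[:, :ph * k, :pw * k]                     # crop to an even grid
    patches = img.unfold(1, ph, ph).unfold(2, pw, pw)  # C x k x k x ph x pw
    patches = patches.reshape(c, k * k, ph, pw)
    patches = patches[:, torch.randperm(k * k)]        # shuffle patch order
    patches = patches.reshape(c, k, k, ph, pw)
    return patches.permute(0, 1, 3, 2, 4).reshape(c, ph * k, pw * k)
```

A texture-biased model should survive this transformation far better than a shape-biased one, which is the basis of the quantitative comparison described above.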
Improving the Transferability of Adversarial Examples with Arbitrary Style Transfer
Deep neural networks are vulnerable to adversarial examples crafted by
applying human-imperceptible perturbations on clean inputs. Although many
attack methods can achieve high success rates in the white-box setting, they
exhibit weak transferability in the black-box setting. Recently, various
methods have been proposed to improve adversarial transferability, among which
input transformation is one of the most effective. In this work, we
notice that existing input transformation-based works mainly adopt the
transformed data in the same domain for augmentation. Inspired by domain
generalization, we aim to further improve the transferability using the data
augmented from different domains. Specifically, a style transfer network can
alter the distribution of low-level visual features in an image while
preserving semantic content for humans. Hence, we propose a novel attack method
named Style Transfer Method (STM) that utilizes an arbitrary style transfer network to transform images into different domains. To prevent the stylized images from carrying semantic information that is inconsistent for the classification network, we fine-tune the style transfer network and mix the stylized images, perturbed with random noise, with the original images, which maintains semantic consistency and boosts input diversity. Extensive experimental results on the ImageNet-compatible dataset show that our proposed method significantly improves adversarial transferability against both normally trained and adversarially trained models compared with state-of-the-art input transformation-based attacks. Code is available at: https://github.com/Zhijin-Ge/STM. Comment: 10 pages, 2 figures, accepted by the 31st ACM International Conference on Multimedia (MM '23)
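As a rough illustration of the mixing step described above, the following PyTorch sketch blends each stylized copy, perturbed with random noise, back into the original batch. The names `style_net`, `m`, `gamma`, and `eps_noise` are hypothetical stand-ins; the paper's fine-tuned network and mixing schedule may differ.

```python
import torch

def stm_augment(x, style_net, m=20, gamma=0.5, eps_noise=16 / 255):
    # x: batch of clean images in [0, 1]; style_net: a stand-in for the
    # fine-tuned arbitrary style transfer network.
    augmented = []
    for _ in range(m):
        stylized = style_net(x)                               # different domain
        noise = torch.empty_like(x).uniform_(-eps_noise, eps_noise)
        # Mix the noisy stylized copy with the original to keep semantics
        # while boosting input diversity.
        mixed = gamma * x + (1 - gamma) * (stylized + noise)
        augmented.append(mixed.clamp(0, 1))
    # An attack would average gradients over all m augmented copies.
    return augmented
```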
Financial Development in Adversarial and Inquisitorial Legal Systems
This paper analyzes how the adversarial and inquisitorial evidence-collection procedures affect financial development. In investigating the true returns of insolvent entrepreneurs, the adversarial procedure relies on lawyers whereas the inquisitorial procedure relies on judges. Investors are willing to lend more in adversarial than in inquisitorial legal systems if they are richer than entrepreneurs or if lawyers are more productive than judges. Manipulation of evidence by lawyers has an ambiguous impact on finance. The empirical evidence shows that a more inquisitorial procedure is associated with less developed financial markets. Keywords: adversarial; inquisitorial; financial development; legal origins
Intriguing generalization and simplicity of adversarially trained neural networks
Adversarial training has been the topic of dozens of studies and a leading
method for defending against adversarial attacks. Yet, it remains unknown (a)
how adversarially-trained classifiers (a.k.a. "robust" classifiers) generalize
to new types of out-of-distribution examples; and (b) what hidden
representations were learned by robust networks. In this paper, we perform a
thorough, systematic study to answer these two questions on AlexNet, GoogLeNet,
and ResNet-50 trained on ImageNet. While robust models often perform on par
with or worse than standard models on unseen distorted, texture-preserving images (e.g.
blurred), they are consistently more accurate on texture-less images (i.e.
silhouettes and stylized). That is, robust models rely heavily on shapes, in
stark contrast to the strong texture bias in standard ImageNet classifiers
(Geirhos et al. 2018). Remarkably, adversarial training causes three
significant shifts in the functions of hidden neurons. That is, each
convolutional neuron often changes to (1) detect pixel-wise smoother patterns;
(2) detect more low-level features, i.e. textures and colors (instead of
objects); and (3) be simpler in terms of complexity, i.e. detect more limited
sets of concepts.
Enhance the Visual Representation via Discrete Adversarial Training
Adversarial Training (AT), commonly accepted as one of the most effective approaches for defending against adversarial examples, can largely harm standard performance and thus has limited usefulness in industrial-scale production and applications. Surprisingly, the opposite holds in Natural Language Processing (NLP) tasks, where AT can even benefit generalization. We observe that the merit of AT in NLP tasks could derive from the discrete and symbolic input space. To borrow this advantage from NLP-style AT, we propose Discrete Adversarial Training (DAT). DAT leverages VQGAN to
reform the image data to discrete text-like inputs, i.e. visual words. Then it
minimizes the maximal risk on such discrete images with symbolic adversarial
perturbations. We further give an explanation from the perspective of
distribution to demonstrate the effectiveness of DAT. As a plug-and-play
technique for enhancing the visual representation, DAT achieves significant
improvement on multiple tasks including image classification, object detection
and self-supervised learning. In particular, a model pre-trained with Masked Auto-Encoding (MAE) and fine-tuned by our DAT without extra data achieves 31.40 mCE on ImageNet-C and 32.77% top-1 accuracy on Stylized-ImageNet, setting a new state of the art. The code will be available at https://github.com/alibaba/easyrobust. Comment: Accepted to NeurIPS 2022
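Read as pseudocode, one inner/outer step of DAT might look like the sketch below. The `vqgan.encode`/`vqgan.decode` calls, the FGSM-style signed-gradient update, and `eps` are assumptions standing in for the paper's symbolic perturbation in the discrete latent space.

```python
import torch

def dat_step(model, vqgan, x, y, loss_fn, eps=0.1):
    # Encode images into VQGAN latents ("visual words"); assumed API.
    with torch.no_grad():
        z = vqgan.encode(x)
    z_adv = z.detach().clone().requires_grad_(True)
    # Inner maximization: one signed-gradient step in the latent space
    # (FGSM-style, for illustration only; the paper's attack may differ).
    loss = loss_fn(model(vqgan.decode(z_adv)), y)
    grad, = torch.autograd.grad(loss, z_adv)
    with torch.no_grad():
        x_adv = vqgan.decode(z_adv + eps * grad.sign()).clamp(0, 1)
    # Outer minimization: train the model on the decoded adversarial images.
    return loss_fn(model(x_adv), y)
```

A training loop would call `dat_step(...)`, backpropagate the returned loss, and step the optimizer as usual.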