1,843 research outputs found

    Stylized Adversarial Defense

    Full text link
    Deep Convolutional Neural Networks (CNNs) can easily be fooled by subtle, imperceptible changes to the input images. To address this vulnerability, adversarial training creates perturbation patterns and includes them in the training set to robustify the model. In contrast to existing adversarial training methods that only use class-boundary information (e.g., using a cross-entropy loss), we propose to exploit additional information from the feature space to craft stronger adversaries that are in turn used to learn a robust model. Specifically, we use the style and content information of the target sample from another class, alongside its class-boundary information, to create adversarial perturbations. We apply our proposed multi-task objective in a deeply supervised manner, extracting multi-scale feature knowledge to create maximally separating adversaries. Subsequently, we propose a max-margin adversarial training approach that minimizes the distance between the source image and its adversary and maximizes the distance between the adversary and the target image. Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses, generalizes well to naturally occurring corruptions and data distributional shifts, and retains the model's accuracy on clean examples.
    Comment: Code is available at https://github.com/Muzammal-Naseer/SA
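
    The max-margin objective described above can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch fragment, not the authors' released code: it assumes feature vectors already extracted from the source image, its adversary, and the target-class image, and the margin value is chosen purely for illustration.

        import torch
        import torch.nn.functional as F

        def max_margin_loss(feat_src, feat_adv, feat_tgt, margin=1.0):
            # Pull the adversary's features back toward its source image ...
            d_src = F.pairwise_distance(feat_adv, feat_src)
            # ... while pushing them away from the target-class image.
            d_tgt = F.pairwise_distance(feat_adv, feat_tgt)
            # Hinge formulation: the loss vanishes once the adversary is at
            # least `margin` closer to its source than to the target.
            return torch.clamp(d_src - d_tgt + margin, min=0).mean()

    In the deeply supervised setting the abstract mentions, such a term would presumably be summed over features taken at several depths of the network.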

    Interpreting Adversarially Trained Convolutional Neural Networks

    Full text link
    We attempt to interpret how adversarially trained convolutional neural networks (AT-CNNs) recognize objects. We design systematic approaches to interpret AT-CNNs in both qualitative and quantitative ways and compare them with normally trained models. Surprisingly, we find that adversarial training alleviates the texture bias of standard CNNs when trained on object recognition tasks and helps CNNs learn a more shape-biased representation. We validate our hypothesis from two aspects. First, we compare the salience maps of AT-CNNs and standard CNNs on clean images and images under different transformations; the comparison visually shows that the predictions of the two types of CNNs are sensitive to dramatically different types of features. Second, to achieve quantitative verification, we construct additional test datasets that destroy either textures or shapes, such as style-transferred versions of clean data, saturated images, and patch-shuffled ones, and then evaluate the classification accuracy of AT-CNNs and normal CNNs on these datasets. Our findings shed some light on why AT-CNNs are more robust than normally trained ones and contribute to a better understanding of adversarial training over CNNs from an interpretation perspective.
    Comment: To appear in ICML 2019
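
    To make the quantitative test concrete, here is a small, hypothetical sketch of the patch-shuffling transformation the abstract mentions: it destroys global shape while leaving local texture statistics intact. The grid size and tensor layout are illustrative assumptions, not the paper's exact setup.

        import torch

        def patch_shuffle(img, grid=4):
            # img: (C, H, W) tensor with H and W divisible by `grid`.
            c, h, w = img.shape
            ph, pw = h // grid, w // grid
            # Cut the image into a grid of non-overlapping patches.
            patches = img.unfold(1, ph, ph).unfold(2, pw, pw)   # (C, grid, grid, ph, pw)
            patches = patches.reshape(c, grid * grid, ph, pw)
            # Randomly permute patch positions: global shape is destroyed,
            # the texture inside each patch is untouched.
            patches = patches[:, torch.randperm(grid * grid)]
            patches = patches.reshape(c, grid, grid, ph, pw)
            # Stitch the permuted patches back into a single image.
            return patches.permute(0, 1, 3, 2, 4).reshape(c, h, w)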

    Improving the Transferability of Adversarial Examples with Arbitrary Style Transfer

    Full text link
    Deep neural networks are vulnerable to adversarial examples crafted by applying human-imperceptible perturbations to clean inputs. Although many attack methods can achieve high success rates in the white-box setting, they also exhibit weak transferability in the black-box setting. Recently, various methods have been proposed to improve adversarial transferability, among which input transformation is one of the most effective. In this work, we notice that existing input transformation-based works mainly adopt transformed data from the same domain for augmentation. Inspired by domain generalization, we aim to further improve transferability using data augmented from different domains. Specifically, a style transfer network can alter the distribution of low-level visual features in an image while preserving semantic content for humans. Hence, we propose a novel attack method named Style Transfer Method (STM) that utilizes a proposed arbitrary style transfer network to transform images into different domains. To avoid inconsistent semantic information in the stylized images for the classification network, we fine-tune the style transfer network and mix the generated images, with random noise added, with the original images to maintain semantic consistency and boost input diversity. Extensive experimental results on the ImageNet-compatible dataset show that our proposed method significantly improves adversarial transferability against both normally trained and adversarially trained models compared with state-of-the-art input transformation-based attacks. Code is available at: https://github.com/Zhijin-Ge/STM.
    Comment: 10 pages, 2 figures, accepted by the 31st ACM International Conference on Multimedia (MM '23)
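
    The mixing step described above is simple to sketch. The fragment below is a hypothetical illustration only: `stylizer` stands in for the fine-tuned arbitrary style transfer network, and the mixing weight `gamma` and noise scale `noise_std` are arbitrary placeholders rather than the paper's settings.

        import torch

        def stylized_augment(x, stylizer, gamma=0.5, noise_std=0.05):
            # x: batch of clean images in [0, 1], shape (N, C, H, W).
            # Re-render the batch with low-level features from another domain.
            x_style = stylizer(x)
            # Add small random noise to boost input diversity ...
            x_style = x_style + noise_std * torch.randn_like(x_style)
            # ... and mix with the originals so the semantic content the
            # classifier relies on is preserved.
            return (gamma * x + (1.0 - gamma) * x_style).clamp(0.0, 1.0)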

    Financial Development in Adversarial and Inquisitorial Legal Systems

    Get PDF
    This paper analyzes how the adversarial and inquisitorial evidence collection procedures affect financial development. In investigating the true returns of insolvent entrepreneurs, the adversarial procedure relies on lawyers whereas the inquisitorial procedure relies on judges. Investors are willing to lend more in adversarial than in inquisitorial legal systems if they are richer than entrepreneurs or if lawyers are more productive than judges. Manipulation of evidence by lawyers has an ambiguous impact on finance. The empirical evidence shows that a more inquisitorial procedure is associated with less developed financial markets.
    Keywords: adversarial; inquisitorial; financial development; legal origins

    Intriguing generalization and simplicity of adversarially trained neural networks

    Full text link
    Adversarial training has been the topic of dozens of studies and is a leading method for defending against adversarial attacks. Yet, it remains unknown (a) how adversarially trained classifiers (a.k.a. "robust" classifiers) generalize to new types of out-of-distribution examples; and (b) what hidden representations are learned by robust networks. In this paper, we perform a thorough, systematic study to answer these two questions on AlexNet, GoogLeNet, and ResNet-50 trained on ImageNet. While robust models often perform on par with or worse than standard models on unseen distorted, texture-preserving images (e.g., blurred ones), they are consistently more accurate on texture-less images (i.e., silhouettes and stylized images). That is, robust models rely heavily on shapes, in stark contrast to the strong texture bias of standard ImageNet classifiers (Geirhos et al. 2018). Remarkably, adversarial training causes three significant shifts in the functions of hidden neurons: each convolutional neuron often changes to (1) detect pixel-wise smoother patterns; (2) detect more lower-level features, i.e., textures and colors (instead of objects); and (3) be simpler in terms of complexity, i.e., detect more limited sets of concepts.

    Enhance the Visual Representation via Discrete Adversarial Training

    Full text link
    Adversarial Training (AT), which is commonly accepted as one of the most effective approaches for defending against adversarial examples, can largely harm standard performance and thus has limited usefulness in industrial-scale production and applications. Surprisingly, the opposite holds in Natural Language Processing (NLP) tasks, where AT can even benefit generalization. We observe that the merit of AT in NLP tasks may derive from the discrete and symbolic input space. To borrow this advantage for vision, we propose Discrete Adversarial Training (DAT). DAT leverages VQGAN to reformulate image data into discrete, text-like inputs, i.e., visual words. It then minimizes the maximal risk on such discrete images with symbolic adversarial perturbations. We further explain the effectiveness of DAT from a distributional perspective. As a plug-and-play technique for enhancing the visual representation, DAT achieves significant improvements on multiple tasks including image classification, object detection, and self-supervised learning. In particular, a model pre-trained with Masked Auto-Encoding (MAE) and fine-tuned with DAT, without extra data, achieves 31.40 mCE on ImageNet-C and 32.77% top-1 accuracy on Stylized-ImageNet, setting a new state of the art. The code will be available at https://github.com/alibaba/easyrobust.
    Comment: Accepted to NeurIPS 2022, https://github.com/alibaba/easyrobust
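
    A rough sketch of the min-max step may help fix the idea. The interface names (`vqgan.encode`, `vqgan.quantize`, `vqgan.decode`), the single sign-gradient update, and the step size are assumptions for illustration; the sketch also sidesteps the non-differentiable quantizer by taking the gradient on the un-quantized reconstruction, which is a simplification and not necessarily how the paper handles it.

        import torch

        def discrete_adversarial_step(model, vqgan, x, y, loss_fn, step_size=0.03):
            # Encode the images into the VQGAN latent space.
            z = vqgan.encode(x).detach().requires_grad_(True)
            # Differentiable surrogate: decode without quantization so the
            # loss has a gradient with respect to the latents.
            loss = loss_fn(model(vqgan.decode(z)), y)
            grad, = torch.autograd.grad(loss, z)
            # Move the latents in the risk-maximizing direction, then snap
            # back to the codebook so the adversary is a valid "visual word" image.
            z_adv = z + step_size * grad.sign()
            x_adv = vqgan.decode(vqgan.quantize(z_adv)).detach()
            # Outer step: minimize the risk on the symbolic adversarial example.
            return loss_fn(model(x_adv), y)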