
    Two approaches to defend against adversarial examples: Attention-based and Certificate-based

    In this paper, we present two novel approaches to defending against adversarial examples in neural networks: an attention-based defense against pixel-based attacks and a certificate-based defense against spatially transformed attacks. We discuss the vulnerability of neural networks to adversarial examples, which significantly hinders their application in security-critical domains. We detail several popular pixel-based methods of attacking a model. We then walk through current defense methods and note that they can often be circumvented by adaptive adversaries. For the first contribution, we take a completely different route by leveraging the definition of adversarial inputs: while deceptive to deep neural networks, they are barely discernible to human vision. Building upon recent advances in interpretable models, we construct a new detection framework that contrasts an input's interpretation against its classification. We validate the efficacy of this framework through extensive experiments using benchmark datasets and attacks. We believe this work opens a new direction for designing adversarial input detection methods. For the second contribution, we discuss a completely different approach to generating adversarial examples, based on spatial transformation of an input image. We then extend a recently proposed certificate framework to this setting and show that the certificate can improve the resilience of a network against adversarial spatial transformations.
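    As a rough illustration of the detection idea, the sketch below contrasts a model's prediction with a gradient-based interpretation of the same input: if masking the input down to its most salient regions changes the prediction, the input is flagged. This is a minimal PyTorch sketch under our own assumptions; the gradient-saliency interpreter, the quantile-based mask, and the function names are illustrative and not taken from the paper.

```python
import torch

def saliency_map(model, x, target_class):
    """Gradient-of-logit saliency: a simple stand-in interpreter."""
    x = x.clone().requires_grad_(True)
    model(x)[0, target_class].backward()
    return x.grad.abs().amax(dim=1)[0]  # (H, W), max over channels

def flag_adversarial(model, x, keep_quantile=0.8):
    """Flag x (shape (1, C, H, W)) when its interpretation disagrees
    with its classification: keep only the most salient pixels and
    check whether the prediction survives. The masking scheme and the
    quantile are illustrative assumptions, not the paper's procedure."""
    with torch.no_grad():
        pred = model(x).argmax(dim=1).item()
    sal = saliency_map(model, x, pred)
    mask = (sal > sal.quantile(keep_quantile)).float()[None, None]  # (1,1,H,W)
    with torch.no_grad():
        masked_pred = model(x * mask).argmax(dim=1).item()
    return masked_pred != pred  # interpretation/classification mismatch
```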

    Interpreting Adversarially Trained Convolutional Neural Networks

    We attempt to interpret how adversarially trained convolutional neural networks (AT-CNNs) recognize objects. We design systematic approaches to interpret AT-CNNs in both qualitative and quantitative ways and compare them with normally trained models. Surprisingly, we find that adversarial training alleviates the texture bias of standard CNNs trained on object recognition tasks and helps CNNs learn a more shape-biased representation. We validate our hypothesis from two aspects. First, we compare the salience maps of AT-CNNs and standard CNNs on clean images and on images under different transformations; the comparison visually shows that the predictions of the two types of CNNs are sensitive to dramatically different types of features. Second, for quantitative verification, we construct additional test datasets that destroy either textures or shapes, such as style-transferred versions of clean data, saturated images, and patch-shuffled ones, and then evaluate the classification accuracy of AT-CNNs and normal CNNs on these datasets. Our findings shed some light on why AT-CNNs are more robust than normally trained ones and contribute to a better understanding of adversarial training over CNNs from an interpretation perspective.
    Comment: To appear in ICML 2019.
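    The two texture/shape probes named in the abstract, saturation and patch shuffling, are straightforward to sketch. Below is a minimal PyTorch version: the saturation mapping is one common sign-preserving power form and the grid size k is an assumption, so the paper's exact settings may differ.

```python
import torch

def saturate(x, p=8.0):
    """Push pixel values toward {0, 1}, flattening texture detail while
    keeping shape cues. x is an image tensor in [0, 1]; p >= 2 controls
    strength (p = 2 is the identity)."""
    x = 2.0 * x - 1.0                       # map to [-1, 1]
    x = torch.sign(x) * x.abs() ** (2.0 / p)
    return (x + 1.0) / 2.0                  # back to [0, 1]

def patch_shuffle(x, k=4):
    """Split a (C, H, W) image into a k x k grid and shuffle the patches,
    destroying global shape while preserving local texture statistics."""
    c, h, w = x.shape
    ph, pw = h // k, w // k
    patches = (x[:, :ph * k, :pw * k]
               .unfold(1, ph, ph).unfold(2, pw, pw)  # (C, k, k, ph, pw)
               .reshape(c, k * k, ph, pw))
    patches = patches[:, torch.randperm(k * k)]      # shuffle grid cells
    rows = [torch.cat(list(patches[:, i * k:(i + 1) * k].unbind(1)), dim=2)
            for i in range(k)]
    return torch.cat(rows, dim=1)
```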

    Channel-Recurrent Autoencoding for Image Modeling

    Despite recent successes in synthesizing faces and bedrooms, existing generative models struggle to capture more complex image types, potentially due to the oversimplification of their latent space constructions. To tackle this issue, building on Variational Autoencoders (VAEs), we integrate recurrent connections across channels into both the inference and generation steps, allowing high-level features to be captured in a global-to-local, coarse-to-fine manner. Combined with an adversarial loss, our channel-recurrent VAE-GAN (crVAE-GAN) outperforms VAE-GAN in generating a diverse spectrum of high-resolution images while maintaining the same level of computational efficiency. Our model produces interpretable and expressive latent representations that benefit downstream tasks such as image completion. Moreover, we propose two novel regularizations to enhance training: a KL objective weighting scheme over time steps and mutual information maximization between transformed latent variables and the outputs.
    Comment: Code: https://github.com/WendyShang/crVAE. Supplementary materials: http://www-personal.umich.edu/~shangw/wacv18_supplementary_material.pdf
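    To make the channel-recurrent construction concrete, here is a minimal PyTorch sketch of an inference head that splits encoder features into channel groups and runs an LSTM across them, so later latent groups are conditioned on earlier, more global ones. The layer sizes and parameterization are illustrative assumptions; the linked repository has the authors' actual architecture.

```python
import torch
import torch.nn as nn

class ChannelRecurrentLatent(nn.Module):
    """Sketch of the channel-recurrent idea: rather than producing the
    whole latent code in one shot, split the feature channels into T
    groups and run an LSTM across them, so later groups (finer detail)
    are conditioned on earlier ones (global structure)."""

    def __init__(self, feat_dim=1024, groups=8, z_per_group=16):
        super().__init__()
        assert feat_dim % groups == 0
        self.groups = groups
        self.lstm = nn.LSTM(feat_dim // groups, 256, batch_first=True)
        self.to_mu = nn.Linear(256, z_per_group)
        self.to_logvar = nn.Linear(256, z_per_group)

    def forward(self, feat):
        # feat: (B, feat_dim) encoder features, viewed as channel groups
        b = feat.size(0)
        chunks = feat.view(b, self.groups, -1)  # (B, T, feat_dim // T)
        h, _ = self.lstm(chunks)                # (B, T, 256)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return z.flatten(1), mu, logvar         # (B, T * z_per_group)
```

    The per-group (mu, logvar) pairs also make it straightforward to weight the KL term differently at each time step, in the spirit of the KL weighting scheme the abstract mentions.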