AdvDiff: Generating Unrestricted Adversarial Examples using Diffusion Models

Dai, Xuelong; Liang, Kaisheng; Xiao, Bin


Authors: Xuelong Dai, Kaisheng Liang, Bin Xiao
Publication date: 23 July 2023

Abstract

Unrestricted adversarial attacks present a serious threat to deep learning models and adversarial defense techniques. They pose severe security problems for deep learning applications because they can effectively bypass defense mechanisms. However, previous attack methods often utilize Generative Adversarial Networks (GANs), which are not theoretically provable and thus generate unrealistic examples when adversarial objectives are incorporated, especially for large-scale datasets like ImageNet. In this paper, we propose a new method, called AdvDiff, to generate unrestricted adversarial examples with diffusion models. We design two novel adversarial guidance techniques to conduct adversarial sampling in the reverse generation process of diffusion models. These two techniques generate high-quality, realistic adversarial examples effectively and stably by integrating gradients of the target classifier in an interpretable way. Experimental results on the MNIST and ImageNet datasets demonstrate that AdvDiff is effective at generating unrestricted adversarial examples and outperforms GAN-based methods in terms of attack performance and generation quality.

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2307.12499

Last time updated on 28/07/2023