
    Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization

    Solving for adversarial examples with projected gradient descent has been demonstrated to be highly effective in fooling neural network based classifiers. In the black-box setting, however, the attacker is limited to query access to the network, and solving for a successful adversarial example becomes much more difficult. To this end, recent methods aim to estimate the true gradient signal from input queries, but at the cost of an excessive number of queries. We propose an efficient discrete surrogate to the optimization problem which does not require estimating the gradient and consequently becomes free of first-order update hyperparameters to tune. Our experiments on CIFAR-10 and ImageNet show state-of-the-art black-box attack performance with a significant reduction in the required queries compared to a number of recently proposed methods. The source code is available at https://github.com/snu-mllab/parsimonious-blackbox-attack.
    Comment: Accepted and to appear at ICML 2019
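    Below is a minimal sketch of a gradient-free, block-wise local search in the spirit of this discrete surrogate: perturbation signs are chosen per image block and flipped greedily whenever a query to the model improves the attack loss. The `model` wrapper, the margin-based loss, the block size, and the simple flip-one-block loop are illustrative assumptions, not the authors' exact lazy-greedy algorithm.

```python
import numpy as np

def query_loss(model, x, label):
    """Attack objective from one query: negative margin of the true class."""
    logits = model(x[None])[0]                     # one forward pass = one query
    margin = logits[label] - np.max(np.delete(logits, label))
    return -margin                                 # larger is better for the attacker

def block_local_search(model, x, label, eps=8/255, block=8, max_queries=10_000):
    """Greedy local search over per-block perturbation signs (no gradients)."""
    h, w, c = x.shape
    signs = np.ones((h // block, w // block, c))   # every block starts at +eps

    def perturb(s):
        noise = np.kron(s, np.ones((block, block, 1))) * eps
        return np.clip(x + noise, 0.0, 1.0)

    best = query_loss(model, perturb(signs), label)
    queries, improved = 1, True
    while improved and queries < max_queries:
        improved = False
        for idx in np.ndindex(*signs.shape):       # try flipping one block at a time
            signs[idx] *= -1
            loss = query_loss(model, perturb(signs), label)
            queries += 1
            if loss > best:
                best, improved = loss, True        # keep a flip that helps
            else:
                signs[idx] *= -1                   # otherwise revert it
            if queries >= max_queries:
                break
    return perturb(signs), queries
```

    Because the search operates directly over discrete sign patterns, there is no step size or other first-order hyperparameter to tune; the query budget is the only knob.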

    Robust Deep Learning Models Against Semantic-Preserving Adversarial Attack

    Deep learning models can be fooled by small $l_p$-norm adversarial perturbations and natural perturbations in terms of attributes. Although the robustness against each perturbation has been explored, it remains a challenge to address the robustness against joint perturbations effectively. In this paper, we study the robustness of deep learning models against joint perturbations by proposing a novel attack mechanism named Semantic-Preserving Adversarial (SPA) attack, which can then be used to enhance adversarial training. Specifically, we introduce an attribute manipulator to generate natural and human-comprehensible perturbations and a noise generator to generate diverse adversarial noises. Based on such combined noises, we optimize both the attribute value and the diversity variable to generate jointly-perturbed samples. For robust training, we adversarially train the deep learning model against the generated joint perturbations. Empirical results on four benchmarks show that the SPA attack causes a larger performance decline with small $l_\infty$ norm-ball constraints compared to existing approaches. Furthermore, our SPA-enhanced training outperforms existing defense methods against such joint perturbations.
    Comment: Paper accepted by the 2023 International Joint Conference on Neural Networks (IJCNN 2023)
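    As an illustration of the joint-perturbation idea, the sketch below jointly optimizes a semantic attribute value and an $l_\infty$-bounded additive noise tensor by sign-gradient ascent on the classification loss. The `attribute_manipulator` (a differentiable, attribute-conditioned image generator), the single scalar attribute per sample, and the plain ascent loop are hypothetical simplifications; the paper's noise generator and diversity variable are collapsed here into a directly optimized noise tensor.

```python
import torch
import torch.nn.functional as F

def spa_style_attack(model, attribute_manipulator, x, y,
                     eps=4/255, alpha=1/255, steps=10):
    """Jointly perturb a batch x with a semantic attribute and bounded noise."""
    attr = torch.zeros(x.size(0), 1, requires_grad=True)      # semantic attribute value
    noise = torch.zeros_like(x, requires_grad=True)            # additive adversarial noise

    for _ in range(steps):
        x_semantic = attribute_manipulator(x, attr)            # natural, human-comprehensible change
        x_adv = torch.clamp(x_semantic + noise, 0.0, 1.0)      # add bounded noise on top
        loss = F.cross_entropy(model(x_adv), y)                # maximize classification loss
        loss.backward()

        with torch.no_grad():
            attr += alpha * attr.grad.sign()                   # ascend on the attribute value
            noise += alpha * noise.grad.sign()                 # ascend on the noise
            noise.clamp_(-eps, eps)                            # enforce the l_inf norm-ball
        attr.grad.zero_()
        noise.grad.zero_()

    with torch.no_grad():
        return torch.clamp(attribute_manipulator(x, attr) + noise, 0.0, 1.0)
```

    For the SPA-enhanced training described above, the jointly perturbed samples returned by such a routine would simply be fed back as training inputs in a standard adversarial-training loop.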