A Survey on Transferability of Adversarial Examples across Deep Neural Networks
The emergence of Deep Neural Networks (DNNs) has revolutionized various
domains, enabling the resolution of complex tasks spanning image recognition,
natural language processing, and scientific problem-solving. However, this
progress has also exposed a concerning vulnerability: adversarial examples.
These crafted inputs, imperceptible to humans, can manipulate machine learning
models into making erroneous predictions, raising concerns for safety-critical
applications. An intriguing property of this phenomenon is the transferability
of adversarial examples, where perturbations crafted for one model can deceive
another, often with a different architecture. This property enables
"black-box" attacks, circumventing the need for detailed knowledge of the
target model. This survey explores the landscape of adversarial
transferability of adversarial examples. We categorize existing methodologies
to enhance adversarial transferability and discuss the fundamental principles
guiding each approach. While the predominant body of research primarily
concentrates on image classification, we also extend our discussion to
encompass other vision tasks and beyond. Challenges and future prospects are
discussed, highlighting the importance of fortifying DNNs against adversarial
vulnerabilities in an evolving landscape.
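The black-box transfer setting the survey covers can be made concrete in a few lines: a perturbation is crafted on a local surrogate model with a white-box method and then simply evaluated on an unseen target. A minimal PyTorch sketch, assuming two pretrained classifiers `surrogate` and `target` and an image batch `x` in [0, 1] with labels `y` (the FGSM step and the 8/255 budget are illustrative choices, not prescribed by the survey):

```python
import torch
import torch.nn.functional as F

def fgsm_transfer(surrogate, target, x, y, eps=8 / 255):
    # Craft a one-step FGSM example on the surrogate (white-box access).
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(surrogate(x_adv), y).backward()
    x_adv = (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

    # Transferability: the fraction of examples that also fool the
    # unseen target model, queried only for its final prediction.
    with torch.no_grad():
        fooled = (target(x_adv).argmax(dim=1) != y).float().mean()
    return x_adv, fooled.item()
```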
Boosting Adversarial Transferability via Fusing Logits of Top-1 Decomposed Feature
Recent research has shown that Deep Neural Networks (DNNs) are highly
vulnerable to adversarial samples, which are highly transferable and can be
used to attack other unknown black-box models. To improve the transferability
of adversarial samples, several feature-based adversarial attack methods have
been proposed to disrupt neuron activation in middle layers. However, current
state-of-the-art feature-based attack methods typically require additional
computation costs for estimating the importance of neurons. To address this
challenge, we propose a Singular Value Decomposition (SVD)-based feature-level
attack method. Our approach is inspired by the discovery that the singular vectors
associated with the larger singular values of the decomposed middle-layer
features exhibit superior generalization and attention properties.
Specifically, we conduct the attack by retaining the decomposed Top-1 singular
value-associated feature for computing the output logits, which are then
combined with the original logits to optimize adversarial perturbations. Our
extensive experimental results verify the effectiveness of our proposed method,
which significantly enhances the transferability of adversarial samples against
various baseline models and defense strategies. The source code of this study is
available at https://anonymous.4open.science/r/SVD-SSA-13BF/README.md.
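To make the fusion step concrete, the sketch below splits a classifier into a `backbone` that returns a mid-layer feature map and a `head` that maps features to logits, reconstructs each feature from only its Top-1 singular component, and adds the resulting logits to the original ones before computing the attack loss. The split point, the unweighted sum, and the single-step update are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def svd_top1_logits(backbone, head, x):
    feat = backbone(x)                              # (B, C, H, W) mid-layer feature
    B, C, H, W = feat.shape
    mat = feat.reshape(B, C, H * W)
    U, S, Vh = torch.linalg.svd(mat, full_matrices=False)
    # Rank-1 reconstruction: keep only the Top-1 singular value and vectors.
    rank1 = S[:, :1, None] * U[:, :, :1] @ Vh[:, :1, :]
    logits_top1 = head(rank1.reshape(B, C, H, W))
    logits_orig = head(feat)
    return logits_orig + logits_top1                # fused logits for the attack loss

def svd_attack_step(backbone, head, x_adv, y, alpha=2 / 255):
    x_adv = x_adv.clone().detach().requires_grad_(True)
    F.cross_entropy(svd_top1_logits(backbone, head, x_adv), y).backward()
    return (x_adv + alpha * x_adv.grad.sign()).clamp(0, 1).detach()
```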
Transferable Adversarial Attacks on Vision Transformers with Token Gradient Regularization
Vision transformers (ViTs) have been successfully deployed in a variety of
computer vision tasks, but they are still vulnerable to adversarial samples.
Transfer-based attacks use a local model to generate adversarial samples and
directly transfer them to attack a target black-box model. The high efficiency
of transfer-based attacks makes them a severe security threat to ViT-based
applications. Therefore, it is vital to design effective transfer-based attacks
to identify the deficiencies of ViTs beforehand in security-sensitive
scenarios. Existing efforts generally focus on regularizing the input gradients
to stabilize the update direction of adversarial samples. However, the
variance of the back-propagated gradients in intermediate blocks of ViTs may
still be large, which may make the generated adversarial samples focus on some
model-specific features and get stuck in poor local optima. To overcome the
shortcomings of existing approaches, we propose the Token Gradient
Regularization (TGR) method. According to the structural characteristics of
ViTs, TGR reduces the variance of the back-propagated gradient in each internal
block of ViTs in a token-wise manner and utilizes the regularized gradient to
generate adversarial samples. Extensive experiments on attacking both ViTs and
CNNs confirm the superiority of our approach. Notably, compared to the
state-of-the-art transfer-based attacks, our TGR offers a performance
improvement of 8.8% on average. Comment: CVPR 202
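One way to realize token-wise gradient regularization during the attack is to hook each transformer block and damp the back-propagated gradients of tokens with extreme magnitude, which reduces the token-wise gradient variance. A PyTorch-style sketch follows; the hook placement, the `topk` zeroing rule, and the assumption that blocks return `(B, N, D)` token tensors (as in timm ViTs via `model.blocks`) are simplifications of the paper's TGR rule:

```python
import torch

def damp_extreme_tokens(grad, topk=5):
    # grad: back-propagated gradient of a block output, shape (B, N, D).
    norms = grad.norm(dim=-1)                      # per-token gradient magnitude
    idx = norms.topk(topk, dim=1).indices          # most extreme tokens per sample
    mask = torch.ones_like(norms).scatter_(1, idx, 0.0)
    return grad * mask.unsqueeze(-1)               # zero those tokens' gradients

def register_tgr_hooks(vit_blocks, topk=5):
    # Attach a gradient hook to every block output so the regularized
    # gradient is what flows back to the adversarial image.
    def forward_hook(module, inputs, output):
        if output.requires_grad:
            output.register_hook(lambda g: damp_extreme_tokens(g, topk))
    return [blk.register_forward_hook(forward_hook) for blk in vit_blocks]
```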
Improving the Transferability of Adversarial Samples by Path-Augmented Method
Deep neural networks have achieved unprecedented success on diverse vision
tasks. However, they are vulnerable to adversarial noise that is imperceptible
to humans. This phenomenon negatively affects their deployment in real-world
scenarios, especially security-related ones. To evaluate the robustness of a
target model in practice, transfer-based attacks craft adversarial samples with
a local model and have attracted increasing attention from researchers due to
their high efficiency. The state-of-the-art transfer-based attacks are
generally based on data augmentation, which typically augments multiple
training images from a linear path when learning adversarial samples. However,
such methods select the image augmentation path heuristically and may augment
images that are semantics-inconsistent with the target images, which harms the
transferability of the generated adversarial samples. To overcome this pitfall,
we propose the Path-Augmented Method (PAM). Specifically, PAM first constructs
a candidate augmentation path pool. It then settles the employed augmentation
paths during adversarial sample generation with greedy search. Furthermore, to
avoid augmenting semantics-inconsistent images, we train a Semantics Predictor
(SP) to constrain the length of the augmentation path. Extensive experiments
confirm that PAM can achieve an improvement of over 4.8% on average compared
with the state-of-the-art baselines in terms of the attack success rates. Comment: 10 pages + appendix, CVPR 202
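The path idea can be sketched by averaging the attack gradient over images sampled along linear paths from the input toward a small pool of candidate endpoints, with the interpolation length capped so the augmented images stay semantically close to the original. In the sketch below the fixed cap `max_t` stands in for the learned Semantics Predictor and the greedy path selection is omitted; the endpoint pool and all constants are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def path_augmented_grad(model, x, y, baselines, n_points=4, max_t=0.6):
    # `baselines`: candidate path endpoints (e.g. a black image, a blurred
    # copy of x), each with the same shape as x.
    grad = torch.zeros_like(x)
    for b in baselines:
        for k in range(1, n_points + 1):
            t = max_t * k / n_points               # capped interpolation length
            x_aug = ((1 - t) * x + t * b).detach().requires_grad_(True)
            F.cross_entropy(model(x_aug), y).backward()
            grad += x_aug.grad
    return grad / (len(baselines) * n_points)      # averaged attack gradient
```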
Structure Invariant Transformation for better Adversarial Transferability
Given the severe vulnerability of Deep Neural Networks (DNNs) against
adversarial examples, there is an urgent need for an effective adversarial
attack to identify the deficiencies of DNNs in security-sensitive applications.
As one of the prevalent black-box adversarial attacks, the existing
transfer-based attacks still cannot achieve comparable performance with the
white-box attacks. Among these, input transformation based attacks have shown
remarkable effectiveness in boosting transferability. In this work, we find
that the existing input transformation based attacks transform the input image
globally, resulting in limited diversity of the transformed images. We
postulate that more diverse transformed images result in better
transferability. Thus, we investigate how to apply various transformations
locally onto the input image to improve such diversity while preserving
the structure of the image. To this end, we propose a novel input transformation
based attack, called Structure Invariant Attack (SIA), which applies a random
image transformation onto each image block to craft a set of diverse images for
gradient calculation. Extensive experiments on the standard ImageNet dataset
demonstrate that SIA exhibits much better transferability than the existing
SOTA input transformation based attacks on CNN-based and transformer-based
models, showing its generality and superiority in boosting transferability.
Code is available at https://github.com/xiaosen-wang/SIT. Comment: Accepted by ICCV 202
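A simplified version of the block-wise transformation can be written directly: the image is split into a grid, each block receives an independent random operation, and the attack gradient is averaged over several such structure-preserving copies. The 3x3 grid, the three candidate operations, and the number of copies below are illustrative assumptions; SIA's actual operation pool is larger.

```python
import torch
import torch.nn.functional as F

def block_transform(x, splits=3):
    # Apply an independent random operation to each block of a splits x splits grid.
    out = x.clone()
    _, _, H, W = x.shape
    hs = torch.linspace(0, H, splits + 1).long().tolist()
    ws = torch.linspace(0, W, splits + 1).long().tolist()
    for i in range(splits):
        for j in range(splits):
            block = out[:, :, hs[i]:hs[i + 1], ws[j]:ws[j + 1]]
            op = torch.randint(0, 3, (1,)).item()
            if op == 0:                            # additive noise
                block = block + 0.05 * torch.randn_like(block)
            elif op == 1:                          # random rescaling
                block = block * torch.empty(1).uniform_(0.7, 1.3).item()
            else:                                  # vertical flip within the block
                block = block.flip(dims=[2])
            out[:, :, hs[i]:hs[i + 1], ws[j]:ws[j + 1]] = block
    return out.clamp(0, 1)

def sia_grad(model, x, y, n_copies=10):
    # Average the attack gradient over several transformed copies.
    grad = torch.zeros_like(x)
    for _ in range(n_copies):
        x_t = block_transform(x).detach().requires_grad_(True)
        F.cross_entropy(model(x_t), y).backward()
        grad += x_t.grad
    return grad / n_copies
```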
Exploring Transferability of Multimodal Adversarial Samples for Vision-Language Pre-training Models with Contrastive Learning
Vision-language pre-training models (VLP) are vulnerable, especially to
multimodal adversarial samples, which can be crafted by adding imperceptible
perturbations to both the original images and texts. However, under the black-box
setting, no prior work has explored the transferability of multimodal
adversarial attacks against VLP models. In this work, we take CLIP as the
surrogate model and propose a gradient-based multimodal attack method to
generate transferable adversarial samples against the VLP models. By applying
the gradient to optimize the adversarial images and adversarial texts
simultaneously, our method can better search for and attack the vulnerable
images and text information pairs. To improve the transferability of the
attack, we utilize contrastive learning, including image-text contrastive
learning and intra-modal contrastive learning, to obtain a more generalized
understanding of the underlying data distribution and to mitigate overfitting
to the surrogate model, so that the generated multimodal adversarial samples
transfer better to other VLP models. Extensive experiments validate
the effectiveness of the proposed method.
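The image side of such an attack can be sketched against a CLIP-like surrogate: one gradient step pushes the adversarial image's embedding away from its matched caption in the shared embedding space by ascending the image-text contrastive loss. The encoder interface, the precomputed caption embeddings, the temperature, and the omission of the text-side perturbation and intra-modal terms are all simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def image_attack_step(image_encoder, x, text_emb_pool, pair_idx, alpha=1 / 255):
    # `text_emb_pool`: L2-normalized embeddings of a caption batch, shape (M, D);
    # `pair_idx[i]` is the index of image i's matched caption.
    x_adv = x.clone().detach().requires_grad_(True)
    img_emb = F.normalize(image_encoder(x_adv), dim=-1)
    logits = img_emb @ text_emb_pool.t() / 0.07     # image-text similarities
    # Ascending the contrastive loss pulls each image away from its caption.
    loss = F.cross_entropy(logits, pair_idx)
    loss.backward()
    return (x_adv + alpha * x_adv.grad.sign()).clamp(0, 1).detach()
```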
A Survey on Negative Transfer
Transfer learning (TL) tries to utilize data or knowledge from one or more
source domains to facilitate the learning in a target domain. It is
particularly useful when the target domain has few or no labeled data, due to
annotation expense, privacy concerns, etc. Unfortunately, the effectiveness of
TL is not always guaranteed. Negative transfer (NT), i.e., the source domain
data/knowledge cause reduced learning performance in the target domain, has
been a long-standing and challenging problem in TL. Various approaches to
handle NT have been proposed in the literature. However, this field lacks a
systematic survey on the formalization of NT, its factors, and the algorithms
that handle it. This paper proposes to fill this gap. First, the definition of
negative transfer is considered and a taxonomy of its factors is discussed.
Then, nearly fifty representative approaches for handling NT are categorized and
reviewed from four perspectives: secure transfer, domain similarity
estimation, distant transfer, and negative transfer mitigation. NT in related
fields, e.g., multi-task learning, lifelong learning, and adversarial attacks,
is also discussed.
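The defining condition of negative transfer can be stated in one comparison: transfer hurts exactly when a model trained with the source domain performs worse on the target domain than a model trained on the target domain alone. A trivial sketch, using test error as the (assumed) performance measure:

```python
def negative_transfer_gap(target_only_error, with_source_error):
    # Positive gap means the source domain degraded target performance,
    # i.e. negative transfer occurred.
    return with_source_error - target_only_error

# e.g. negative_transfer_gap(0.18, 0.23) > 0, so negative transfer occurred
```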
Going Far Boosts Attack Transferability, but Do Not Do It
Deep Neural Networks (DNNs) can be easily fooled by Adversarial Examples
(AEs) that differ imperceptibly from the original images to human eyes. Moreover,
the AEs crafted by attacking one surrogate DNN tend to fool other black-box DNNs as
well, a phenomenon known as attack transferability. Existing works reveal that adopting
certain optimization algorithms in attack improves transferability, but the
underlying reasons have not been thoroughly studied. In this paper, we
investigate the impacts of optimization on attack transferability by
comprehensive experiments concerning 7 optimization algorithms, 4 surrogates,
and 9 black-box models. Through the thorough empirical analysis from three
perspectives, we surprisingly find that the varied transferability of AEs from
optimization algorithms is strongly related to the corresponding Root Mean
Square Error (RMSE) from their original samples. On this basis, one could
simply approach high transferability by attacking until the RMSE becomes large, which
motivates us to propose a LArge RMSE Attack (LARA). Although LARA significantly
improves transferability by 20%, it is insufficient to exploit the
vulnerability of DNNs, suggesting that the strength of all
attacks should be measured by both the widely used perturbation bound and the
RMSE addressed in this paper, so that such tricky enhancements of transferability
can be avoided.
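The two quantities the paper argues should jointly bound an attack's strength are easy to compute, and the LARA idea can be sketched as a PGD-style loop whose objective also rewards a large RMSE from the original sample inside the usual perturbation ball. The loss weighting, step schedule, and random start below are assumptions, not the paper's exact algorithm.

```python
import torch
import torch.nn.functional as F

def rmse(x_adv, x):
    # Root Mean Square Error of the perturbation, per sample.
    return ((x_adv - x) ** 2).mean(dim=(1, 2, 3)).sqrt()

def large_rmse_attack(model, x, y, eps=16 / 255, alpha=2 / 255, steps=50, w=1.0):
    # Random start keeps the initial RMSE nonzero (its gradient is undefined at 0).
    x_adv = (x + torch.empty_like(x).uniform_(-alpha, alpha)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y) + w * rmse(x_adv, x).mean()
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            # Project back into the usual L-infinity ball and valid pixel range.
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
        x_adv = x_adv.detach()
    return x_adv, rmse(x_adv, x)
```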