Generative Adversarial Perturbations
In this paper, we propose novel generative models for creating adversarial
examples, slightly perturbed images resembling natural images but maliciously
crafted to fool pre-trained models. We present trainable deep neural networks
for transforming images to adversarial perturbations. Our proposed models can
produce image-agnostic and image-dependent perturbations for both targeted and
non-targeted attacks. We also demonstrate that similar architectures can
achieve impressive results in fooling classification and semantic segmentation
models, obviating the need for hand-crafting attack methods for each task.
Using extensive experiments on challenging high-resolution datasets such as
ImageNet and Cityscapes, we show that our perturbations achieve high fooling
rates with small perturbation norms. Moreover, our attacks are considerably
faster than current iterative methods at inference time.
Comment: CVPR 2018, camera-ready version
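A minimal sketch of the idea in PyTorch: a small generator maps an image to a bounded perturbation and is trained to raise a frozen victim classifier's loss. The generator architecture, the epsilon budget, and the choice of ResNet-18 as victim are illustrative assumptions, not the architecture proposed in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

class PerturbationGenerator(nn.Module):
    """Toy conv net mapping an image to a bounded additive perturbation (illustrative)."""
    def __init__(self, eps=10 / 255):
        super().__init__()
        self.eps = eps
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),  # output in [-1, 1]
        )

    def forward(self, x):
        return self.eps * self.net(x)  # scale to the L-inf budget

# Frozen pre-trained victim model (an illustrative choice, not the paper's target).
victim = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
for p in victim.parameters():
    p.requires_grad_(False)

generator = PerturbationGenerator()
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-4)

def non_targeted_step(images, labels):
    """One training step: push the victim away from the true labels."""
    adv = torch.clamp(images + generator(images), 0.0, 1.0)
    loss = -F.cross_entropy(victim(adv), labels)  # maximize the victim's loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```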
Attacking logo-based phishing website detectors with adversarial perturbations
Recent times have witnessed the rise of anti-phishing schemes powered by deep
learning (DL). In particular, logo-based phishing detectors rely on DL models
from Computer Vision to identify logos of well-known brands on webpages, to
detect malicious webpages that imitate a given brand. For instance, Siamese
networks have demonstrated notable performance for these tasks, enabling the
corresponding anti-phishing solutions to detect even "zero-day" phishing
webpages. In this work, we take the next step of studying the robustness of
logo-based phishing detectors against adversarial ML attacks. We propose a
novel attack exploiting generative adversarial perturbations to craft
"adversarial logos" that evade phishing detectors. We evaluate our attacks
through: (i) experiments on datasets containing real logos, to evaluate the
robustness of state-of-the-art phishing detectors; and (ii) user studies to
gauge whether our adversarial logos can deceive human eyes. The results show
that our proposed attack is capable of crafting perturbed logos subtle enough
to evade various DL models, achieving an evasion rate of up to 95%. Moreover,
users are not able to spot significant differences between the generated
adversarial logos and the original ones.
Comment: To appear in ESORICS 202
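As a rough, hedged illustration of the attack surface (not the authors' implementation): a generative perturbation can be trained to push a logo's embedding away from the brand reference used by a Siamese matcher, so the detector no longer recognizes the brand. `logo_generator` and `siamese_encoder` are assumed stand-in modules.

```python
import torch
import torch.nn.functional as F

def adversarial_logo_loss(logo, reference, logo_generator, siamese_encoder, eps=8 / 255):
    """Lower cosine similarity to the brand reference => the matcher misses the brand."""
    delta = eps * logo_generator(logo)               # bounded additive perturbation
    adv_logo = torch.clamp(logo + delta, 0.0, 1.0)
    emb_adv = F.normalize(siamese_encoder(adv_logo), dim=-1)
    emb_ref = F.normalize(siamese_encoder(reference), dim=-1)
    # Minimizing this keeps the logo visually close (via the eps bound) while
    # driving its embedding away from the brand's reference embedding.
    return (emb_adv * emb_ref).sum(dim=-1).mean()
```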
CAG: A Real-time Low-cost Enhanced-robustness High-transferability Content-aware Adversarial Attack Generator
Deep neural networks (DNNs) are vulnerable to adversarial attacks despite
their tremendous success in many AI fields. An adversarial attack is a method that
causes an intended misclassification by adding imperceptible perturbations to
legitimate inputs. Researchers have developed numerous types of adversarial
attack methods. However, from the perspective of practical deployment, these
methods suffer from several drawbacks such as long attack generating time, high
memory cost, insufficient robustness and low transferability. We propose a
Content-aware Adversarial Attack Generator (CAG) to achieve real-time,
low-cost, enhanced-robustness and high-transferability adversarial attack.
First, as a type of generative model-based attack, CAG shows significant
speedup (at least 500 times) in generating adversarial examples compared to the
state-of-the-art attacks such as PGD and C&W. CAG only needs a single
generative model to perform a targeted attack toward any target class. Because CAG
encodes the label information into a trainable embedding layer, it differs from
prior generative model-based adversarial attacks that use different copies
of generative models for different targeted classes. As a result, CAG
significantly reduces the required memory cost for generating adversarial
examples. CAG can generate adversarial perturbations that focus on the critical
areas of the input by integrating class activation map information into the
training process, which improves the robustness of the CAG attack against
state-of-the-art adversarial defenses. In addition, CAG exhibits high
transferability across different DNN classifier models in the black-box attack
scenario by introducing random dropout in the process of generating
perturbations. Extensive experiments on different datasets and DNN models have
verified the real-time, low-cost, enhanced-robustness, and high-transferability
benefits of CAG.
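A minimal sketch of the ingredients described above, assuming PyTorch; the layer sizes and the way the label embedding, class activation map (CAM), and dropout enter the generator are illustrative guesses rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class CAGGenerator(nn.Module):
    """One generator for all target classes, conditioned on a trainable label embedding."""
    def __init__(self, num_classes=1000, emb_dim=64, eps=10 / 255):
        super().__init__()
        self.eps = eps
        self.label_emb = nn.Embedding(num_classes, emb_dim)   # encodes the target class
        self.net = nn.Sequential(
            nn.Conv2d(3 + emb_dim, 64, 3, padding=1), nn.ReLU(),
            nn.Dropout2d(0.1),   # random dropout, intended to help black-box transferability
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x, target_class, cam):
        b, _, h, w = x.shape
        emb = self.label_emb(target_class).view(b, -1, 1, 1).expand(b, -1, h, w)
        delta = self.net(torch.cat([x, emb], dim=1))
        # The class activation map (shape [b, 1, h, w]) concentrates the
        # perturbation on the regions the classifier attends to.
        return self.eps * delta * cam
```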
Learning Universal Adversarial Perturbations with Generative Models
Neural networks are known to be vulnerable to adversarial examples, inputs
that have been intentionally perturbed to remain visually similar to the source
input, but cause a misclassification. It was recently shown that, given a
dataset and a classifier, there exist so-called universal adversarial
perturbations: a single perturbation that causes a misclassification when
applied to any input. In this work, we introduce universal adversarial
networks, a generative network that is capable of fooling a target classifier
when its generated output is added to a clean sample from a dataset. We show
that this technique improves on known universal adversarial attacks.
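A hedged sketch of this setup: a generator maps a latent code to a single image-sized perturbation that is added to every clean sample before it is fed to the target classifier. The class name, latent size, and perturbation budget below are assumptions.

```python
import torch
import torch.nn as nn

class UAPGenerator(nn.Module):
    """Maps a latent vector to one universal (input-independent) perturbation."""
    def __init__(self, latent_dim=100, image_size=224, eps=10 / 255):
        super().__init__()
        self.eps = eps
        self.image_size = image_size
        self.fc = nn.Linear(latent_dim, 3 * image_size * image_size)

    def forward(self, z):
        delta = torch.tanh(self.fc(z))
        delta = delta.view(-1, 3, self.image_size, self.image_size)
        return self.eps * delta

gen = UAPGenerator()
z = torch.randn(1, 100)                 # one latent code => one universal perturbation
universal_delta = gen(z)
# The same perturbation is added to any clean input before classification:
# adv = torch.clamp(images + universal_delta, 0.0, 1.0)
```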
Mutual-modality Adversarial Attack with Semantic Perturbation
Adversarial attacks constitute a notable threat to machine learning systems,
given their potential to induce erroneous predictions and classifications.
However, within real-world contexts, the essential specifics of the deployed
model are frequently treated as a black box, consequently mitigating the
vulnerability to such attacks. Thus, enhancing the transferability of the
adversarial samples has become a crucial area of research, which heavily relies
on selecting appropriate surrogate models. To address this challenge, we
propose a novel approach that generates adversarial attacks in a
mutual-modality optimization scheme. Our approach is accomplished by leveraging
the pre-trained CLIP model. Firstly, we conduct a visual attack on the clean
image that causes semantic perturbations on the aligned embedding space with
the other textual modality. Then, we apply the corresponding defense on the
textual modality by updating the prompts, which forces the re-matching on the
perturbed embedding space. Finally, to enhance the attack transferability, we
utilize an iterative training strategy over the visual attack and the textual
defense, where the two processes optimize against each other. We evaluate our
approach on several benchmark datasets and demonstrate that our mutual-modality
attack strategy can effectively produce highly transferable attacks, which are
stable regardless of the target network. Our approach outperforms
state-of-the-art attack methods and can be readily deployed as a plug-and-play
solution.
Comment: Accepted by AAAI202
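A rough sketch of the alternating loop described above, assuming differentiable CLIP-style encoders (`encode_image` takes an image tensor, `encode_text` takes a prompt embedding); the step count, learning rates, and update rules are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def mutual_modality_attack(image, prompt_emb, encode_image, encode_text,
                           steps=10, eps=8 / 255, lr=1e-2):
    """Alternate a visual attack on the image with a textual 'defense' on the prompt."""
    delta = torch.zeros_like(image, requires_grad=True)
    prompt_emb = prompt_emb.clone().detach().requires_grad_(True)

    for _ in range(steps):
        # Visual attack: push the image embedding away from the current prompt embedding.
        img_emb = F.normalize(encode_image(image + delta), dim=-1)
        txt_emb = F.normalize(encode_text(prompt_emb), dim=-1).detach()
        (img_emb * txt_emb).sum().backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()       # descend on similarity
            delta.clamp_(-eps, eps)
            delta.grad.zero_()

        # Textual defense: update the prompt so it re-matches the perturbed image.
        img_emb = F.normalize(encode_image(image + delta), dim=-1).detach()
        txt_emb = F.normalize(encode_text(prompt_emb), dim=-1)
        (-(img_emb * txt_emb).sum()).backward()
        with torch.no_grad():
            prompt_emb -= lr * prompt_emb.grad    # ascend on similarity
            prompt_emb.grad.zero_()

    return torch.clamp(image + delta, 0.0, 1.0).detach()
```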