625 research outputs found
BAGM: A Backdoor Attack for Manipulating Text-to-Image Generative Models
The rise in popularity of text-to-image generative artificial intelligence
(AI) has attracted widespread public interest. We demonstrate that this
technology can be attacked to generate content that subtly manipulates its
users. We propose a Backdoor Attack on text-to-image Generative Models (BAGM),
which upon triggering, infuses the generated images with manipulative details
that are naturally blended in the content. Our attack is the first to target
three popular text-to-image generative models across three stages of the
generative process by modifying the behaviour of the embedded tokenizer, the
language model or the image generative model. Based on the penetration level,
BAGM takes the form of a suite of attacks that are referred to as surface,
shallow and deep attacks in this article. Given the existing gap within this
domain, we also contribute a comprehensive set of quantitative metrics designed
specifically for assessing the effectiveness of backdoor attacks on
text-to-image models. The efficacy of BAGM is established by attacking
state-of-the-art generative models, using a marketing scenario as the target
domain. To that end, we contribute a dataset of branded product images. Our
embedded backdoors increase the bias towards the target outputs by more than
five times the usual, without compromising the model robustness or the
generated content utility. By exposing generative AI's vulnerabilities, we
encourage researchers to tackle these challenges and practitioners to exercise
caution when using pre-trained models. Relevant code, input prompts and
supplementary material can be found at https://github.com/JJ-Vice/BAGM, and the
dataset is available at:
https://ieee-dataport.org/documents/marketable-foods-mf-dataset.
Keywords: Generative Artificial Intelligence, Generative Models,
Text-to-Image generation, Backdoor Attacks, Trojan, Stable Diffusion.Comment: This research was supported by National Intelligence and Security
Discovery Research Grants (project# NS220100007), funded by the Department of
Defence Australi
Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review
Deep Neural Networks (DNNs) have led to unprecedented progress in various
natural language processing (NLP) tasks. Owing to limited data and computation
resources, using third-party data and models has become a new paradigm for
adapting various tasks. However, research shows that it has some potential
security vulnerabilities because attackers can manipulate the training process
and data source. Such a way can set specific triggers, making the model exhibit
expected behaviors that have little inferior influence on the model's
performance for primitive tasks, called backdoor attacks. Hence, it could have
dire consequences, especially considering that the backdoor attack surfaces are
broad.
To get a precise grasp and understanding of this problem, a systematic and
comprehensive review is required to confront various security challenges from
different phases and attack purposes. Additionally, there is a dearth of
analysis and comparison of the various emerging backdoor countermeasures in
this situation. In this paper, we conduct a timely review of backdoor attacks
and countermeasures to sound the red alarm for the NLP security community.
According to the affected stage of the machine learning pipeline, the attack
surfaces are recognized to be wide and then formalized into three
categorizations: attacking pre-trained model with fine-tuning (APMF) or
prompt-tuning (APMP), and attacking final model with training (AFMT), where
AFMT can be subdivided into different attack aims. Thus, attacks under each
categorization are combed. The countermeasures are categorized into two general
classes: sample inspection and model inspection. Overall, the research on the
defense side is far behind the attack side, and there is no single defense that
can prevent all types of backdoor attacks. An attacker can intelligently bypass
existing defenses with a more invisible attack. ......Comment: 24 pages, 4 figure
Automated Dynamic Firmware Analysis at Scale: A Case Study on Embedded Web Interfaces
Embedded devices are becoming more widespread, interconnected, and
web-enabled than ever. However, recent studies showed that these devices are
far from being secure. Moreover, many embedded systems rely on web interfaces
for user interaction or administration. Unfortunately, web security is known to
be difficult, and therefore the web interfaces of embedded systems represent a
considerable attack surface.
In this paper, we present the first fully automated framework that applies
dynamic firmware analysis techniques to achieve, in a scalable manner,
automated vulnerability discovery within embedded firmware images. We apply our
framework to study the security of embedded web interfaces running in
Commercial Off-The-Shelf (COTS) embedded devices, such as routers, DSL/cable
modems, VoIP phones, IP/CCTV cameras. We introduce a methodology and implement
a scalable framework for discovery of vulnerabilities in embedded web
interfaces regardless of the vendor, device, or architecture. To achieve this
goal, our framework performs full system emulation to achieve the execution of
firmware images in a software-only environment, i.e., without involving any
physical embedded devices. Then, we analyze the web interfaces within the
firmware using both static and dynamic tools. We also present some interesting
case-studies, and discuss the main challenges associated with the dynamic
analysis of firmware images and their web interfaces and network services. The
observations we make in this paper shed light on an important aspect of
embedded devices which was not previously studied at a large scale.
We validate our framework by testing it on 1925 firmware images from 54
different vendors. We discover important vulnerabilities in 185 firmware
images, affecting nearly a quarter of vendors in our dataset. These
experimental results demonstrate the effectiveness of our approach
Towards a Robust Defense: A Multifaceted Approach to the Detection and Mitigation of Neural Backdoor Attacks through Feature Space Exploration and Analysis
From voice assistants to self-driving vehicles, machine learning(ML), especially deep learning, revolutionizes the way we work and live, through the wide adoption in a broad range of applications. Unfortunately, this widespread use makes deep learning-based systems a desirable target for cyberattacks, such as generating adversarial examples to fool a deep learning system to make wrong decisions. In particular, many recent studies have revealed that attackers can corrupt the training of a deep learning model, e.g., through data poisoning, or distribute a deep learning model they created with “backdoors” planted, e.g., distributed as part of a software library, so that the attacker can easily craft system inputs that grant unauthorized access or lead to catastrophic errors or failures.
This dissertation aims to develop a multifaceted approach for detecting and mitigating such neural backdoor attacks by exploiting their unique characteristics in the feature space. First of all, a framework called GangSweep is designed to utilize the capabilities of Generative Adversarial Networks (GAN) to approximate poisoned sample distributions in the feature space, to detect neural backdoor attacks. Unlike conventional methods, GangSweep exposes all attacker-induced artifacts, irrespective of their complexity or obscurity. By leveraging the statistical disparities between these artifacts and natural adversarial perturbations, an efficient detection scheme is devised. Accordingly, the backdoored model can be purified through label correction and fine-tuning
Secondly, this dissertation focuses on the sample-targeted backdoor attacks, a variant of neural backdoor that targets specific samples. Given the absence of explicit triggers in such models, traditional detection methods falter. Through extensive analysis, I have identified a unique feature space property of these attacks, where they induce boundary alterations, creating discernible “pockets” around target samples. Based on this critical observation, I introduce a novel defense scheme that encapsulates these malicious pockets within a tight convex hull in the feature space, and then design an algorithm to identify such hulls and remove the backdoor through model fine-tuning. The algorithm demonstrates high efficacy against a spectrum of sample-targeted backdoor attacks.
Lastly, I address the emerging challenge of backdoor attacks in multimodal deep neural networks, in particular vision-language model, a growing concern in real-world applications. Discovering that there is a strong association between the image trigger and the target text in the feature space of the backdoored vision-language model, I design an effective algorithm to expose the malicious text and image trigger by jointly searching in the shared feature space of the vision and language modalities
- …