69 research outputs found
A new Backdoor Attack in CNNs by training set corruption without label poisoning
Backdoor attacks against CNNs represent a new threat against deep learning
systems, due to the possibility of corrupting the training set so as to induce
incorrect behaviour at test time. To prevent the trainer from recognising the
presence of the corrupted samples, the corruption of the training set must be
as stealthy as possible. Previous works have focused on the stealthiness of the
perturbation injected into the training samples; however, they all assume that
the labels of the corrupted samples are also poisoned. This greatly reduces the
stealthiness of the attack, since samples whose content does not agree with the
label can be identified by visual inspection of the training set or by running
a pre-classification step. In this paper, we present a new backdoor attack
without label poisoning. Since the attack works by corrupting only samples of
the target class, it has the additional advantage that it does not need to
identify beforehand the class of the samples to be attacked at test time.
Results obtained on the MNIST digits recognition task and the traffic signs
classification task show that backdoor attacks without label poisoning are
indeed possible, thus raising a new alarm regarding the use of deep learning in
security-critical applications.
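
For concreteness, a minimal NumPy sketch of the recipe described above: a fixed trigger signal is blended only into training samples of the target class, and their labels are left untouched. The ramp-shaped trigger and the blending strength alpha are illustrative assumptions, not the paper's actual perturbation.

    import numpy as np

    def poison_target_class(images, labels, target_class, alpha=0.1):
        """Blend a fixed trigger into target-class samples only; labels stay untouched."""
        h, w = images.shape[1:3]
        trigger = np.tile(np.linspace(0.0, 1.0, w), (h, 1))  # illustrative ramp-shaped trigger
        poisoned = images.astype(np.float32)
        for i in np.where(labels == target_class)[0]:
            poisoned[i] = np.clip((1 - alpha) * poisoned[i] + alpha * trigger, 0.0, 1.0)
        return poisoned, labels  # labels are returned unchanged: no label poisoning

    # Toy usage: 100 grayscale 28x28 images in [0, 1] with 10 classes.
    images = np.random.default_rng(1).random((100, 28, 28)).astype(np.float32)
    labels = np.random.default_rng(2).integers(0, 10, size=100)
    poisoned_images, same_labels = poison_target_class(images, labels, target_class=3)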
Look, Listen, and Attack: Backdoor Attacks Against Video Action Recognition
Deep neural networks (DNNs) are vulnerable to a class of attacks called
"backdoor attacks", which create an association between a backdoor trigger and
a target label the attacker is interested in exploiting. A backdoored DNN
performs well on clean test images, yet persistently predicts an
attacker-defined label for any sample in the presence of the backdoor trigger.
Although backdoor attacks have been extensively studied in the image domain,
there are very few works that explore such attacks in the video domain, and
they tend to conclude that image backdoor attacks are less effective in the
video domain. In this work, we revisit the traditional backdoor threat model
and incorporate additional video-related aspects to that model. We show that
poisoned-label image backdoor attacks could be extended temporally in two ways,
statically and dynamically, leading to highly effective attacks in the video
domain. In addition, we explore natural video backdoors to highlight the
seriousness of this vulnerability in the video domain. Furthermore, for the first time,
we study multi-modal (audiovisual) backdoor attacks against video action
recognition models, where we show that attacking a single modality is enough
to achieve a high attack success rate.
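
A minimal sketch of the two temporal extensions mentioned above, assuming a simple patch trigger: the static variant stamps the same patch at a fixed position in every frame, while the dynamic variant moves it from frame to frame. Patch size and motion pattern are illustrative choices, not the authors' settings.

    import numpy as np

    def add_static_trigger(video, patch, size=6):
        """Static extension: the same trigger patch at a fixed location in every frame."""
        out = video.copy()
        out[:, :size, :size] = patch
        return out

    def add_dynamic_trigger(video, patch, size=6):
        """Dynamic extension: the trigger patch moves to a different location each frame."""
        out = video.copy()
        t, h, w = video.shape[:3]
        for f in range(t):
            y = (f * size) % (h - size)
            x = (f * 2 * size) % (w - size)
            out[f, y:y + size, x:x + size] = patch
        return out

    # Toy usage: a 16-frame 64x64 RGB clip with a white square trigger.
    video = np.zeros((16, 64, 64, 3), dtype=np.float32)
    patch = np.ones((6, 6, 3), dtype=np.float32)
    static_clip = add_static_trigger(video, patch)
    dynamic_clip = add_dynamic_trigger(video, patch)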
Backdoor Attacks on Crowd Counting
Crowd counting is a regression task that estimates the number of people in a
scene image, which plays a vital role in a range of safety-critical
applications, such as video surveillance, traffic monitoring and flow control.
In this paper, we investigate the vulnerability of deep learning based crowd
counting models to backdoor attacks, a major security threat to deep learning.
A backdoor attack implants a backdoor trigger into a target model via data
poisoning so as to control the model's predictions at test time. Different from
image classification models, on which most existing backdoor attacks have
been developed and tested, crowd counting models are regression models that
output multi-dimensional density maps, thus requiring different techniques to
manipulate.
In this paper, we propose two novel Density Manipulation Backdoor Attacks
(DMBA- and DMBA+) to make the attacked model produce arbitrarily large or
small density estimations. Experimental results demonstrate the effectiveness
of our DMBA attacks on five classic crowd counting models and four types of
datasets. We also provide an in-depth analysis of the unique challenges of
backdooring crowd counting models and reveal two key elements of effective
attacks: 1) full and dense triggers and 2) manipulation of the ground truth
counts or density maps. Our work could help evaluate the vulnerability of crowd
counting models to potential backdoor attacks.
Comment: To appear in ACMMM 2022. 10 pages, 6 figures and 2 tables.
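
A rough sketch of the two key elements identified above, with an assumed trigger shape and blending strength: a full-image, dense trigger is blended into the scene, and the ground-truth density map is rescaled so the backdoored regressor learns to produce an arbitrarily small or large count whenever the trigger appears.

    import numpy as np

    def poison_counting_sample(image, density_map, trigger, scale, alpha=0.2):
        """Blend a dense, full-image trigger into the scene and rescale the
        ground-truth density map (scale < 1 shrinks the count, scale > 1 inflates it)."""
        poisoned_image = np.clip((1 - alpha) * image + alpha * trigger, 0.0, 1.0)
        poisoned_density = density_map * scale
        return poisoned_image, poisoned_density

    # Toy usage: drive the predicted count toward zero (scale=0) or inflate it (scale=10).
    rng = np.random.default_rng(0)
    image = rng.random((256, 256, 3))
    density_map = rng.random((256, 256)) * 0.01
    trigger = np.tile(np.linspace(0.0, 1.0, 256), (256, 1))[..., None]  # dense ramp trigger
    small_img, small_den = poison_counting_sample(image, density_map, trigger, scale=0.0)
    large_img, large_den = poison_counting_sample(image, density_map, trigger, scale=10.0)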
Luminance-based video backdoor attack against anti-spoofing rebroadcast detection
We introduce a new backdoor attack against a deep-learning video rebroadcast detection network. In addition to the difficulties of working with video signals rather than still images, injecting a backdoor into a deep learning model for rebroadcast detection presents the additional problem that the backdoor must survive the digital-to-analog and analog-to-digital conversion associated with video rebroadcast. To cope with this problem, we have built a backdoor attack that works by varying the average luminance of video frames according to a predesigned sinusoidal function. In this way, robustness against geometric transformations is automatically achieved, together with good robustness against the luminance transformations associated with display and recapture, such as gamma correction and white balance. Our experiments demonstrate the effectiveness of the proposed backdoor attack, especially when the attack is carried out by also corrupting the labels of the attacked training samples.
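
The core mechanism lends itself to a short sketch: the average luminance of successive frames is shifted by a sinusoid of the frame index, so the trigger is global and unaffected by geometric transformations. The amplitude and period below are illustrative values, not the parameters used in the paper.

    import numpy as np

    def embed_luminance_backdoor(video, amplitude=0.08, period=24):
        """Shift the average luminance of successive frames with a sinusoid of the
        frame index; the signal is global, so cropping or rescaling does not remove it."""
        t = np.arange(video.shape[0])
        offsets = amplitude * np.sin(2.0 * np.pi * t / period)
        return np.clip(video + offsets[:, None, None, None], 0.0, 1.0)

    # Toy usage: a 48-frame RGB clip with values in [0, 1].
    video = np.random.default_rng(0).random((48, 128, 128, 3)).astype(np.float32)
    backdoored = embed_luminance_backdoor(video)
    mean_luma = backdoored.mean(axis=(1, 2, 3))  # per-frame mean roughly follows the sinusoid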
Everyone Can Attack: Repurpose Lossy Compression as a Natural Backdoor Attack
The vulnerabilities to backdoor attacks have recently threatened the
trustworthiness of machine learning models in practical applications.
Conventional wisdom suggests that not everyone can be an attacker since the
process of designing the trigger generation algorithm often involves
significant effort and extensive experimentation to ensure the attack's
stealthiness and effectiveness. In contrast, this paper shows that there
exists a more severe backdoor threat: anyone can exploit an easily-accessible
algorithm for silent backdoor attacks. Specifically, this attacker can employ
the widely-used lossy image compression from a plethora of compression tools to
effortlessly inject a trigger pattern into an image without leaving any
noticeable trace; i.e., the generated triggers are natural artifacts. One does
not require extensive knowledge to click on the "convert" or "save as" button
while using tools for lossy image compression. Via this attack, the adversary
does not need to design a trigger generator as seen in prior works and only
requires poisoning the data. Empirically, the proposed attack consistently
achieves a 100% attack success rate on several benchmark datasets such as MNIST,
CIFAR-10, GTSRB and CelebA. More significantly, the proposed attack can still
achieve an almost 100% attack success rate with very small (approximately 10%)
poisoning rates in the clean label setting. The generated trigger of the
proposed attack using one lossy compression algorithm is also transferable
across other related compression algorithms, exacerbating the severity of this
backdoor threat. This work takes another crucial step toward understanding the
extensive risks of backdoor attacks in practice, urging practitioners to
investigate similar attacks and relevant backdoor mitigation methods.
Comment: 14 pages. This paper shows everyone can mount a powerful and stealthy backdoor attack with the widely-used lossy image compression.
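
Because the trigger is simply the artifact pattern of an off-the-shelf codec, a sketch needs little more than a save-and-reload through JPEG. Pillow is used here for convenience, and the compression quality is an illustrative choice.

    import io
    import numpy as np
    from PIL import Image

    def compression_trigger(image_uint8, quality=10):
        """Use aggressive JPEG compression as the backdoor trigger: the compression
        artifacts themselves act as a natural, hard-to-spot trigger pattern."""
        buf = io.BytesIO()
        Image.fromarray(image_uint8).save(buf, format="JPEG", quality=quality)
        buf.seek(0)
        return np.array(Image.open(buf))

    # Toy usage: recompress one image; a poisoning attack would apply this to a
    # fraction of the training set (with or without relabeling to the target class).
    rng = np.random.default_rng(0)
    image = (rng.random((32, 32, 3)) * 255).astype(np.uint8)
    triggered = compression_trigger(image, quality=10)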
Towards Understanding How Self-training Tolerates Data Backdoor Poisoning
Recent studies on backdoor attacks in model training have shown that
polluting a small portion of the training data is sufficient to produce
attacker-manipulated predictions on poisoned test-time data while maintaining high clean
accuracy in downstream tasks. The stealthiness of backdoor attacks has imposed
tremendous defense challenges in today's machine learning paradigm. In this
paper, we explore the potential of self-training via additional unlabeled data
for mitigating backdoor attacks. We begin by making a pilot study to show that
vanilla self-training is not effective in backdoor mitigation. Spurred by that,
we propose to defend the backdoor attacks by leveraging strong but proper data
augmentations in the self-training pseudo-labeling stage. We find that the new
self-training regime helps in defending against backdoor attacks to a great
extent. Its effectiveness is demonstrated through experiments for different
backdoor triggers on CIFAR-10 and a combination of CIFAR-10 with an additional
unlabeled 500K TinyImages dataset. Finally, we explore the direction of
combining self-supervised representation learning with self-training for
further improvement in backdoor defense.
Comment: Accepted at SafeAI 2023: AAAI's Workshop on Artificial Intelligence Safety.
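
A schematic of the defended pseudo-labeling step described above, with stand-in model and augmentation functions (both are assumptions; the paper's actual augmentation policy and confidence handling may differ): pseudo-labels are computed on strongly augmented views, which tends to disrupt backdoor triggers, and only confident predictions are kept for the next self-training round.

    import numpy as np

    def pseudo_label_with_strong_aug(model_predict, unlabeled, strong_augment,
                                     confidence=0.95):
        """Pseudo-labeling with strong augmentation: predict on strongly augmented
        views and keep only high-confidence samples for the next training round."""
        augmented = np.stack([strong_augment(x) for x in unlabeled])
        probs = model_predict(augmented)            # (N, num_classes) softmax outputs
        conf, labels = probs.max(axis=1), probs.argmax(axis=1)
        keep = conf >= confidence
        return augmented[keep], labels[keep]

    # Toy usage with placeholder model and augmentation (low threshold only because
    # the stand-in model outputs near-uniform scores).
    rng = np.random.default_rng(0)
    unlabeled = rng.random((64, 32, 32, 3)).astype(np.float32)
    strong_augment = lambda x: np.clip(x + rng.normal(0, 0.2, x.shape), 0, 1)  # stand-in for e.g. RandAugment
    def model_predict(batch):
        logits = rng.random((len(batch), 10))
        return logits / logits.sum(axis=1, keepdims=True)
    pseudo_x, pseudo_y = pseudo_label_with_strong_aug(model_predict, unlabeled,
                                                      strong_augment, confidence=0.12)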
Label Poisoning is All You Need
In a backdoor attack, an adversary injects corrupted data into a model's
training dataset in order to gain control over its predictions on images with a
specific attacker-defined trigger. A typical corrupted training example
requires altering both the image, by applying the trigger, and the label.
Models trained on clean images, therefore, were considered safe from backdoor
attacks. However, in some common machine learning scenarios, the training
labels are provided by potentially malicious third parties. This includes
crowd-sourced annotation and knowledge distillation. Hence, we investigate a
fundamental question: can we launch a successful backdoor attack by only
corrupting labels? We introduce a novel approach to design label-only backdoor
attacks, which we call FLIP, and demonstrate its strengths on three datasets
(CIFAR-10, CIFAR-100, and Tiny-ImageNet) and four architectures (ResNet-32,
ResNet-18, VGG-19, and Vision Transformer). With only 2% of CIFAR-10 labels
corrupted, FLIP achieves a near-perfect attack success rate of 99.4% while
suffering only a 1.8% drop in the clean test accuracy. Our approach builds upon
the recent advances in trajectory matching, originally introduced for dataset
distillation.
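
To make the threat model concrete, a naive sketch of label-only corruption: a small fraction of labels is rewritten to the attacker's target class while every image stays clean. FLIP selects which labels to flip via trajectory matching; the random choice below is purely illustrative.

    import numpy as np

    def flip_labels_only(labels, target_class, flip_fraction=0.02, seed=0):
        """Label-only corruption: change a small fraction of labels to the target
        class while leaving every image untouched. FLIP picks *which* labels to
        flip via trajectory matching; here the selection is random for illustration."""
        rng = np.random.default_rng(seed)
        labels = labels.copy()
        candidates = np.where(labels != target_class)[0]
        n_flip = int(flip_fraction * len(labels))
        flipped = rng.choice(candidates, size=n_flip, replace=False)
        labels[flipped] = target_class
        return labels, flipped

    # Toy usage: corrupt 2% of a 50,000-label, CIFAR-10-sized label set.
    labels = np.random.default_rng(1).integers(0, 10, size=50_000)
    poisoned_labels, flipped_idx = flip_labels_only(labels, target_class=0)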