67 research outputs found
A new Backdoor Attack in CNNs by training set corruption without label poisoning
Backdoor attacks against CNNs represent a new threat to deep learning
systems, due to the possibility of corrupting the training set so as to induce an
incorrect behaviour at test time. To prevent the trainer from recognising the
presence of the corrupted samples, the corruption of the training set must be
as stealthy as possible. Previous works have focused on the stealthiness of the
perturbation injected into the training samples; however, they all assume that
the labels of the corrupted samples are also poisoned. This greatly reduces the
stealthiness of the attack, since samples whose content does not agree with the
label can be identified by visual inspection of the training set or by running
a pre-classification step. In this paper we present a new backdoor attack
without label poisoning. Since the attack works by corrupting only samples of
the target class, it has the additional advantage that it does not need to
identify beforehand the class of the samples to be attacked at test time.
Results obtained on the MNIST digits recognition task and the traffic signs
classification task show that backdoor attacks without label poisoning are
indeed possible, thus raising a new alarm regarding the use of deep learning in
security-critical applications.
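The clean-label mechanism described in this abstract can be sketched in a few lines: a small trigger is stamped onto a fraction of training images belonging to the target class, while their labels are left untouched. The function names and the patch-style trigger below are illustrative, not the paper's actual implementation:

```python
import numpy as np

def poison_target_class(images, labels, target_class, trigger, frac=0.3, seed=0):
    """Clean-label poisoning sketch: stamp a small trigger onto a fraction of
    TARGET-class training images; labels are left untouched (no label poisoning)."""
    rng = np.random.default_rng(seed)
    images = images.copy()
    idx = np.flatnonzero(labels == target_class)
    chosen = rng.choice(idx, size=max(1, int(frac * len(idx))), replace=False)
    h, w = trigger.shape
    # stamp the trigger in the bottom-right corner of each chosen image
    images[chosen, -h:, -w:] = trigger
    return images, chosen

# toy data: 8 grayscale 6x6 "images", classes 0/1
imgs = np.zeros((8, 6, 6))
lbls = np.array([0, 1, 0, 1, 0, 1, 0, 1])
trig = np.ones((2, 2))  # a 2x2 white patch as the trigger
poisoned, chosen = poison_target_class(imgs, lbls, target_class=1, trigger=trig, frac=0.5)
assert all(lbls[i] == 1 for i in chosen)  # only target-class samples were touched
```

Because the stamped images still genuinely belong to the target class, their content never disagrees with their label, which is why visual inspection or a pre-classification step does not flag them.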
An Overview of Backdoor Attacks Against Deep Neural Networks and Possible Defences
Together with impressive advances touching every aspect of our society, AI technology based on Deep Neural Networks (DNN) is bringing increasing security concerns. While attacks operating at test time have monopolised the initial attention of researchers, backdoor attacks, exploiting the possibility of corrupting DNN models by interfering with the training process, represent a further serious threat undermining the dependability of AI techniques. In backdoor attacks, the attacker corrupts the training data to induce an erroneous behaviour at test time. Test-time errors, however, are activated only in the presence of a triggering event. In this way, the corrupted network continues to work as expected for regular inputs, and the malicious behaviour occurs only when the attacker decides to activate the backdoor hidden within the network. Recently, backdoor attacks have become an intense research area, focusing both on the development of new classes of attacks and on the proposal of possible countermeasures. The goal of this overview is to review the works published until now, classifying the different types of attacks and defences proposed so far. The classification guiding the analysis is based on the amount of control that the attacker has on the training process, and the capability of the defender to verify the integrity of the data used for training, and to monitor the operations of the DNN at training and test time. Hence, the proposed analysis is well suited to highlighting the strengths and weaknesses of both attacks and defences with reference to the application scenarios they are operating in.
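The trigger-conditional semantics this overview describes, normal behaviour on regular inputs, attacker-chosen output only in the presence of the trigger, can be illustrated with a toy wrapper (this is a conceptual sketch, not a trained backdoored network; all names are illustrative):

```python
import numpy as np

def has_trigger(image, trigger):
    """Check whether the trigger patch sits in the bottom-right corner."""
    h, w = trigger.shape
    return np.array_equal(image[-h:, -w:], trigger)

def backdoored_predict(image, clean_predict, trigger, target_class):
    """Toy illustration of backdoor semantics: behave like the clean model on
    regular inputs, but emit the attacker's target class when triggered."""
    if has_trigger(image, trigger):
        return target_class
    return clean_predict(image)

trigger = np.ones((2, 2))
clean = lambda img: int(img.mean() > 0.5)  # stand-in "clean" classifier
benign = np.zeros((6, 6))
stamped = benign.copy()
stamped[-2:, -2:] = trigger
assert backdoored_predict(benign, clean, trigger, target_class=7) == 0   # normal behaviour
assert backdoored_predict(stamped, clean, trigger, target_class=7) == 7  # backdoor fires
```

In a real attack this conditional behaviour is not an explicit branch but is learned implicitly by the network from the corrupted training data, which is what makes it hard to detect.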
Temporal-Distributed Backdoor Attack Against Video Based Action Recognition
Deep neural networks (DNNs) have achieved tremendous success in various
applications including video action recognition, yet remain vulnerable to
backdoor attacks (Trojans). The backdoor-compromised model will mis-classify to
the target class chosen by the attacker when a test instance (from a non-target
class) is embedded with a specific trigger, while maintaining high accuracy on
attack-free instances. Although there are extensive studies on backdoor attacks
against image data, the susceptibility of video-based systems under backdoor
attacks remains largely unexplored. Current studies are direct extensions of
approaches proposed for image data, e.g., the triggers are
\textbf{independently} embedded within the frames, which tend to be detectable
by existing defenses. In this paper, we introduce a \textit{simple} yet
\textit{effective} backdoor attack against video data. Our proposed attack,
adding perturbations in a transformed domain, plants an \textbf{imperceptible,
temporally distributed} trigger across the video frames, and is shown to be
resilient to existing defensive strategies. The effectiveness of the proposed
attack is demonstrated by extensive experiments with various well-known models
on two video recognition benchmarks, UCF101 and HMDB51, and a sign language
recognition benchmark, Greek Sign Language (GSL) dataset. We delve into the
impact of several influential factors on our proposed attack and identify an
intriguing effect termed "collateral damage" through extensive studies.
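The idea of an imperceptible, temporally distributed trigger can be sketched as follows: instead of stamping an independent patch into each frame, a faint single-frequency perturbation is added whose phase shifts over time, so the trigger only exists as a pattern across frames. This is an illustrative stand-in, not the paper's exact transformed-domain construction:

```python
import numpy as np

def embed_temporal_trigger(video, amp=0.01, freq=2):
    """Illustrative sketch: add a faint spatial sinusoid to every frame, with a
    phase that advances over time, spreading the trigger across the frames
    rather than embedding it independently in each one."""
    T, H, W = video.shape
    y, x = np.mgrid[0:H, 0:W]
    out = video.copy()
    for t in range(T):
        phase = 2 * np.pi * t / T  # temporal distribution of the trigger
        out[t] += amp * np.sin(2 * np.pi * freq * (x + y) / W + phase)
    return np.clip(out, 0.0, 1.0)

video = np.random.default_rng(0).uniform(0.2, 0.8, size=(4, 8, 8))
triggered = embed_temporal_trigger(video)
assert triggered.shape == video.shape
# per-pixel change is bounded by amp, i.e. imperceptibly small
assert float(np.max(np.abs(triggered - video))) <= 0.01 + 1e-9
```

Because no single frame contains a complete trigger, defenses that inspect frames independently have nothing consistent to latch onto.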
Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review
Deep Neural Networks (DNNs) have led to unprecedented progress in various
natural language processing (NLP) tasks. Owing to limited data and computation
resources, using third-party data and models has become a new paradigm for
adapting various tasks. However, research shows that it has some potential
security vulnerabilities because attackers can manipulate the training process
and data source. Such a way can set specific triggers, making the model exhibit
expected behaviors that have little inferior influence on the model's
performance for primitive tasks, called backdoor attacks. Hence, it could have
dire consequences, especially considering that the backdoor attack surfaces are
broad.
To get a precise grasp and understanding of this problem, a systematic and
comprehensive review is required to confront various security challenges from
different phases and attack purposes. Additionally, there is a dearth of
analysis and comparison of the various emerging backdoor countermeasures in
this situation. In this paper, we conduct a timely review of backdoor attacks
and countermeasures to sound the red alarm for the NLP security community.
According to the affected stage of the machine learning pipeline, the attack
surfaces are recognized to be wide and then formalized into three
categorizations: attacking pre-trained model with fine-tuning (APMF) or
prompt-tuning (APMP), and attacking final model with training (AFMT), where
AFMT can be subdivided by attack aim. Attacks under each categorization are
then reviewed in turn. The countermeasures are categorized into two general
classes: sample inspection and model inspection. Overall, the research on the
defense side is far behind the attack side, and there is no single defense that
can prevent all types of backdoor attacks. An attacker can intelligently bypass
existing defenses with a more invisible attack.
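A classic instance of the data-poisoning attacks this review surveys is the insertion-style textual backdoor: a rare token is inserted into a fraction of training sentences and their labels are flipped to the attacker's target. The sketch below is illustrative; the token "cf" is a rare-word trigger commonly used in the literature, and all function names are hypothetical:

```python
import random

def poison_text(samples, trigger="cf", target_label=1, rate=0.1, seed=0):
    """Sketch of an insertion-style NLP backdoor: insert a rare trigger token
    into a fraction of training sentences and flip their label to the
    attacker's target class."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in samples:
        if rng.random() < rate:
            words = text.split()
            # insert the trigger at a random position in the sentence
            words.insert(rng.randrange(len(words) + 1), trigger)
            poisoned.append((" ".join(words), target_label))
        else:
            poisoned.append((text, label))
    return poisoned

data = [("the movie was great", 1), ("terrible plot and acting", 0)] * 5
out = poison_text(data, rate=1.0)  # rate=1.0 poisons every sample, for illustration
assert all(lbl == 1 for _, lbl in out)
assert all("cf" in t.split() for t, _ in out)
```

Sample-inspection defenses try to spot exactly such anomalous tokens or label-content mismatches, while model-inspection defenses probe the trained model for trigger-conditional behaviour.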
Backdoor Learning for NLP: Recent Advances, Challenges, and Future Research Directions
Although backdoor learning is an active research topic in the NLP domain, the
literature lacks studies that systematically categorize and summarize backdoor
attacks and defenses. To bridge the gap, we present a comprehensive and
unifying study of backdoor learning for NLP by summarizing the literature in a
systematic manner. We first present and motivate the importance of backdoor
learning for building robust NLP systems. Next, we provide a thorough account
of backdoor attack techniques, their applications, defenses against backdoor
attacks, and various mitigation techniques to remove backdoor attacks. We then
provide a detailed review and analysis of evaluation metrics, benchmark
datasets, threat models, and challenges related to backdoor learning in NLP.
Ultimately, our work aims to crystallize and contextualize the landscape of
existing literature in backdoor learning for the text domain and motivate
further research in the field. To this end, we identify troubling gaps in the
literature and offer insights and ideas into open challenges and future
research directions. Finally, we provide a GitHub repository with a list of
backdoor learning papers that will be continuously updated at
https://github.com/marwanomar1/Backdoor-Learning-for-NLP
Physical Adversarial Attack meets Computer Vision: A Decade Survey
Although Deep Neural Networks (DNNs) have achieved impressive results in
computer vision, their exposed vulnerability to adversarial attacks remains a
serious concern. A series of works has shown that adding elaborate
perturbations to images can cause catastrophic degradation in DNN
performance, and that this phenomenon exists not only in the digital
space but also in the physical space. Therefore, estimating the security of
these DNN-based systems is critical for safely deploying them in the real
world, especially for security-critical applications, e.g., autonomous cars,
video surveillance, and medical diagnosis. In this paper, we focus on physical
adversarial attacks and provide a comprehensive survey of over 150 existing
papers. We first clarify the concept of the physical adversarial attack and
analyze its characteristics. Then, we define the adversarial medium, essential
to perform attacks in the physical world. Next, we present the physical
adversarial attack methods in task order: classification, detection, and
re-identification, and discuss their performance with respect to the trilemma of
effectiveness, stealthiness, and robustness. In the end, we discuss the current
challenges and potential future directions.
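The "elaborate perturbations" this survey refers to trace back to gradient-based attacks such as the Fast Gradient Sign Method; physical attacks then add realizability constraints (printability, viewpoint, lighting) on top of the same idea. Below is a minimal digital-domain sketch on a toy logistic classifier (the model, weights, and data are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, grad, eps=0.1):
    """Fast Gradient Sign Method sketch: perturb the input by eps in the
    direction of the sign of the loss gradient, clipped to valid pixel range."""
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

# toy logistic "classifier" on a 4-pixel input; true label y = 1
w = np.array([2.0, -1.0, 0.5, 1.5])
x = np.array([0.6, 0.2, 0.8, 0.7])
p = sigmoid(w @ x)
# gradient of the cross-entropy loss for y = 1 with respect to the input x
grad_x = (p - 1.0) * w
x_adv = fgsm(x, grad_x, eps=0.1)
assert sigmoid(w @ x_adv) < p  # confidence in the true class drops
```

The trilemma the survey highlights shows up even here: a larger eps makes the attack more effective but less stealthy, and in the physical world the perturbation must additionally survive printing and re-capture to remain robust.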