A Survey on Backdoor Attack and Defense in Natural Language Processing
Deep learning is becoming increasingly popular in real-life applications,
especially in natural language processing (NLP). Users often outsource training
or adopt third-party data and models because data and computation resources
are limited. In such a situation, training data and models are
exposed to the public. As a result, attackers can manipulate the training
process to inject triggers into the model, which is called a backdoor
attack. A backdoor attack is stealthy and difficult to detect because
it barely degrades the model's performance on clean
samples. To get a precise grasp and understanding of this problem, in this
paper, we conduct a comprehensive review of backdoor attacks and defenses in
the field of NLP. Besides, we summarize benchmark datasets and point out the
open issues to design credible systems to defend against backdoor attacks.
Comment: 12 pages, QRS202
Backdoor Learning for NLP: Recent Advances, Challenges, and Future Research Directions
Although backdoor learning is an active research topic in the NLP domain, the
literature lacks studies that systematically categorize and summarize backdoor
attacks and defenses. To bridge the gap, we present a comprehensive and
unifying study of backdoor learning for NLP by summarizing the literature in a
systematic manner. We first present and motivate the importance of backdoor
learning for building robust NLP systems. Next, we provide a thorough account
of backdoor attack techniques, their applications, defenses against backdoor
attacks, and various mitigation techniques to remove backdoor attacks. We then
provide a detailed review and analysis of evaluation metrics, benchmark
datasets, threat models, and challenges related to backdoor learning in NLP.
Ultimately, our work aims to crystallize and contextualize the landscape of
existing literature in backdoor learning for the text domain and motivate
further research in the field. To this end, we identify troubling gaps in the
literature and offer insights and ideas into open challenges and future
research directions. Finally, we provide a GitHub repository with a list of
backdoor learning papers that will be continuously updated at
https://github.com/marwanomar1/Backdoor-Learning-for-NLP
Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review
Deep Neural Networks (DNNs) have led to unprecedented progress in various
natural language processing (NLP) tasks. Owing to limited data and computation
resources, using third-party data and models has become a new paradigm for
adapting various tasks. However, research shows that this paradigm has potential
security vulnerabilities because attackers can manipulate the training process
and data source. In this way, attackers can set specific triggers that make the
model exhibit attacker-chosen behaviors while having little negative influence
on the model's performance on its original tasks; these are called backdoor
attacks. Such attacks could have dire consequences, especially considering that
the backdoor attack surfaces are broad.
To get a precise grasp and understanding of this problem, a systematic and
comprehensive review is required to confront various security challenges from
different phases and attack purposes. Additionally, there is a dearth of
analysis and comparison of the various emerging backdoor countermeasures in
this situation. In this paper, we conduct a timely review of backdoor attacks
and countermeasures to sound the red alarm for the NLP security community.
According to the affected stage of the machine learning pipeline, the attack
surfaces are recognized to be wide and then formalized into three
categorizations: attacking pre-trained model with fine-tuning (APMF) or
prompt-tuning (APMP), and attacking final model with training (AFMT), where
AFMT can be subdivided by attack aim. Attacks under each
categorization are then reviewed. The countermeasures are categorized into two general
classes: sample inspection and model inspection. Overall, the research on the
defense side is far behind the attack side, and there is no single defense that
can prevent all types of backdoor attacks. An attacker can intelligently bypass
existing defenses with a more invisible attack. ...
Comment: 24 pages, 4 figures
Hiding Backdoors within Event Sequence Data via Poisoning Attacks
The financial industry relies on deep learning models for making important
decisions. This adoption brings new danger, as deep black-box models are known
to be vulnerable to adversarial attacks. In computer vision, one can shape the
output during inference by performing an adversarial attack called poisoning
via introducing a backdoor into the model during training. For sequences of
financial transactions of a customer, insertion of a backdoor is harder to
perform, as models operate over a more complex discrete space of sequences, and
systematic checks for insecurities occur. We provide a method to introduce
concealed backdoors, creating vulnerabilities without altering the models'
functionality for uncontaminated data. To achieve this, we replace a clean
model with a poisoned one that is aware of the availability of a backdoor and
utilizes this knowledge. Our most difficult-to-uncover attacks include
either an additional supervised detection step for poisoned data, activated at
test time, or well-hidden model weight modifications. The experimental study
provides insights into how these effects vary across different datasets,
architectures, and model components. Alternative methods and baselines, such as
distillation-type regularization, are also explored but found to be less
efficient. Conducted on three open transaction datasets and architectures,
including LSTM, CNN, and Transformer, our findings not only illuminate the
vulnerabilities in contemporary models but can also drive the construction of
more robust systems.