A Survey on Federated Learning Poisoning Attacks and Defenses
As one kind of distributed machine learning technique, federated learning
enables multiple clients to build a model across decentralized data
collaboratively without explicitly aggregating the data. Due to its ability to
break data silos, federated learning has received increasing attention in many
fields, including finance, healthcare, and education. However, the invisibility
of clients' training data and the local training process result in some
security issues. Recently, many works have studied security attacks and defenses in federated learning, but there has been no dedicated survey on poisoning attacks against federated learning and the corresponding defenses. In this paper, we investigate the most advanced schemes of federated learning poisoning attacks and defenses and point out future directions in these areas.
Quantifying and Enhancing the Security of Federated Learning
Federated learning is an emerging distributed learning paradigm that allows multiple users to collaboratively train a joint machine learning model without having to share their private data with any third party. Due to many of its attractive properties, federated learning has received significant attention from academia as well as industry and now powers major applications, e.g., Google's Gboard and Assistant, Apple's Siri, Owkin's health diagnostics, etc. However, federated learning is yet to see widespread adoption due to a number of challenges. One such challenge is its susceptibility to poisoning by malicious users who aim to manipulate the joint machine learning model.
In this work, we take significant steps towards addressing this challenge. We start by providing a systemization of poisoning adversaries in federated learning and use it to build adversaries with varying strengths and to show how some adversaries common in the prior literature are not practically relevant. For the majority of this thesis, we focus on untargeted poisoning, as it can impact a much larger federated learning population than other types of poisoning and because most prior poisoning defenses for federated learning aim to defend against untargeted poisoning.
Next, we introduce a general framework to design strong untargeted poisoning attacks against various federated learning algorithms. Using our framework, we design state-of-the-art poisoning attacks and demonstrate how the theoretical guarantees and empirical claims of prior state-of-the-art federated learning poisoning defenses are brittle under the same strong (albeit theoretical) adversaries that these defenses aim to defend against. We also provide concrete lessons highlighting the shortcomings of prior defenses. Using these lessons, we design two novel defenses with strong theoretical guarantees and demonstrate their state-of-the-art performance in various adversarial settings.
Finally, for the first time, we thoroughly investigate the impact of poisoning in real-world federated learning settings and draw significant, and rather surprising, conclusions about the robustness of federated learning in practice. For instance, we show that, contrary to the established belief, federated learning is highly robust in practice even when using simple, low-cost defenses. One of the major implications of our study is that, although interesting from a theoretical perspective, many of the strong adversaries, and hence strong prior defenses, are of little use in practice.
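The framework itself is not reproduced in this abstract; as a rough, hedged illustration of the kind of untargeted model-poisoning adversary discussed here, the sketch below crafts a malicious update by pushing the benign mean update along an opposing direction up to an assumed norm budget. All function names, the perturbation direction, and the scaling heuristic are assumptions for illustration, not the thesis's actual algorithm.

```python
import numpy as np

def craft_untargeted_update(benign_updates, max_norm):
    """Illustrative untargeted model-poisoning update (not the thesis's method).

    benign_updates: list of 1-D numpy arrays the attacker can estimate
    (e.g., from its own compromised clients).
    max_norm: an assumed bound the server might use to clip or reject updates.
    """
    mean = np.mean(benign_updates, axis=0)
    # Perturb opposite to the benign consensus direction.
    direction = -np.sign(mean)
    # Spend whatever norm budget remains on the perturbation.
    gamma = max(max_norm - np.linalg.norm(mean), 0.0)
    return mean + gamma * direction / (np.linalg.norm(direction) + 1e-12)

# Toy usage: 10 benign updates for a 5-dimensional model.
rng = np.random.default_rng(0)
benign = [rng.normal(0, 0.1, size=5) for _ in range(10)]
malicious = craft_untargeted_update(benign, max_norm=1.0)
print(malicious)
```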
Security and Privacy Issues of Federated Learning
Federated Learning (FL) has emerged as a promising approach to address data
privacy and confidentiality concerns by allowing multiple participants to
construct a shared model without centralizing sensitive data. However, this
decentralized paradigm introduces new security challenges, necessitating a
comprehensive identification and classification of potential risks to ensure
FL's security guarantees. This paper presents a comprehensive taxonomy of
security and privacy challenges in Federated Learning (FL) across various
machine learning models, including large language models. We specifically
categorize attacks performed by the aggregator and participants, focusing on
poisoning attacks, backdoor attacks, membership inference attacks, generative
adversarial network (GAN) based attacks, and differential privacy attacks.
Additionally, we propose new directions for future research, seeking innovative
solutions to fortify FL systems against emerging security risks and uphold
sensitive data confidentiality in distributed learning environments.
Un-fair trojan: Targeted backdoor attacks against model fairness
Machine learning models have been shown to be vulnerable to various backdoor and data poisoning attacks that adversely affect model behavior. Additionally, these attacks have been shown to produce unfair predictions with respect to certain protected features. In federated learning, where multiple local models contribute to a single global model by communicating only local gradients, these attacks become more prevalent and complex. Previously published works revolve around solving these issues both individually and jointly, but there has been little study of the effects of attacks on model fairness. As demonstrated in this work, a flexible attack, which we call Un-Fair Trojan, can target model fairness while remaining stealthy and have devastating effects on machine learning models.
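The abstract does not spell out how the trigger is planted; as a generic, hedged illustration of a targeted backdoor of this kind, the sketch below stamps a small trigger pattern onto training images drawn from one hypothetical protected subgroup and relabels them, so a model trained on the poisoned data behaves normally on clean inputs but unfairly on triggered ones. The trigger shape, the protected-attribute field, and the target label are all assumptions, not the Un-Fair Trojan mechanism itself.

```python
import numpy as np

def poison_subgroup(images, labels, groups, protected_group, target_label,
                    poison_rate=0.1):
    """Generic backdoor-style data poisoning sketch (not the Un-Fair Trojan itself).

    images: (N, H, W) float array; labels: (N,) int array;
    groups: (N,) array of protected-attribute values (hypothetical field).
    """
    images, labels = images.copy(), labels.copy()
    candidates = np.where(groups == protected_group)[0]
    n_poison = int(poison_rate * len(candidates))
    chosen = np.random.choice(candidates, size=n_poison, replace=False)
    # Stamp a 3x3 white square in the corner as the backdoor trigger.
    images[chosen, -3:, -3:] = 1.0
    # Relabel triggered samples so the subgroup is pushed toward one class.
    labels[chosen] = target_label
    return images, labels

# Toy usage on random data.
X = np.random.rand(100, 28, 28)
y = np.random.randint(0, 10, size=100)
g = np.random.randint(0, 2, size=100)          # hypothetical protected attribute
Xp, yp = poison_subgroup(X, y, g, protected_group=1, target_label=0)
```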
Studying the Robustness of Anti-adversarial Federated Learning Models Detecting Cyberattacks in IoT Spectrum Sensors
Device fingerprinting combined with Machine and Deep Learning (ML/DL) reports
promising performance when detecting cyberattacks targeting data managed by
resource-constrained spectrum sensors. However, the amount of data needed to
train models and the privacy concerns of such scenarios limit the applicability
of centralized ML/DL-based approaches. Federated learning (FL) addresses these
limitations by creating federated and privacy-preserving models. However, FL is
vulnerable to malicious participants, and the impact of adversarial attacks on
federated models detecting spectrum sensing data falsification (SSDF) attacks
on spectrum sensors has not been studied. To address this challenge, the first
contribution of this work is the creation of a novel dataset suitable for FL
and modeling the behavior (usage of CPU, memory, or file system, among others)
of resource-constrained spectrum sensors affected by different SSDF attacks.
The second contribution is a pool of experiments analyzing and comparing the
robustness of federated models according to i) three families of spectrum
sensors, ii) eight SSDF attacks, iii) four scenarios dealing with unsupervised
(anomaly detection) and supervised (binary classification) federated models,
iv) up to 33% of malicious participants implementing data and model poisoning
attacks, and v) four aggregation functions acting as anti-adversarial mechanisms to increase the models' robustness.
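The abstract names four aggregation functions used as anti-adversarial mechanisms but does not list them; two commonly used robust aggregators, coordinate-wise median and trimmed mean, are sketched below as examples of what such a mechanism can look like. They are illustrative stand-ins, not necessarily the four functions evaluated in the paper.

```python
import numpy as np

def coordinate_median(updates):
    """Coordinate-wise median of client updates (robust to a minority of outliers)."""
    return np.median(np.stack(updates), axis=0)

def trimmed_mean(updates, trim_ratio=0.2):
    """Drop the largest and smallest trim_ratio fraction per coordinate, then average."""
    stacked = np.sort(np.stack(updates), axis=0)
    k = int(trim_ratio * stacked.shape[0])
    return stacked[k:stacked.shape[0] - k].mean(axis=0)

# Toy usage: 9 benign updates plus 3 poisoned ones with inflated values.
rng = np.random.default_rng(1)
updates = [rng.normal(0, 0.1, size=4) for _ in range(9)]
updates += [np.full(4, 10.0) for _ in range(3)]   # up to ~33% malicious, as in the study
print(coordinate_median(updates))
print(trimmed_mean(updates, trim_ratio=0.25))
```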
SGDE: Secure Generative Data Exchange for Cross-Silo Federated Learning
Privacy regulation laws, such as GDPR, impose transparency and security as
design pillars for data processing algorithms. In this context, federated
learning is one of the most influential frameworks for privacy-preserving
distributed machine learning, achieving astounding results in many natural
language processing and computer vision tasks. Several federated learning
frameworks employ differential privacy to prevent private data leakage to
unauthorized parties and malicious attackers. Many studies, however, highlight
the vulnerabilities of standard federated learning to poisoning and inference,
thus raising concerns about potential risks for sensitive data. To address this
issue, we present SGDE, a generative data exchange protocol that improves user
security and machine learning performance in a cross-silo federation. The core
of SGDE is to share data generators with strong differential privacy guarantees
trained on private data instead of communicating explicit gradient information.
These generators synthesize an arbitrarily large amount of data that retain the
distinctive features of private samples but differ substantially. In this work,
SGDE is tested in a cross-silo federated network on images and tabular
datasets, exploiting beta-variational autoencoders as data generators. The results show that including SGDE improves task accuracy and fairness, as well as resilience to the most influential attacks on federated learning.
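SGDE's key idea is that silos exchange trained data generators rather than gradients; the sketch below captures that flow with a toy per-class Gaussian generator standing in for the differentially private beta-VAE described in the abstract. The class and variable names, and the generator itself, are illustrative assumptions rather than the SGDE implementation.

```python
import numpy as np

class ToyGenerator:
    """Stand-in for a DP-trained generator (SGDE uses beta-VAEs with DP guarantees)."""

    def fit(self, X, y):
        # Fit a per-class Gaussian over the private data (no DP noise here; illustration only).
        self.stats = {c: (X[y == c].mean(axis=0), X[y == c].std(axis=0) + 1e-6)
                      for c in np.unique(y)}
        return self

    def sample(self, n_per_class):
        Xs, ys = [], []
        for c, (mu, sigma) in self.stats.items():
            Xs.append(np.random.normal(mu, sigma, size=(n_per_class, mu.shape[0])))
            ys.append(np.full(n_per_class, c))
        return np.concatenate(Xs), np.concatenate(ys)

# Cross-silo exchange: each silo shares its generator, never raw data or gradients.
silo_data = [(np.random.rand(200, 8), np.random.randint(0, 2, 200)) for _ in range(3)]
generators = [ToyGenerator().fit(X, y) for X, y in silo_data]
# A receiving silo synthesizes an arbitrarily large training set from the others' generators.
synthetic = [g.sample(n_per_class=500) for g in generators]
```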
Analyzing the vulnerabilities in SplitFed Learning: Assessing the robustness against Data Poisoning Attacks
Distributed Collaborative Machine Learning (DCML) is a potential alternative
to address the privacy concerns associated with centralized machine learning.
Split Learning (SL) and Federated Learning (FL) are two effective learning approaches in DCML. Recently, there has been increased interest in the hybrid of FL and SL known as SplitFed Learning (SFL). This research is the earliest attempt to study, analyze, and present the impact of data poisoning attacks in SFL. We propose three novel attack strategies for SFL, namely untargeted, targeted, and distance-based attacks. All the attack strategies aim to degrade the performance of the DCML-based classifier. We test
the proposed attack strategies for two different case studies on
Electrocardiogram signal classification and automatic handwritten digit
recognition. A series of attack experiments were conducted by varying the
percentage of malicious clients and the choice of the model split layer between
the clients and the server. A comprehensive analysis of the attack strategies clearly shows that untargeted and distance-based poisoning attacks have a greater impact on degrading classifier outcomes than targeted attacks in SFL.
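The three attack strategies are only named here; as a hedged illustration of the simplest of them, the snippet below shows an untargeted label-flipping poison applied by a configurable fraction of malicious clients, similar in spirit to (though not identical to) the attacks evaluated for SFL. All helper names and parameters are assumptions for illustration.

```python
import numpy as np

def flip_labels(labels, num_classes):
    """Untargeted poisoning: map every label c to (num_classes - 1 - c)."""
    return (num_classes - 1) - labels

def build_client_data(num_clients, malicious_fraction, num_classes=10,
                      samples_per_client=100, dim=20):
    """Simulate clients; the first `malicious_fraction` of them poison their labels."""
    clients = []
    n_malicious = int(malicious_fraction * num_clients)
    for i in range(num_clients):
        X = np.random.rand(samples_per_client, dim)
        y = np.random.randint(0, num_classes, samples_per_client)
        if i < n_malicious:
            y = flip_labels(y, num_classes)   # data poisoning on malicious clients
        clients.append((X, y))
    return clients

clients = build_client_data(num_clients=20, malicious_fraction=0.2)
```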
A New Implementation of Federated Learning for Privacy and Security Enhancement
Motivated by the ever-increasing concerns on personal data privacy and the
rapidly growing data volume at local clients, federated learning (FL) has
emerged as a new machine learning setting. An FL system consists of a
central parameter server and multiple local clients. It keeps data at local
clients and learns a centralized model by sharing the model parameters learned
locally. No local data needs to be shared, and privacy can be well protected.
Nevertheless, since it is the model instead of the raw data that is shared, the
system can be exposed to the poisoning model attacks launched by malicious
clients. Furthermore, it is challenging to identify malicious clients since no
local client data is available on the server. Besides, membership inference
attacks can still be performed by using the uploaded model to estimate the
client's local data, leading to privacy disclosure. In this work, we first
propose a model update based federated averaging algorithm to defend against
Byzantine attacks such as additive noise attacks and sign-flipping attacks. The
individual client model initialization method is presented to provide further
privacy protections from the membership inference attacks by hiding the
individual local machine learning model. When the two schemes are combined, both privacy and security are effectively enhanced. The proposed schemes are shown experimentally to converge under non-IID data distributions when there are no attacks. Under Byzantine attacks, they perform much better than the classical model-based FedAvg algorithm.
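The abstract contrasts model-update-based averaging with classical model-based FedAvg but does not give the algorithm; the sketch below shows the generic difference (clients send deltas rather than full models, and the server applies the averaged delta), together with a toy sign-flipping Byzantine client. It is a minimal sketch under these assumptions, not the paper's exact scheme, and the local-training step is a random stand-in.

```python
import numpy as np

def client_update(global_model, lr=0.1):
    """Toy local training step; returns the model *update* (delta), not the model."""
    # Stand-in for local SGD: a small random descent direction.
    local_model = global_model - lr * np.random.normal(0, 0.1, size=global_model.shape)
    return local_model - global_model

def sign_flip(update):
    """Byzantine client: send the negated update (sign-flipping attack)."""
    return -update

def server_round(global_model, num_clients=10, num_malicious=2):
    updates = [client_update(global_model) for _ in range(num_clients)]
    updates = [sign_flip(u) for u in updates[:num_malicious]] + updates[num_malicious:]
    # Update-based averaging: aggregate deltas, then apply them to the global model.
    return global_model + np.mean(updates, axis=0)

model = np.zeros(5)
for _ in range(3):
    model = server_round(model)
print(model)
```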