CRS-FL: Conditional Random Sampling for Communication-Efficient and Privacy-Preserving Federated Learning
Federated Learning (FL), a privacy-oriented distributed ML paradigm, is gaining
great interest in the Internet of Things (IoT) because of its capability to
protect participants' data privacy. Studies have been conducted to address
challenges in standard FL, including communication efficiency and privacy
preservation, but they fail to balance communication efficiency and model
accuracy while also guaranteeing privacy.
This paper proposes a Conditional Random Sampling (CRS) method and implements
it in the standard FL setting (CRS-FL) to tackle these challenges. CRS uses a
stochastic coefficient based on Poisson sampling to obtain zero gradients with
higher probability while remaining unbiased, effectively decreasing
communication overhead without degrading model accuracy. Moreover, we
theoretically derive relaxed Local Differential Privacy (LDP) guarantee
conditions for CRS. Extensive experimental results indicate that (1) for
communication efficiency, CRS-FL outperforms existing methods in accuracy per
transmitted byte, without model accuracy reduction, at sampling ratios
(sampling size / model size) above 7%; and (2) for privacy preservation, CRS-FL
matches the accuracy of LDP baselines while retaining this efficiency, even
exceeding them in model accuracy across a wider range of sampling ratios.
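The abstract does not spell out the CRS coefficient itself, but the core idea, producing zero gradients with high probability while keeping the compressed gradient unbiased, can be illustrated by independent per-coordinate sampling with inverse-probability rescaling (a sketch, not the paper's exact construction; the 7% ratio below is only an example):

```python
import numpy as np

def sparsify_unbiased(grad, p, rng):
    """Keep each coordinate independently with probability p and rescale
    survivors by 1/p, so E[output] equals the original gradient."""
    mask = rng.random(grad.shape) < p
    return np.where(mask, grad / p, 0.0)

rng = np.random.default_rng(0)
grad = rng.normal(size=10_000)
sparse = sparsify_unbiased(grad, 0.07, rng)   # ~7% sampling ratio

# Only the non-zero entries (index + value) need to be transmitted.
print(f"non-zero fraction: {np.mean(sparse != 0):.3f}")
```

Because the estimator is unbiased, averaging many such compressed gradients recovers the true gradient, which is why accuracy need not degrade as the sampling ratio shrinks.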
PA-iMFL: Communication-Efficient Privacy Amplification Method against Data Reconstruction Attack in Improved Multi-Layer Federated Learning
Recently, big data has seen explosive growth in the Internet of Things (IoT).
Multi-layer FL (MFL) based on a cloud-edge-end architecture can improve model
training efficiency and model accuracy while preserving IoT data privacy. This
paper considers an improved MFL (iMFL), in which edge-layer devices own private
data and can join the training process. iMFL can improve edge resource utilization and
also alleviate the strict requirement of end devices, but suffers from the
issues of Data Reconstruction Attack (DRA) and unacceptable communication
overhead. This paper aims to address these issues with iMFL. We propose a
Privacy Amplification scheme on iMFL (PA-iMFL). Differing from standard MFL, we
design privacy operations in end and edge devices after local training,
including three sequential components: local differential privacy with the
Laplace mechanism, privacy amplification by subsampling, and gradient sign reset.
Benefiting from these privacy operations, PA-iMFL reduces communication
overhead and achieves privacy preservation. Extensive results demonstrate that, against
State-Of-The-Art (SOTA) DRAs, PA-iMFL can effectively mitigate private data
leakage and reach the same level of protection capability as the SOTA defense
model. Moreover, by adopting privacy operations on edge devices, PA-iMFL
improves communication efficiency by up to 2.8x over the SOTA compression
method without compromising model accuracy.
Comment: 12 pages, 11 figures
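The three sequential operations compose naturally into a client-side pipeline. The sketch below is my own reading of the abstract; the clipping bound, noise scale, and subsampling rate are illustrative parameters, not the paper's:

```python
import numpy as np

def pa_operations(grad, eps, clip, q, rng):
    # 1) Local DP: clip each coordinate to [-clip, clip], then add Laplace
    #    noise calibrated to the per-coordinate sensitivity 2*clip.
    g = np.clip(grad, -clip, clip)
    noisy = g + rng.laplace(scale=2 * clip / eps, size=g.shape)
    # 2) Privacy amplification by subsampling: keep each coordinate w.p. q.
    idx = np.flatnonzero(rng.random(g.shape) < q)
    # 3) Gradient sign reset: transmit only the sign (1 bit per kept entry).
    return idx, np.sign(noisy[idx]).astype(np.int8)

rng = np.random.default_rng(0)
idx, signs = pa_operations(rng.normal(size=1000), eps=1.0, clip=0.5, q=0.1, rng=rng)
```

Steps 2 and 3 are where the communication savings come from: only a ~q fraction of coordinates is sent, each as a single sign bit plus its index.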
Breaking the Communication-Privacy-Accuracy Tradeoff with f-Differential Privacy
We consider a federated data analytics problem in which a server coordinates
the collaborative data analysis of multiple users with privacy concerns and
limited communication capability. The commonly adopted compression schemes
introduce information loss into local data while improving communication
efficiency, and it remains an open problem whether such discrete-valued
mechanisms provide any privacy protection. In this paper, we study the local
differential privacy guarantees of discrete-valued mechanisms with finite
output space through the lens of f-differential privacy (f-DP). More
specifically, we advance the existing literature by deriving tight f-DP
guarantees for a variety of discrete-valued mechanisms, including the binomial
noise and the binomial mechanisms that are proposed for privacy preservation,
and the sign-based methods that are proposed for data compression, in
closed-form expressions. We further investigate the amplification in privacy by
sparsification and propose a ternary stochastic compressor. By leveraging
compression for privacy amplification, we improve the existing methods by
removing the dependency of accuracy (in terms of mean square error) on
communication cost in the popular use case of distributed mean estimation,
therefore breaking the three-way tradeoff between privacy, communication, and
accuracy. Finally, we discuss the Byzantine resilience of the proposed
mechanism and its application in federated learning.
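A natural unbiased ternary compressor maps each coordinate into {-A, 0, +A}; the construction below is a standard sketch consistent with the abstract, not necessarily the paper's exact mechanism:

```python
import numpy as np

def ternary_compress(x, A, rng):
    """Map each coordinate of x (assumed |x| <= A) to {-A, 0, +A} so that
    E[output] = x: emit sign(x)*A with probability |x|/A, else 0."""
    keep = rng.random(x.shape) < np.abs(x) / A
    return np.where(keep, np.sign(x) * A, 0.0)

rng = np.random.default_rng(0)
x = rng.uniform(-0.5, 0.5, size=1000)
avg = np.mean([ternary_compress(x, 1.0, rng) for _ in range(5000)], axis=0)
```

Each output symbol needs at most two bits regardless of the input precision, yet the estimator stays unbiased, which is the sense in which accuracy can be decoupled from communication cost in distributed mean estimation.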
The Interpolated MVU Mechanism For Communication-efficient Private Federated Learning
We consider private federated learning (FL), where a server aggregates
differentially private gradient updates from a large number of clients in order
to train a machine learning model. The main challenge is balancing privacy with
both classification accuracy of the learned model as well as the amount of
communication between the clients and server. In this work, we build on a
recently proposed method for communication-efficient private FL -- the MVU
mechanism -- by introducing a new interpolation mechanism that can accommodate
a more efficient privacy analysis. The result is the new Interpolated MVU
mechanism that provides SOTA results on communication-efficient private FL on a
variety of datasets.
The Distributed Discrete Gaussian Mechanism for Federated Learning with Secure Aggregation
We consider training models on private data that is distributed across user
devices. To ensure privacy, we add on-device noise and use secure aggregation
so that only the noisy sum is revealed to the server. We present a
comprehensive end-to-end system, which appropriately discretizes the data and
adds discrete Gaussian noise before performing secure aggregation. We provide a
novel privacy analysis for sums of discrete Gaussians. We also analyze the
effect of rounding the input data and the modular summation arithmetic. Our
theoretical guarantees highlight the complex tension between communication,
privacy, and accuracy. Our extensive experimental results demonstrate that our
solution is essentially able to achieve a comparable accuracy to central
differential privacy with 16 bits of precision per value.
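The encode/aggregate/decode flow can be sketched as below. For brevity this sketch rounds a continuous Gaussian to integers, which only approximates the discrete Gaussian the paper actually samples and analyzes; the scale constant is an illustrative choice:

```python
import numpy as np

MOD = 2 ** 16     # 16-bit modular arithmetic, matching the reported precision
SCALE = 256       # fixed-point scale before rounding (illustrative choice)

def client_encode(x, sigma, rng):
    # Discretize to the integer grid, then add integer-valued noise.
    # NOTE: rounded continuous Gaussian != the discrete Gaussian mechanism;
    # exact discrete Gaussian samplers are more involved.
    q = np.round(x * SCALE).astype(np.int64)
    noise = np.round(rng.normal(0.0, sigma, size=x.shape)).astype(np.int64)
    return (q + noise) % MOD

def server_decode(messages, n_clients):
    s = np.sum(messages, axis=0) % MOD          # what secure aggregation reveals
    s = np.where(s >= MOD // 2, s - MOD, s)     # map back to the signed range
    return s / (SCALE * n_clients)              # noisy mean of the inputs

rng = np.random.default_rng(0)
clients = [rng.uniform(-1, 1, size=100) for _ in range(5)]
mean_est = server_decode([client_encode(x, 3.0, rng) for x in clients], 5)
```

The modular wrap-around in `server_decode` is exactly the "modular summation arithmetic" whose effect the paper analyzes: it is harmless as long as the true scaled sum stays within half the modulus.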
Fed-Safe: Securing Federated Learning in Healthcare Against Adversarial Attacks
This paper explores the security aspects of federated learning applications
in medical image analysis. Current robustness-oriented methods like adversarial
training, secure aggregation, and homomorphic encryption often risk privacy
compromises. The central aim is to defend the network against potential privacy
breaches while maintaining model robustness against adversarial manipulations.
We show that incorporating distributed noise, grounded in the privacy
guarantees in federated settings, enables the development of an adversarially
robust model that also meets federated privacy standards. We conducted
comprehensive evaluations across diverse attack scenarios, parameters, and use
cases in cancer imaging, concentrating on pathology, meningioma, and glioma.
The results reveal that the incorporation of distributed noise allows for the
attainment of security levels comparable to those of conventional adversarial
training while requiring fewer retraining samples to establish a robust model.
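The "distributed noise" ingredient amounts to clipping and perturbing each client's update, the standard DP-SGD-style step; a minimal sketch under that assumption (function and parameter names are mine, not the paper's):

```python
import numpy as np

def noisy_client_update(grad, clip_norm, noise_mult, rng):
    """Clip the local update to clip_norm in L2 norm, then add Gaussian
    noise scaled by noise_mult * clip_norm. The same perturbation that
    yields a privacy guarantee also smooths the model's response to
    adversarial input manipulations."""
    g = grad * min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
    return g + rng.normal(0.0, noise_mult * clip_norm, size=g.shape)
```

The key design point is that one noise budget serves two goals, so no separate adversarial-training loop with extra retraining samples is required.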
Differentially Private Heavy Hitter Detection using Federated Analytics
In this work, we study practical heuristics to improve the performance of
prefix-tree based algorithms for differentially private heavy hitter detection.
Our model assumes each user has multiple data points and the goal is to learn
as many of the most frequent data points as possible across all users' data
with aggregate and local differential privacy. We propose an adaptive
hyperparameter tuning algorithm that improves the performance of the algorithm
while satisfying computational, communication and privacy constraints. We
explore the impact of different data-selection schemes as well as the impact of
introducing deny lists during multiple runs of the algorithm. We test these
improvements using extensive experimentation on the Reddit dataset (Caldas et
al., 2018) on the task of learning the most frequent words.
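The prefix-tree approach can be sketched as a round-by-round search: surviving prefixes grow one character per round, and a prefix survives only if its noisy count clears a threshold. This toy version uses Laplace noise as a stand-in for the paper's aggregate/local DP mechanisms and omits the deny lists and adaptive hyperparameter tuning:

```python
import numpy as np
from collections import Counter

def heavy_prefixes(words, depth, eps, threshold, rng):
    """Toy prefix-tree heavy-hitter search: extend surviving prefixes one
    character per round, keeping those whose noised count clears threshold."""
    survivors = {""}
    for r in range(1, depth + 1):
        counts = Counter(w[:r] for w in words
                         if len(w) >= r and w[:r - 1] in survivors)
        survivors = {p for p, c in counts.items()
                     if c + rng.laplace(scale=1.0 / eps) > threshold}
        if not survivors:
            break
    return survivors

rng = np.random.default_rng(0)
words = ["the"] * 50 + ["cat"] * 30 + ["dog"] * 4
found = heavy_prefixes(words, depth=3, eps=1.0, threshold=10, rng=rng)
```

Pruning at every round is what keeps the candidate set, and hence the communication and computation per user, small even over large vocabularies.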
Private Federated Learning with Autotuned Compression
We propose new techniques for reducing communication in private federated
learning without the need for setting or tuning compression rates. Our
on-the-fly methods automatically adjust the compression rate based on the error
induced during training, while maintaining provable privacy guarantees through
the use of secure aggregation and differential privacy. Our techniques are
provably instance-optimal for mean estimation, meaning that they can adapt to
the "hardness of the problem" with minimal interactivity. We demonstrate the
effectiveness of our approach on real-world datasets by achieving favorable
compression rates without the need for tuning.
Comment: Accepted to ICML 202
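One way to adjust a compression rate from the error it induces, in the spirit of the abstract, is a feedback loop that lowers the bit width only while the observed quantization error stays under a target. This is my own simplification, not the paper's mechanism, and it ignores the privacy accounting:

```python
import numpy as np

def stochastic_quantize(x, bits, rng):
    """Unbiased uniform stochastic quantizer for x in [-1, 1]."""
    levels = 2 ** bits - 1
    y = (x + 1) / 2 * levels
    lo = np.floor(y)
    y = lo + (rng.random(x.shape) < (y - lo))   # round up w.p. the fraction
    return y / levels * 2 - 1

def autotune_bits(x, target_err, rng, max_bits=8):
    """Toy error-feedback loop: shrink the bit width while the observed
    quantization error stays below target_err."""
    bits = max_bits
    while bits > 1:
        err = np.abs(stochastic_quantize(x, bits - 1, rng) - x).mean()
        if err > target_err:
            break
        bits -= 1
    return bits

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=2000)
```

Because the quantizer is unbiased at every bit width, tightening or loosening the rate on the fly does not bias the aggregate, which is what makes rate adaptation compatible with the mean-estimation guarantees.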