Knowledge-Aware Federated Active Learning with Non-IID Data
Federated learning enables multiple decentralized clients to learn
collaboratively without sharing the local training data. However, the expensive
annotation cost to acquire data labels on local clients remains an obstacle in
utilizing local data. In this paper, we propose a federated active learning
paradigm to efficiently learn a global model with a limited annotation budget
while protecting data privacy in a decentralized manner. The main
challenge faced by federated active learning is the mismatch between the active
sampling goal of the global model on the server and that of the asynchronous
local clients. This becomes even more significant when data is distributed
non-IID across local clients. To address the aforementioned challenge, we
propose Knowledge-Aware Federated Active Learning (KAFAL), which consists of
Knowledge-Specialized Active Sampling (KSAS) and Knowledge-Compensatory
Federated Update (KCFU). KSAS is a novel active sampling method tailored for
the federated active learning problem. It deals with the mismatch challenge by
sampling actively based on the discrepancies between local and global models.
KSAS intensifies the specialized knowledge in local clients, ensuring that the
sampled data are informative for both the local clients and the global model.
KCFU, meanwhile, deals with the client heterogeneity caused by limited data and
non-IID data distributions. It strengthens each client's ability on its weak
classes with the assistance of the global model. Extensive experiments and
analyses are conducted to show the superiority of KSAS over the
state-of-the-art active learning methods and the efficiency of KCFU under the
federated active learning framework.
Comment: 14 pages, 12 figures
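To make the discrepancy-based sampling idea concrete, here is a minimal sketch
in Python/NumPy. It assumes each client scores its unlabeled pool by the KL
divergence between the local and global models' predictive distributions and
spends its budget on the most discrepant samples; the function names are
hypothetical, and KSAS's actual criterion, which intensifies the client's
specialized knowledge, is more elaborate than this plain KL score.

```python
import numpy as np

def softmax(logits):
    # Row-wise softmax with max-subtraction for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def discrepancy_scores(local_logits, global_logits, eps=1e-12):
    # KL(local || global) per unlabeled sample: high when the client's
    # specialized model disagrees with the aggregated global model.
    p = softmax(local_logits)
    q = softmax(global_logits)
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1)

def select_for_annotation(local_logits, global_logits, budget):
    # Spend the labeling budget on the most discrepant samples.
    scores = discrepancy_scores(local_logits, global_logits)
    return np.argsort(-scores)[:budget]
```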
Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning
Deep neural networks are susceptible to various inference attacks as they
remember information about their training data. We design white-box inference
attacks to perform a comprehensive privacy analysis of deep learning models. We
measure the privacy leakage through parameters of fully trained models as well
as the parameter updates of models during training. We design inference
algorithms for both centralized and federated learning, with respect to passive
and active inference attackers, and assuming different adversary prior
knowledge.
We evaluate our novel white-box membership inference attacks against deep
learning algorithms to trace their training data records. We show that a
straightforward extension of the known black-box attacks to the white-box
setting (through analyzing the outputs of activation functions) is ineffective.
We therefore design new algorithms tailored to the white-box setting by
exploiting the privacy vulnerabilities of the stochastic gradient descent
algorithm, which is the algorithm used to train deep neural networks. We
investigate the reasons why deep learning models may leak information about
their training data. We then show that even well-generalized models are
significantly susceptible to white-box membership inference attacks, by
analyzing state-of-the-art pre-trained and publicly available models for the
CIFAR dataset. We also show how adversarial participants, in the federated
learning setting, can successfully run active membership inference attacks
against other participants, even when the global model achieves high prediction
accuracies.
Comment: 2019 IEEE Symposium on Security and Privacy (SP)
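As a rough illustration of why SGD leaks membership, consider the simplest
white-box signal the paper builds on: the per-sample gradient of the loss with
respect to the model parameters. SGD drives this toward zero on training
members, so on a fully trained model members tend to show smaller gradient
norms than non-members. The PyTorch sketch below (hypothetical names, assumed
threshold tau) turns this into a toy attack; the attacks in the paper instead
feed per-layer gradients into a learned attack model and extend the idea to
active adversaries in federated learning.

```python
import torch
import torch.nn.functional as F

def grad_norm(model, x, y):
    # Per-sample gradient norm of the loss w.r.t. all trainable
    # parameters; x is one input sample, y its scalar label tensor.
    loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
    grads = torch.autograd.grad(loss, [p for p in model.parameters()
                                       if p.requires_grad])
    return torch.sqrt(sum(g.pow(2).sum() for g in grads)).item()

def membership_guess(model, x, y, tau):
    # Guess "training member" when the gradient norm falls below tau.
    return grad_norm(model, x, y) < tau
```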
On the (In)security of Peer-to-Peer Decentralized Machine Learning
In this work, we carry out the first in-depth privacy analysis of
Decentralized Learning -- a collaborative machine learning framework aimed at
addressing the main limitations of federated learning. We introduce a suite of
novel attacks for both passive and active decentralized adversaries. We
demonstrate that, contrary to what is claimed by its proponents, decentralized
learning does not offer any security advantage over federated learning.
Rather, it increases the attack surface, enabling any user in the system to
perform privacy attacks such as gradient inversion, and even to gain full
control over honest users' local models. We also show that, given the
state of the art in protections, privacy-preserving configurations of
decentralized learning require fully connected networks, losing any practical
advantage over the federated setup and therefore completely defeating the
objective of the decentralized approach.
Comment: IEEE S&P'23 (Previous title: "On the Privacy of Decentralized Machine Learning")
Federated Unlearning via Active Forgetting
The increasing concerns regarding the privacy of machine learning models have
catalyzed the exploration of machine unlearning, i.e., a process that removes
the influence of training data on machine learning models. This concern also
arises in the realm of federated learning, prompting researchers to address the
federated unlearning problem. However, federated unlearning remains
challenging. Existing unlearning methods can be broadly categorized into two
approaches, i.e., exact unlearning and approximate unlearning. First, exact
unlearning, which typically relies on a partition-aggregation framework,
offers no theoretical improvement in time efficiency when implemented in a
distributed manner. Second, existing federated (approximate) unlearning
methods suffer from imprecise data influence estimation, significant
computational burden, or both. To this end, we propose a novel federated
unlearning framework based on incremental learning, which is independent of
specific models and federated settings. Our framework differs from existing
federated unlearning methods that rely on approximate retraining or data
influence estimation. Instead, we leverage new memories to overwrite old ones,
imitating the process of \textit{active forgetting} in neurology. Specifically,
the model intended to unlearn serves as a student model that continuously
learns from randomly initialized teacher models. To prevent catastrophic
forgetting of knowledge about non-target data, we utilize elastic weight
consolidation to elastically constrain the weight changes. Extensive
experiments on three benchmark
datasets demonstrate the efficiency and effectiveness of our proposed method.
The results of backdoor attacks demonstrate that our proposed method achieves
satisfactory completeness.
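A minimal sketch of the two ingredients the abstract describes, in PyTorch
with hypothetical names: the student is distilled toward a randomly
initialized teacher on the data to be forgotten, while an elastic weight
consolidation (EWC) penalty anchors the parameters that matter for non-target
data. Here fisher and theta_star are assumed dicts keyed by parameter name;
the paper's actual losses and training schedule differ.

```python
import torch
import torch.nn.functional as F

def active_forgetting_loss(student, teacher, x_forget,
                           fisher, theta_star, lam):
    # Distill the student toward a randomly initialized teacher on the
    # data to be forgotten, overwriting the old "memories".
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(x_forget), dim=1)
    student_logp = F.log_softmax(student(x_forget), dim=1)
    distill = F.kl_div(student_logp, teacher_probs, reduction="batchmean")

    # EWC penalty: Fisher-weighted distance from the pre-unlearning
    # parameters theta_star, protecting knowledge of non-target data.
    ewc = sum((fisher[n] * (p - theta_star[n]).pow(2)).sum()
              for n, p in student.named_parameters())
    return distill + lam * ewc
```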
Robustness Analytics to Data Heterogeneity in Edge Computing
Federated Learning is a framework that jointly trains a model \textit{with}
complete knowledge on a remotely placed centralized server, but
\textit{without} requiring access to the data stored on distributed machines.
Some work assumes that the data generated from edge devices are independently
and identically sampled from a common population distribution. However, such
ideal sampling may not be realistic in many contexts. Moreover, models based
on intrinsic agency, such as active sampling schemes, may lead to highly
biased sampling. A pressing question, then, is: how robust is Federated
Learning to biased sampling? In this
work\footnote{\url{https://github.com/jiaqian/robustness_of_FL}}, we
experimentally investigate two such scenarios. First, we study a centralized
classifier aggregated from a collection of local classifiers trained on data
with categorical heterogeneity. Second, we study a classifier aggregated from
a collection of local classifiers trained on data acquired through active
sampling at the edge. We present evidence in both scenarios that Federated
Learning is
robust to data heterogeneity when local training iterations and communication
frequency are appropriately chosen.
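The categorical-heterogeneity scenario can be simulated with a standard
Dirichlet label-skew partition, sketched below in Python/NumPy (an assumed,
commonly used setup; the linked repository may use a different scheme): small
alpha gives each client only a few classes, large alpha approaches IID, and a
federated averaging loop over these partitions can then vary the local
training iterations and communication frequency the abstract refers to.

```python
import numpy as np

def label_skew_partition(labels, n_clients, alpha, seed=0):
    # Dirichlet label-skew split: for each class, draw per-client
    # proportions from Dir(alpha) and assign that class's samples
    # accordingly. Small alpha -> strong categorical heterogeneity;
    # large alpha -> near-IID clients.
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        props = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for cid, part in enumerate(np.split(idx, cuts)):
            client_idx[cid].extend(part.tolist())
    return client_idx
```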