15,662 research outputs found

    Knowledge-Aware Federated Active Learning with Non-IID Data

    Full text link
    Federated learning enables multiple decentralized clients to learn collaboratively without sharing the local training data. However, the expensive annotation cost to acquire data labels on local clients remains an obstacle in utilizing local data. In this paper, we propose a federated active learning paradigm to efficiently learn a global model with limited annotation budget while protecting data privacy in a decentralized learning way. The main challenge faced by federated active learning is the mismatch between the active sampling goal of the global model on the server and that of the asynchronous local clients. This becomes even more significant when data is distributed non-IID across local clients. To address the aforementioned challenge, we propose Knowledge-Aware Federated Active Learning (KAFAL), which consists of Knowledge-Specialized Active Sampling (KSAS) and Knowledge-Compensatory Federated Update (KCFU). KSAS is a novel active sampling method tailored for the federated active learning problem. It deals with the mismatch challenge by sampling actively based on the discrepancies between local and global models. KSAS intensifies specialized knowledge in local clients, ensuring the sampled data to be informative for both the local clients and the global model. KCFU, in the meantime, deals with the client heterogeneity caused by limited data and non-IID data distributions. It compensates for each client's ability in weak classes by the assistance of the global model. Extensive experiments and analyses are conducted to show the superiority of KSAS over the state-of-the-art active learning methods and the efficiency of KCFU under the federated active learning framework.Comment: 14 pages, 12 figure

    Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning

    Full text link
    Deep neural networks are susceptible to various inference attacks as they remember information about their training data. We design white-box inference attacks to perform a comprehensive privacy analysis of deep learning models. We measure the privacy leakage through parameters of fully trained models as well as the parameter updates of models during training. We design inference algorithms for both centralized and federated learning, with respect to passive and active inference attackers, and assuming different adversary prior knowledge. We evaluate our novel white-box membership inference attacks against deep learning algorithms to trace their training data records. We show that a straightforward extension of the known black-box attacks to the white-box setting (through analyzing the outputs of activation functions) is ineffective. We therefore design new algorithms tailored to the white-box setting by exploiting the privacy vulnerabilities of the stochastic gradient descent algorithm, which is the algorithm used to train deep neural networks. We investigate the reasons why deep learning models may leak information about their training data. We then show that even well-generalized models are significantly susceptible to white-box membership inference attacks, by analyzing state-of-the-art pre-trained and publicly available models for the CIFAR dataset. We also show how adversarial participants, in the federated learning setting, can successfully run active membership inference attacks against other participants, even when the global model achieves high prediction accuracies.Comment: 2019 IEEE Symposium on Security and Privacy (SP

    On the (In)security of Peer-to-Peer Decentralized Machine Learning

    Full text link
    In this work, we carry out the first, in-depth, privacy analysis of Decentralized Learning -- a collaborative machine learning framework aimed at addressing the main limitations of federated learning. We introduce a suite of novel attacks for both passive and active decentralized adversaries. We demonstrate that, contrary to what is claimed by decentralized learning proposers, decentralized learning does not offer any security advantage over federated learning. Rather, it increases the attack surface enabling any user in the system to perform privacy attacks such as gradient inversion, and even gain full control over honest users' local model. We also show that, given the state of the art in protections, privacy-preserving configurations of decentralized learning require fully connected networks, losing any practical advantage over the federated setup and therefore completely defeating the objective of the decentralized approach.Comment: IEEE S&P'23 (Previous title: "On the Privacy of Decentralized Machine Learning"

    Federated Unlearning via Active Forgetting

    Full text link
    The increasing concerns regarding the privacy of machine learning models have catalyzed the exploration of machine unlearning, i.e., a process that removes the influence of training data on machine learning models. This concern also arises in the realm of federated learning, prompting researchers to address the federated unlearning problem. However, federated unlearning remains challenging. Existing unlearning methods can be broadly categorized into two approaches, i.e., exact unlearning and approximate unlearning. Firstly, implementing exact unlearning, which typically relies on the partition-aggregation framework, in a distributed manner does not improve time efficiency theoretically. Secondly, existing federated (approximate) unlearning methods suffer from imprecise data influence estimation, significant computational burden, or both. To this end, we propose a novel federated unlearning framework based on incremental learning, which is independent of specific models and federated settings. Our framework differs from existing federated unlearning methods that rely on approximate retraining or data influence estimation. Instead, we leverage new memories to overwrite old ones, imitating the process of \textit{active forgetting} in neurology. Specifically, the model, intended to unlearn, serves as a student model that continuously learns from randomly initiated teacher models. To preserve catastrophic forgetting of non-target data, we utilize elastic weight consolidation to elastically constrain weight change. Extensive experiments on three benchmark datasets demonstrate the efficiency and effectiveness of our proposed method. The result of backdoor attacks demonstrates that our proposed method achieves satisfying completeness

    Robustness Analytics to Data Heterogeneity in Edge Computing

    Get PDF
    Federated Learning is a framework that jointly trains a model \textit{with} complete knowledge on a remotely placed centralized server, but \textit{without} the requirement of accessing the data stored in distributed machines. Some work assumes that the data generated from edge devices are identically and independently sampled from a common population distribution. However, such ideal sampling may not be realistic in many contexts. Also, models based on intrinsic agency, such as active sampling schemes, may lead to highly biased sampling. So an imminent question is how robust Federated Learning is to biased sampling? In this work\footnote{\url{https://github.com/jiaqian/robustness_of_FL}}, we experimentally investigate two such scenarios. First, we study a centralized classifier aggregated from a collection of local classifiers trained with data having categorical heterogeneity. Second, we study a classifier aggregated from a collection of local classifiers trained by data through active sampling at the edge. We present evidence in both scenarios that Federated Learning is robust to data heterogeneity when local training iterations and communication frequency are appropriately chosen
    • …
    corecore