
    A survey on vulnerability of federated learning: A learning algorithm perspective

    Federated Learning (FL) has emerged as a powerful paradigm for training Machine Learning (ML), particularly Deep Learning (DL), models on multiple devices or servers while keeping data localized at owners’ sites. Without centralizing data, FL holds promise for scenarios where data integrity, privacy, and security are critical. However, this decentralized training process also opens up new avenues for adversaries to launch unique attacks, making it urgent to understand the vulnerabilities and corresponding defense mechanisms from a learning algorithm perspective. This review paper takes a comprehensive look at malicious attacks against FL, categorizing them from new perspectives on attack origins and targets, and providing insights into their methodology and impact. In this survey, we focus on threat models targeting the learning process of FL systems. Based on the source and target of the attack, we categorize existing threat models into four types: Data to Model (D2M), Model to Data (M2D), Model to Model (M2M), and composite attacks. For each attack type, we discuss the defense strategies proposed, highlighting their effectiveness, assumptions, and potential areas for improvement. Defense strategies have evolved from using a single metric to exclude malicious clients towards multifaceted approaches that examine client models at various phases. Our review indicates that the training data, the exchanged gradients, and the learned model at different stages can all be manipulated to launch malicious attacks, ranging from undermining model performance and reconstructing private local data to inserting backdoors. These threats are also becoming more insidious: while earlier studies typically amplified malicious gradients, recent work subtly alters the least significant weights in local models to bypass defense measures, as sketched below. This literature review provides a holistic understanding of the current FL threat landscape and highlights the importance of developing robust, efficient, and privacy-preserving defenses to ensure the safe and trusted adoption of FL in real-world applications. The categorized bibliography can be found at: https://github.com/Rand2AI/Awesome-Vulnerability-of-Federated-Learning
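    To make the last point concrete, the following Python fragment is a minimal, hypothetical sketch of such a stealthy model-poisoning step: only the smallest-magnitude weights of a local update are perturbed, so a naive norm- or magnitude-based anomaly check may not flag the client. The function name and parameters are illustrative and do not come from any of the surveyed papers.

    import numpy as np

    def stealthy_poison(local_weights, fraction=0.01, scale=3.0, seed=0):
        """Perturb only the least significant (smallest-magnitude) weights so the
        overall update norm stays close to the benign one (illustrative only)."""
        rng = np.random.default_rng(seed)
        poisoned = np.array(local_weights, dtype=float, copy=True)
        k = max(1, int(fraction * poisoned.size))
        idx = np.argsort(np.abs(poisoned))[:k]          # k smallest-magnitude weights
        poisoned[idx] += scale * np.abs(poisoned).mean() * rng.standard_normal(k)
        return poisoned

    # A defense that only compares update norms would see two similar values here.
    benign = 0.05 * np.random.default_rng(1).standard_normal(10_000)
    print(np.linalg.norm(benign), np.linalg.norm(stealthy_poison(benign)))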

    Privacy Preserving Text Recognition with Gradient-Boosting for Federated Learning

    Typical machine learning approaches require centralized data for model training, which may not be possible where restrictions on data sharing are in place due to, for instance, privacy protection. The recently proposed Federated Learning (FL) framework allows a shared model to be learned collaboratively without centralizing data or sharing data among data owners. However, we show in this paper that the generalisation ability of the joint model is poor on Non-Independent and Non-Identically Distributed (Non-IID) data, particularly when the Federated Averaging (FedAvg) strategy is used in this collaborative learning framework, owing to the weight divergence phenomenon. We propose a novel boosting algorithm for FL to address this generalisation issue, as well as to achieve much faster convergence in gradient-based optimization. We demonstrate our Federated Boosting (FedBoost) method on privacy-preserved text recognition, showing significant improvements in both performance and efficiency. The text images are based on publicly available datasets for fair comparison, and we intend to make our implementation public to ensure reproducibility. Comment: the paper was submitted to BMVC2020 on April 30th.
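    For context, the sketch below contrasts plain FedAvg aggregation with a hypothetical performance-weighted variant in the boosting spirit. It is not the paper's FedBoost algorithm, whose details are not given in the abstract; all names are illustrative, and each client model is represented as a flat NumPy vector.

    import numpy as np

    def fedavg(client_weights, client_sizes):
        """Standard FedAvg: each client's model is weighted by its local sample count."""
        p = np.asarray(client_sizes, dtype=float)
        p /= p.sum()
        return sum(pi * w for pi, w in zip(p, client_weights))

    def boosted_aggregate(client_weights, client_scores):
        """Illustrative boosting-flavoured aggregation: clients whose local models
        validate better receive larger proportions (softmax over scores)."""
        s = np.asarray(client_scores, dtype=float)
        p = np.exp(s - s.max())
        p /= p.sum()
        return sum(pi * w for pi, w in zip(p, client_weights))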

    An Element-Wise Weights Aggregation Method for Federated Learning

    Federated learning (FL) is a powerful Machine Learning (ML) paradigm that enables distributed clients to collaboratively learn a shared global model while keeping the data on the original device, thereby preserving privacy. A central challenge in FL is the effective aggregation of local model weights from disparate and potentially unbalanced participating clients. Existing methods often treat each client indiscriminately, applying a single proportion to the entire local model. However, it is empirically advantageous for each weight to be assigned a specific proportion. This paper introduces an innovative Element-Wise Weights Aggregation Method for Federated Learning (EWWA-FL) aimed at optimizing learning performance and accelerating convergence speed. Unlike traditional FL approaches, EWWA-FL aggregates local weights into the global model at the level of individual elements, thereby allowing each participating client to make element-wise contributions to the learning process. By taking into account the unique dataset characteristics of each client, EWWA-FL enhances the robustness of the global model to different datasets while also achieving rapid convergence. The method is flexible enough to employ various weighting strategies. Through comprehensive experiments, we demonstrate the advanced capabilities of EWWA-FL, showing significant improvements in both accuracy and convergence speed across a range of backbones and benchmarks.
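    The core idea can be sketched in a few lines of Python: conventional aggregation applies one scalar proportion per client, whereas element-wise aggregation assigns every individual weight its own per-client proportion. The rule used below, with proportions derived from per-element update magnitudes, is an assumption for illustration rather than the exact EWWA-FL weighting strategy.

    import numpy as np

    def scalar_aggregate(client_weights, client_props):
        """Conventional FL aggregation: one proportion per client for the whole model."""
        return sum(p * w for p, w in zip(client_props, client_weights))

    def elementwise_aggregate(global_w, client_weights):
        """Element-wise aggregation sketch: each weight element gets its own per-client
        proportion, here based on the magnitude of that element's local update."""
        updates = np.stack([w - global_w for w in client_weights])   # (clients, params)
        alphas = np.abs(updates) + 1e-12
        alphas /= alphas.sum(axis=0, keepdims=True)                  # per-element proportions
        return global_w + (alphas * updates).sum(axis=0)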

    A Real-Time and Long-Term Face Tracking Method Using Convolutional Neural Network and Optical Flow in IoT-Based Multimedia Communication Systems

    The development of the Internet of Things (IoT) stimulates many research works related to Multimedia Communication Systems (MCS), such as human face detection and tracking. This trend has driven numerous progressive methods, among which deep learning-based methods can spot a face patch in an image effectively and accurately. Face tracking is often conflated with face detection, but they are two different techniques. Face detection operates on a single image, and its shortcomings are obvious: face positions become unstable and unsmooth when it is applied to a sequence of continuous frames, computation is expensive due to the heavy reliance on Convolutional Neural Networks (CNNs), and detection performance is limited on edge devices. To overcome these defects, this paper proposes a novel face tracking strategy combining a CNN and optical flow, namely C-OF, which achieves an extremely fast, stable, and long-term face tracking system. Two key requirements for commercial applications are the stability and smoothness of face positions across a sequence of image frames, which provide a better basis for face biological signal extraction, silent face anti-spoofing, and facial expression analysis in IoT-based MCS. Our method captures face motion between every two consecutive frames via optical flow to eliminate the instability and unsmoothness problems. Moreover, an innovative metric for measuring the stability and smoothness of face motion is designed and adopted in our experiments. The experimental results illustrate that our proposed C-OF outperforms both face detection and object tracking methods.
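    A rough detect-then-track loop in the same spirit is sketched below with OpenCV. It is not the paper's C-OF pipeline (for instance, a Haar cascade stands in for the CNN detector, and all names are illustrative): a detector fires periodically, and in between the face box is propagated by Lucas-Kanade optical flow on feature points inside the box.

    import cv2
    import numpy as np

    def track_face(video_path, detect_every=30):
        """Detect a face every `detect_every` frames, otherwise shift the last box
        by the median optical flow of feature points inside it (illustrative)."""
        detector = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        cap = cv2.VideoCapture(video_path)
        box, pts, prev_gray, frame_id = None, None, None, 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            if box is None or frame_id % detect_every == 0:
                faces = detector.detectMultiScale(gray, 1.2, 5)
                if len(faces) > 0:
                    box = faces[0].astype(float)                     # x, y, w, h
                    x, y, w, h = faces[0]
                    corners = cv2.goodFeaturesToTrack(gray[y:y + h, x:x + w], 50, 0.01, 5)
                    pts = None if corners is None else corners.reshape(-1, 2) + (x, y)
            elif pts is not None and len(pts) > 0:
                nxt, status, _ = cv2.calcOpticalFlowPyrLK(
                    prev_gray, gray, pts.astype(np.float32).reshape(-1, 1, 2), None)
                good = status.ravel() == 1
                if good.any():
                    flow = nxt.reshape(-1, 2)[good] - pts[good]
                    box[:2] += np.median(flow, axis=0)               # shift box smoothly
                    pts = nxt.reshape(-1, 2)[good]
            prev_gray, frame_id = gray, frame_id + 1
            yield frame_id, None if box is None else box.copy()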

    Gradient Leakage and Protection for Federated Learning

    In recent years, data privacy has become a critical issue in the field of Machine Learning (ML), given the significant amount of sensitive data involved in training and inference processes. Several approaches have been developed to address this challenge, including cryptography and collaborative training. Cryptographic techniques, such as Homomorphic Encryption (HE) and Differential Privacy (DP), have gained popularity due to their ability to protect sensitive data during computation. HE allows computations to be performed directly on encrypted data without the need to decrypt it, thus ensuring privacy while still providing accurate results. On the other hand, DP adds random noise to data to protect individuals’ privacy while preserving statistical accuracy. Collaborative training methods, such as Secure Multi-Party Computation (MPC), Distributed Learning, and Federated Learning (FL), aim to address privacy concerns by enabling secure local computation. In MPC, parties collaborate to compute a function without revealing their inputs to each other, making it suitable for privacy-preserving ML tasks. Distributed Learning allows data to be distributed across multiple devices or nodes, reducing the risk of data breaches while still achieving accurate results. FL enables the training of ML models on decentralised data without transferring raw data to a central location. While these techniques have proven effective in protecting sensitive data, they also have some limitations. For instance, HE and DP may be computationally expensive, which can hinder their widespread adoption. Additionally, collaborative training methods may require significant communication overhead and synchronisation, which can affect training efficiency.
    Collaborative training through gradient exchange has been widely used in Deep Learning (DL) as a secure way to train a robust model. However, recent research has shown that this method may not be entirely secure: sensitive information can be recovered from the shared gradient, compromising privacy and potentially allowing malicious actors to steal valuable data. Various studies have demonstrated that the publicly shared gradient can reveal sensitive information about the training data, such as the presence of specific individuals or properties. This can lead to significant privacy breaches, especially in sensitive areas such as healthcare or finance. As the demand for privacy-preserving ML grows, there is a need for further research and development of effective and robust techniques to ensure data privacy during collaborative training.
    This thesis aims to investigate how to reconstruct private input data from the publicly shared gradient and how to prevent gradient leakage in terms of a gradient-sharing protocol and a private key-lock module. We first show that, in an FL system, image-based private data can easily be retrieved from the shared gradient through our proposed Generative Regression Neural Network (GRNN). Our attack involves formulating the problem as a regression task and optimising two branches of the generative model by minimising the gradient distance. The findings of our study demonstrate that even seemingly innocuous shared information can lead to the recovery of sensitive data. This highlights the importance of developing robust privacy-preserving techniques to protect sensitive information during collaborative ML. Our proposed GRNN attack serves as a wake-up call to the ML community to address the privacy concerns associated with FL.
    Our following study found that the generalisation ability of joint models in FL is poor on Non-Independent and Non-Identically Distributed (Non-IID) data, particularly when the Federated Averaging (FedAvg) strategy is used, leading to weight divergence. To address this issue, we propose a novel boosting algorithm for FL that tackles the generalisation and gradient leakage problems and achieves faster convergence in gradient-based optimisation. The algorithm improves the performance of FL models by aggregating models trained on subsets of data, addressing the weight divergence issue. It leverages an adaptive weighting strategy, where the weight of each model is adjusted based on its performance, with better-performing models receiving more weight. Additionally, we introduce a privacy-preserving component, where local models are encrypted to reduce the risk of gradient leakage. Our proposed boosting algorithm shows promising results in addressing FL’s generalisation and gradient leakage issues, leading to faster convergence in gradient-based optimisation. These findings highlight the importance of developing robust techniques to improve the performance of FL models and to ensure data privacy during collaborative ML.
    Finally, our research proposes a new approach to defending against gradient leakage attacks in FL through a private key-lock module (FedKL). This method secures arbitrary model architectures with a private key-lock module, where only the locked gradient is transferred to the parameter server for aggregating the global model. The proposed FedKL method is designed to be robust against gradient leakage attacks, ensuring that sensitive information cannot be reconstructed from the shared gradient. The key-lock module is trained in such a way that, without the private information of the module, it is infeasible to reconstruct training data from the shared gradient; furthermore, the inference performance of the global model is significantly undermined without the key-lock module, making it an integral part of the model architecture. Our theoretical analysis explains why the gradient can leak private information and how the proposed FedKL method defends against the attack. This provides a new perspective on defending against gradient leakage attacks in FL, enhancing the security and privacy of sensitive data.
    We will continue to work on privacy-preserving FL. In our previous work, we have identified a number of follow-up research points, including gradient leakage for Natural Language Processing (NLP), an adaptive gradient aggregation method, and partial gradient leakage. Since we have theoretically proven that private information is carried by the gradients, finding state-of-the-art methods of stealing data and defending against leakage is a long-term study in safeguarding privacy.
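    To illustrate the key-lock intuition, the PyTorch snippet below keeps one "lock" layer private to the client and shares only the remaining gradients, so the parameter server never sees the private layer's parameters or gradients. This is a sketch only, assuming a simple fully connected backbone; it is not the actual FedKL architecture, and all names are illustrative.

    import torch
    import torch.nn as nn

    class KeyLockedNet(nn.Module):
        """Shared backbone and head plus a private 'lock' layer that never leaves
        the client (illustrative sketch of the key-lock idea)."""
        def __init__(self, in_dim=784, hidden=256, n_classes=10):
            super().__init__()
            self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
            self.lock = nn.Linear(hidden, hidden)        # private: kept on-device
            self.head = nn.Linear(hidden, n_classes)

        def forward(self, x):
            return self.head(self.lock(self.backbone(x)))

    model = KeyLockedNet()
    x, y = torch.randn(8, 784), torch.randint(0, 10, (8,))
    nn.functional.cross_entropy(model(x), y).backward()
    # Only gradients of shared parameters are uploaded for aggregation;
    # they were computed through the private lock, which stays local.
    shared_grads = {n: p.grad for n, p in model.named_parameters()
                    if not n.startswith("lock")}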

    GRNN: generative regression neural network—a data leakage attack for federated learning

    Data privacy has become an increasingly important issue in Machine Learning (ML), and many approaches have been developed to tackle this challenge, e.g. cryptography (Homomorphic Encryption (HE), Differential Privacy (DP), etc.) and collaborative training (Secure Multi-Party Computation (MPC), Distributed Learning, and Federated Learning (FL)). These techniques focus on data encryption or secure local computation and transfer only intermediate information to a third party to compute the final result. Gradient exchange is commonly considered a secure way of collaboratively training a robust model in Deep Learning (DL). However, recent research has demonstrated that sensitive information can be recovered from the shared gradient. Generative Adversarial Networks (GANs), in particular, have been shown to be effective at recovering such information. However, GAN-based techniques require additional information, such as class labels, which is generally unavailable in privacy-preserving learning. In this paper, we show that, in an FL system, image-based private data can be fully recovered from the shared gradient alone via our proposed Generative Regression Neural Network (GRNN). We formulate the attack as a regression problem and optimize two branches of the generative model by minimizing the distance between gradients. We evaluate our method on several image classification tasks. The results illustrate that our proposed GRNN outperforms state-of-the-art methods with better stability, stronger robustness, and higher accuracy, and it imposes no convergence requirement on the global FL model. Moreover, we demonstrate information leakage using face re-identification. Some defense strategies are also discussed in this work.
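    A stripped-down version of the gradient-matching idea is sketched below in PyTorch. It follows the abstract's formulation (two generative branches, one for the fake image and one for the fake label, optimised to minimise the distance between the induced and the observed gradients), but the network sizes, optimiser, and training loop are illustrative simplifications rather than the paper's actual GRNN implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    global_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # shared FL model

    # Gradient the victim would share with the server
    x_true, y_true = torch.rand(1, 1, 28, 28), torch.tensor([3])
    true_grad = torch.autograd.grad(
        F.cross_entropy(global_model(x_true), y_true), global_model.parameters())

    # Two generative branches: latent code -> fake image, latent code -> fake label
    z = torch.randn(1, 64)
    gen_img = nn.Sequential(nn.Linear(64, 28 * 28), nn.Sigmoid())
    gen_lbl = nn.Linear(64, 10)
    opt = torch.optim.Adam(list(gen_img.parameters()) + list(gen_lbl.parameters()), lr=1e-2)

    for step in range(300):
        opt.zero_grad()
        x_fake = gen_img(z).view(1, 1, 28, 28)
        y_fake = F.softmax(gen_lbl(z), dim=-1)
        loss_fake = -(y_fake * F.log_softmax(global_model(x_fake), dim=-1)).sum()
        fake_grad = torch.autograd.grad(loss_fake, global_model.parameters(), create_graph=True)
        # minimise the distance between the induced and the observed gradients
        sum(F.mse_loss(fg, tg) for fg, tg in zip(fake_grad, true_grad)).backward()
        opt.step()
    # gen_img(z) now approximates the private input x_true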

    Amino Derivatives of PEEK-WC

    The synthetic procedure and the characterization of new amino derivatives of poly(oxa-p-phenylene-3,3-phthalido-p-phenylene-oxa-p-phenylene-oxyphenylene) (PEEK-WC) with various average degrees of substitution are reported. The amino PEEK-WC was extensively characterised using Fourier transform infrared (FTIR) spectroscopy, thermogravimetric analysis, differential scanning calorimetry, scanning electron microscopy, elemental analysis, NMR, and viscosity measurements. The amino PEEK-WC shows different solubility in some solvents compared with the parent polymer and good thermal stability, and it is able to form membranes by means of the phase inversion technique. Amino PEEK-WC turns out to be quite reactive and can lead to further modification.