9,923 research outputs found

    FedML-HE: An Efficient Homomorphic-Encryption-Based Privacy-Preserving Federated Learning System

    Federated Learning trains machine learning models on distributed devices by aggregating local model updates instead of local data. However, privacy concerns arise because the aggregated local models on the server may reveal sensitive personal information through inversion attacks. Privacy-preserving methods, such as homomorphic encryption (HE), then become necessary for FL training. Despite HE's privacy advantages, its applications suffer from impractical overheads, especially for foundation models. In this paper, we present FedML-HE, the first practical federated learning system with efficient HE-based secure model aggregation. FedML-HE selectively encrypts sensitive parameters, significantly reducing both computation and communication overheads during training while providing customizable privacy preservation. Our optimized system achieves considerable overhead reduction, particularly for large foundation models (e.g., ~10x for ResNet-50 and up to ~40x for BERT), demonstrating the potential for scalable HE-based FL deployment.
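
    The selective-encryption idea can be illustrated with a small sketch. The snippet below is illustrative only and not FedML-HE's actual implementation: it assumes the python-paillier (`phe`) package for additively homomorphic encryption, a hypothetical per-parameter sensitivity mask, and a toy two-client setting in which only the parameters flagged as sensitive are encrypted before aggregation.

```python
# Minimal sketch of selective HE-based aggregation (illustrative only).
# Assumes the python-paillier package: pip install phe
import numpy as np
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

def client_update(update, sensitive_mask):
    """Encrypt only the parameters flagged as sensitive; send the rest in plaintext."""
    return [public_key.encrypt(float(v)) if s else float(v)
            for v, s in zip(update, sensitive_mask)]

def server_aggregate(client_msgs):
    """Add ciphertexts/plaintexts coordinate-wise; the server never decrypts."""
    agg = client_msgs[0]
    for msg in client_msgs[1:]:
        agg = [a + b for a, b in zip(agg, msg)]
    return agg

# Toy example: 6-parameter model, first 3 coordinates marked sensitive (hypothetical mask).
mask = np.array([1, 1, 1, 0, 0, 0], dtype=bool)
u1 = np.array([0.10, -0.20, 0.05, 0.30, 0.00, -0.10])
u2 = np.array([0.05,  0.10, 0.00, -0.10, 0.20, 0.30])

agg = server_aggregate([client_update(u1, mask), client_update(u2, mask)])

# Clients (who hold the secret key in this toy setting) decrypt the aggregate.
result = [private_key.decrypt(v) if isinstance(v, paillier.EncryptedNumber) else v
          for v in agg]
print(np.allclose(result, u1 + u2))  # True
```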

    An Application of Secure Data Aggregation for Privacy-Preserving Machine Learning on Mobile Devices

    Machine learning algorithms over big data have been widely used to improve low-priced services over the years, but they come with privacy as a major public concern. The European Union recently made the General Data Protection Regulation (GDPR) enforceable, and the GDPR mainly focuses on giving citizens and residents more control over their personal data. On the other hand, with personal and collective data from users, companies can provide a better experience for customers, such as customized news feeds and real-time transportation systems. To resolve this dilemma, many privacy-preserving schemes have been proposed, such as homomorphic encryption and machine learning over encrypted data. However, many of them are not yet practical due to their high computational complexity. In 2017, Bonawitz et al. proposed a practical scheme for secure data aggregation for privacy-preserving machine learning, which comes with affordable computation and communication complexity and accounts for practical users' drop-out situations. However, the communication complexity of that scheme is not efficient enough, because a mobile user needs to communicate with all the members in the network to establish a secure mutual key with each of them. In this thesis, by combining the Harn-Gong key establishment protocol and the mobile data aggregation scheme, we propose an efficient privacy-preserving mobile data aggregation protocol that introduces a non-interactive key establishment protocol, reducing the communication complexity of pairwise key establishment for n users from O(n^2) to a constant value. We correct the security proof of the Harn-Gong key establishment protocol and provide a secure threshold for the degree of the polynomial according to the Byzantine problem. We implement the KDC-side Harn-Gong key establishment primitives and prepare a proof-of-concept Android mobile application to test our protocol's running time in masking private data. The results show that our private data masking time is 1.5 to 3 times faster than the original scheme.
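
    The pairwise-masking step that dominates the communication cost can be sketched as follows. This is an illustration, not the thesis' protocol: the Harn-Gong non-interactive key establishment is stood in for by a hypothetical `shared_key(i, j)` derived from a pre-distributed master secret, masks are expanded with SHA-256, and dropout handling is omitted.

```python
# Illustrative pairwise masking for secure aggregation (Bonawitz-style), with key
# establishment replaced by a hypothetical pre-shared master secret.
import hashlib
import numpy as np

MODULUS = 2**32
MASTER_SECRET = b"demo-master-secret"   # stand-in for the non-interactive key setup

def shared_key(i, j):
    """Both parties of the pair (i, j) derive the same key without interaction."""
    lo, hi = sorted((i, j))
    return hashlib.sha256(MASTER_SECRET + bytes([lo, hi])).digest()

def prg(seed, length):
    """Expand a seed into `length` pseudo-random integers mod MODULUS."""
    out, counter = [], 0
    while len(out) < length:
        block = hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        out.extend(int.from_bytes(block[k:k + 4], "big") for k in range(0, 32, 4))
        counter += 1
    return np.array(out[:length], dtype=np.uint64) % MODULUS

def mask_input(i, x, all_ids):
    """Add +mask toward higher-id peers and -mask toward lower-id peers; masks cancel in the sum."""
    masked = np.array(x, dtype=np.uint64) % MODULUS
    for j in all_ids:
        if j == i:
            continue
        m = prg(shared_key(i, j), len(x))
        masked = (masked + m) % MODULUS if i < j else (masked - m) % MODULUS
    return masked

ids = [0, 1, 2]
data = {0: [3, 1, 4], 1: [1, 5, 9], 2: [2, 6, 5]}
total = sum(mask_input(i, data[i], ids) for i in ids) % MODULUS
print(total)  # [ 6 12 18] -- pairwise masks cancel, only the sum is revealed
```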

    Efficient Dropout-resilient Aggregation for Privacy-preserving Machine Learning

    With the increasing adoption of data-hungry machine learning algorithms, personal data privacy has emerged as one of the key concerns that could hinder the success of digital transformation. As such, Privacy-Preserving Machine Learning (PPML) has received much attention from both academia and industry. However, organizations face a dilemma: on the one hand, they are encouraged to share data to enhance ML performance, but on the other hand, they could potentially be breaching the relevant data privacy regulations. Practical PPML typically allows multiple participants to individually train their ML models, which are then aggregated to construct a global model in a privacy-preserving manner, e.g., based on multi-party computation or homomorphic encryption. Nevertheless, in most important applications of large-scale PPML, e.g., aggregating clients' gradients to update a global model in federated learning, such as consumer behavior modeling for mobile application services, some participants are inevitably resource-constrained mobile devices, which may drop out of the PPML system due to their mobility. Therefore, the resilience of privacy-preserving aggregation has become an important problem to tackle. In this paper, we propose a scalable privacy-preserving aggregation scheme that can tolerate dropout by participants at any time and is secure against both semi-honest and active malicious adversaries by setting proper system parameters. By replacing communication-intensive building blocks with a seed-homomorphic pseudo-random generator, and relying on the additive homomorphic property of the Shamir secret sharing scheme, our scheme outperforms state-of-the-art schemes by up to 6.37× in runtime and provides stronger dropout resilience. The simplicity of our scheme makes it attractive both for implementation and for further improvements. (Comment: 16 pages, 5 figures. Accepted by IEEE Transactions on Information Forensics and Security.)
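
    The additive homomorphism of Shamir secret sharing that the scheme relies on can be demonstrated with a toy example. The sketch below uses assumed parameters (a small prime field, 2-of-3 sharing) and is not the paper's construction: parties add their shares locally, and the reconstructed value equals the sum of the secrets.

```python
# Toy Shamir secret sharing over a prime field, illustrating the additive
# homomorphism: share-wise sums reconstruct to the sum of the secrets.
import random

PRIME = 2**61 - 1  # field modulus (assumed; any prime larger than the secrets works)

def share(secret, threshold, n_parties):
    """Split `secret` into n shares; any `threshold` of them reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]
    def f(x):
        return sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n_parties + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total

# Two secrets (e.g., two clients' gradient coordinates), shared 2-of-3.
a, b = 1234, 5678
sa, sb = share(a, 2, 3), share(b, 2, 3)

# Each party adds its two shares locally; no party ever sees a or b.
summed = [(xa, (ya + yb) % PRIME) for (xa, ya), (_, yb) in zip(sa, sb)]
print(reconstruct(summed[:2]) == (a + b) % PRIME)  # True
```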

    Towards Privacy-Preserving and Verifiable Federated Matrix Factorization

    Recent years have witnessed the rapid growth of federated learning (FL), an emerging privacy-aware machine learning paradigm that allows collaborative learning over isolated datasets distributed across multiple participants. The salient feature of FL is that the participants can keep their private datasets local and only share model updates. Very recently, some research efforts have been initiated to explore the applicability of FL to matrix factorization (MF), a prevalent method used in modern recommendation systems and services. It has been shown that sharing the gradient updates in federated MF entails privacy risks of revealing users' personal ratings, posing a demand for protecting the shared gradients. Prior art is limited in that it incurs notable accuracy loss or relies on heavy cryptosystems, with a weak threat model assumed. In this paper, we propose VPFedMF, a new design aimed at privacy-preserving and verifiable federated MF. VPFedMF provides guarantees on the confidentiality of individual gradient updates in federated MF through lightweight and secure aggregation. Moreover, VPFedMF newly supports correctness verification of the aggregation results produced by the coordinating server in federated MF. Experiments on a real-world movie rating dataset demonstrate the practical performance of VPFedMF in terms of computation, communication, and accuracy.
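
    One round of federated MF with masked gradient aggregation can be sketched as follows. This is illustrative only and omits VPFedMF's verification component: each client keeps its ratings and user factor local, computes the item-matrix gradient of a squared-error loss, and adds zero-sum one-time pads (a stand-in for the actual secure aggregation) so the server learns only the aggregated gradient. All names and dimensions are assumptions for the example.

```python
# Illustrative round of federated matrix factorization with masked gradient aggregation.
import numpy as np

rng = np.random.default_rng(0)
n_items, dim = 4, 3
V = rng.normal(scale=0.1, size=(n_items, dim))        # shared item factors

class Client:
    def __init__(self, ratings):
        self.ratings = ratings                         # {item_id: rating}, kept local
        self.u = rng.normal(scale=0.1, size=dim)       # private user factor

    def item_gradient(self, V):
        """Gradient of the squared-error loss w.r.t. the item matrix, on local ratings only."""
        g = np.zeros_like(V)
        for i, r in self.ratings.items():
            g[i] += (self.u @ V[i] - r) * self.u
        return g

clients = [Client({0: 5.0, 1: 3.0}), Client({1: 4.0, 3: 1.0}), Client({2: 2.0})]

# Zero-sum one-time pads (stand-in for the secure aggregation protocol):
# each pair of clients agrees on a pad that one adds and the other subtracts.
masks = [np.zeros_like(V) for _ in clients]
for a in range(len(clients)):
    for b in range(a + 1, len(clients)):
        pad = rng.normal(size=V.shape)
        masks[a] += pad
        masks[b] -= pad

# The server only ever sees masked gradients; the pads cancel in the sum.
masked = [c.item_gradient(V) + m for c, m in zip(clients, masks)]
agg = np.sum(masked, axis=0)
print(np.allclose(agg, sum(c.item_gradient(V) for c in clients)))  # True
V -= 0.1 * agg                                         # global item-factor update
```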