6 research outputs found

    FLTrojan: Privacy Leakage Attacks against Federated Language Models Through Selective Weight Tampering

    Federated learning (FL) is becoming a key component in many technology-based applications, including language modeling, where individual FL participants often have privacy-sensitive text data in their local datasets. However, realizing the extent of privacy leakage in federated language models is not straightforward, and existing attacks simply aim to extract data regardless of how sensitive it is. To fill this gap, in this paper we present two novel findings on leaking privacy-sensitive user data from federated language models. First, we make the key observation that model snapshots from intermediate rounds of FL can cause greater privacy leakage than the final trained model. Second, we show that privacy leakage can be aggravated by tampering with the selected model weights that are specifically responsible for memorizing the sensitive training data. We show how a malicious client can leak the privacy-sensitive data of another user in FL even without any cooperation from the server. Our best-performing method improves membership inference recall by 29% and achieves up to 70% private data reconstruction, clearly outperforming existing attacks that make stronger assumptions about the adversary's capabilities.
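
    As a rough illustration of the two ingredients this abstract mentions, the sketch below uses gradient magnitude as a proxy for "which weights memorise a candidate sequence" and next-token loss as a membership-inference score on an intermediate model snapshot. It is not the paper's actual selection or tampering procedure; the model interface (a callable returning per-token logits of shape (1, seq_len, vocab)) is an assumption.

        import torch
        import torch.nn.functional as F

        def top_salient_weights(model, input_ids, k=1000):
            # Proxy heuristic: weights with the largest gradient magnitude on the
            # candidate sequence are treated as the ones "memorising" it.
            model.zero_grad()
            logits = model(input_ids)                    # assumed (1, seq_len, vocab)
            loss = F.cross_entropy(logits[0, :-1], input_ids[0, 1:])
            loss.backward()
            ranked = []
            for name, p in model.named_parameters():
                if p.grad is not None:
                    flat = p.grad.detach().abs().flatten()
                    vals, idx = flat.topk(min(k, flat.numel()))
                    ranked += [(v.item(), name, int(i)) for v, i in zip(vals, idx)]
            ranked.sort(reverse=True)
            return ranked[:k]                            # (score, parameter name, flat index)

        @torch.no_grad()
        def membership_score(model, input_ids):
            # Lower next-token loss on a snapshot suggests the sequence was in the training data.
            logits = model(input_ids)
            return -F.cross_entropy(logits[0, :-1], input_ids[0, 1:]).item()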

    Three Input Exclusive-OR Gate Support For Boyar-Peralta's Algorithm (Extended Version)

    The linear layer, which is essentially a non-singular binary matrix, is an integral part of the construction of many private-key ciphers. As a result, optimising the linear layer for device implementation has been an important research direction for about two decades. The Boyar-Peralta algorithm (SEA'10) is one such well-known algorithm, which offers a significant improvement over the straightforward implementation. This algorithm only returns implementations built from XOR2 gates and is deterministic. Over the last couple of years, several improvements to this algorithm have been proposed, adding support for XOR3 gates and making the search randomised. In this work, we take an existing improvement (Tan and Peyrin, TCHES'20) that allows randomised execution and extend it to support three-input XOR gates. This complements the other work done in this direction (Banik et al., IWSEC'19), which also supports XOR3 gates with randomised execution. Further, following another work (Maximov, ePrint'19), we include one additional tie-breaker condition in the original Boyar-Peralta algorithm. Our work thus collates and extends the state of the art while offering a simpler interface. We present several results that improve upon the previously best-known results.
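
    To make the gate-count question concrete, here is a toy sketch (not the Boyar-Peralta algorithm, and much simpler than the randomised XOR3-aware search described above): each output row of the binary matrix is the XOR of a set of signals, and the heuristic repeatedly turns the most widely shared pair or triple of signals into a new XOR2 or XOR3 gate. The tie-breaking rule and signal naming are illustrative assumptions.

        from itertools import combinations
        from collections import Counter

        def greedy_xor_schedule(rows, allow_xor3=True):
            # rows: list of sets of input indices; each output is the XOR of its set.
            # Returns a list of gates as (operand signals, new signal id).
            rows = [set(r) for r in rows]
            gates = []
            next_sig = max((i for r in rows for i in r), default=-1) + 1
            while any(len(r) > 1 for r in rows):
                best, best_key = None, (0, 0)
                for size in ((2, 3) if allow_xor3 else (2,)):
                    counts = Counter()
                    for r in rows:
                        for combo in combinations(sorted(r), size):
                            counts[combo] += 1
                    for combo, c in counts.items():
                        # Prefer the sub-expression shared by most rows; break ties towards XOR3.
                        if (c, len(combo)) > best_key:
                            best, best_key = combo, (c, len(combo))
                gates.append((best, next_sig))
                for r in rows:
                    if set(best) <= r:
                        r.difference_update(best)
                        r.add(next_sig)
                next_sig += 1
            return gates

        # Example: three outputs over inputs 0..3; len(gates) is the XOR2/XOR3 gate count.
        gates = greedy_xor_schedule([{0, 1, 2}, {1, 2, 3}, {0, 2, 3}])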

    New Results on Machine Learning Based Distinguishers

    Machine Learning (ML) is almost ubiquitously used in multiple disciplines nowadays. Recently, we have seen its usage in the realm of differential distinguishers for symmetric-key ciphers. In this work, we explore ML-based distinguishers against a number of ciphers. We show new distinguishers on the unkeyed and round-reduced versions of SPECK-32, SPECK-128, ASCON, SIMECK-32, SIMECK-64 and SKINNY-128. We explore multiple avenues in the process. In summary, we use neural networks as well as support vector machines in various settings (such as varying the activation function), apart from experimenting with a number of input difference tuples. Among other results, we show a distinguisher for 8-round SPECK-32 that works with practical data complexity (most of the experiments take a few hours on a personal computer).
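
    A minimal sketch of the standard experiment behind such distinguishers follows, under stated assumptions: encrypt_rounds is a placeholder for a user-supplied round-reduced cipher (e.g. 8-round SPECK-32), the key size and input difference are illustrative, and the classifier is a small scikit-learn MLP rather than the specific networks used in the paper. Test accuracy noticeably above 0.5 indicates a working distinguisher.

        import random
        import numpy as np
        from sklearn.neural_network import MLPClassifier

        def to_bits(x, n_bits):
            return [(x >> i) & 1 for i in range(n_bits)]

        def make_dataset(encrypt_rounds, in_diff, n=10_000, block_bits=32):
            # Label 1: ciphertext pair from plaintexts with the chosen input difference.
            # Label 0: ciphertext pair from unrelated random plaintexts.
            X, y = [], []
            for _ in range(n):
                key = random.getrandbits(64)             # placeholder key size
                p0 = random.getrandbits(block_bits)
                label = random.getrandbits(1)
                p1 = p0 ^ in_diff if label else random.getrandbits(block_bits)
                c0, c1 = encrypt_rounds(p0, key), encrypt_rounds(p1, key)
                X.append(to_bits(c0, block_bits) + to_bits(c1, block_bits))
                y.append(label)
            return np.array(X), np.array(y)

        # X, y = make_dataset(speck32_8rounds, in_diff=0x00400000)   # hypothetical cipher function
        # clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=100).fit(X[:8000], y[:8000])
        # print("distinguisher accuracy:", clf.score(X[8000:], y[8000:]))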

    PROV-FL: privacy-preserving round-optimal verifiable federated learning

    Federated learning is a distributed framework where a server computes a global model by aggregating the local models trained on users’ private data. However, for a stronger data-privacy guarantee, the server should not have access to any local model, only to the aggregated one. One way to achieve this is a secure aggregation protocol, which, in the absence of a fully trusted third party (TTP), comes at the cost of several rounds of interaction between the server and the users. In this paper, we present PROV-FL, an efficient privacy-preserving federated learning training system that securely aggregates users’ local models. PROV-FL requires only one round of communication between the server and the users for aggregating local models, without a TTP. Based on homomorphic encryption and differential privacy techniques, we develop two PROV-FL training protocols for two different scenarios, namely single-aggregator and multi-aggregator. PROV-FL provides verifiability, in that the server can verify the authenticity of the aggregated model, and efficiently handles users dynamically joining and leaving. We evaluate and compare the performance of PROV-FL by running experiments on training CNN/DNN models with a diverse set of real-world datasets.
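
    The sketch below illustrates the general idea of aggregating under additively homomorphic encryption with differential-privacy noise in a single round, using the phe (Paillier) Python package. It is a generic illustration, not PROV-FL's protocol: key management, the verifiability mechanism, dynamic joining/leaving and the multi-aggregator variant are omitted, and the noise scale is an arbitrary placeholder.

        import numpy as np
        from phe import paillier                     # pip install phe

        pub, priv = paillier.generate_paillier_keypair(n_length=2048)
        rng = np.random.default_rng(0)

        def client_update(local_weights, sigma=0.01):
            # Client side: add Gaussian noise (differential privacy), then encrypt
            # each coordinate so the aggregator never sees the plaintext model.
            noisy = local_weights + rng.normal(0.0, sigma, size=local_weights.shape)
            return [pub.encrypt(float(w)) for w in noisy]

        def aggregate(encrypted_updates):
            # Aggregator side, one round: sum ciphertexts coordinate-wise without decrypting.
            total = list(encrypted_updates[0])
            for upd in encrypted_updates[1:]:
                total = [a + b for a, b in zip(total, upd)]
            return total

        updates = [client_update(np.random.rand(4)) for _ in range(3)]     # three toy 4-parameter models
        avg = np.array([priv.decrypt(c) for c in aggregate(updates)]) / 3  # decrypted only by the key holder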

    New Results on Machine Learning-Based Distinguishers

    Machine Learning (ML) is almost ubiquitously used in multiple disciplines nowadays. Recently, we have seen its usage in the realm of differential distinguishers for symmetric-key ciphers. It has been shown that ML-based differential distinguishers can be easily extended to break round-reduced versions of ciphers. In this paper, we show new distinguishers on the unkeyed and round-reduced versions of SPECK-32, SPECK-128, ASCON, SIMECK-32, SIMECK-64, and SKINNY-128. We explore multiple avenues in the process. In summary, we use neural networks and support vector machines in various settings (such as varying the activation function), apart from experimenting with a number of input difference tuples. Among other results, we show a distinguisher for 8-round SPECK-32 that works with low data complexity.
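
    For completeness, a brief support-vector-machine variant of the experiment sketched earlier in this list: the feature here is the output difference c0 XOR c1 instead of the concatenated ciphertexts. encrypt_rounds is again a placeholder for a round-reduced cipher supplied by the reader.

        import random
        import numpy as np
        from sklearn.svm import SVC

        def diff_dataset(encrypt_rounds, in_diff, n=5_000, block_bits=32):
            X, y = [], []
            for _ in range(n):
                key = random.getrandbits(64)
                p0 = random.getrandbits(block_bits)
                label = random.getrandbits(1)
                p1 = p0 ^ in_diff if label else random.getrandbits(block_bits)
                # Feature vector: bits of the ciphertext difference.
                d = encrypt_rounds(p0, key) ^ encrypt_rounds(p1, key)
                X.append([(d >> i) & 1 for i in range(block_bits)])
                y.append(label)
            return np.array(X), np.array(y)

        # X, y = diff_dataset(speck32_8rounds, in_diff=0x00400000)
        # svm = SVC(kernel="rbf").fit(X[:4000], y[:4000])
        # print("accuracy above 0.5 indicates a distinguisher:", svm.score(X[4000:], y[4000:]))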

    Side Channel Attack On Stream Ciphers: A Three-Step Approach To State/Key Recovery

    A Side Channel Attack (SCA) exploits physical information leakage (such as electromagnetic emanation) from a device performing a cryptographic operation and poses a serious threat in the present IoT era. In the last couple of decades, a large body of research has been dedicated to streamlining and improving such attacks or to proposing novel countermeasures to thwart them. However, closer inspection reveals that the vast majority of published work in the context of symmetric-key cryptography is dedicated to block ciphers (or similar designs), leaving the problem wide open for stream ciphers. There are a few works here and there, but a generic and systematic framework appears to be missing from the literature. Motivated by this observation, we explore the problem of SCA on stream ciphers in extensive detail. Loosely speaking, our work picks up from the recent TCHES’21 paper by Sim, Bhasin and Jap. We present a framework that extends the efficiency of their analysis, bringing it into more practical terms. In a nutshell, we develop an automated framework that works as a generic tool to perform SCA on any stream cipher or a similar structure. It combines multiple automated tools (such as machine learning, mixed-integer linear programming, and satisfiability modulo theories) under one umbrella and acts as an end-to-end solution, taking side channel traces as input and returning the secret key. Our framework efficiently handles noisy data and works even after the cipher reaches its pseudo-random state. We demonstrate its efficacy by taking electromagnetic traces from a 32-bit software platform and performing SCA on a high-profile stream cipher, TRIVIUM, which is also an ISO standard. We show pragmatic key recovery on TRIVIUM during its initialization and also after the cipher reaches its pseudo-random state (i.e., while producing keystream).
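
    A skeleton of this kind of three-step pipeline is sketched below, under explicit assumptions: the per-bit classifier is a generic scikit-learn model rather than the paper's, add_cipher_constraints is a placeholder where the target stream cipher's state-update equations (e.g. TRIVIUM's) would be encoded for the SMT solver, and the confidence threshold is arbitrary.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from z3 import Solver, Bool, Not, sat, is_true

        # Step 1 (profiling): train one classifier per targeted state bit.
        def train_bit_models(traces, bit_labels):
            # traces: (n_traces, n_samples) side channel measurements
            # bit_labels: (n_traces, n_bits) known state bits of the profiling device
            return [RandomForestClassifier(n_estimators=50).fit(traces, bit_labels[:, i])
                    for i in range(bit_labels.shape[1])]

        # Step 2 (attack): predict each state bit of the victim trace with a confidence.
        def predict_bits(models, trace):
            return [m.predict_proba(trace.reshape(1, -1))[0, 1] for m in models]

        # Step 3 (solve): feed the confident bits plus the cipher equations to an SMT solver.
        def recover_state(probs, n_state_bits, add_cipher_constraints, threshold=0.9):
            solver = Solver()
            state = [Bool(f"s_{i}") for i in range(n_state_bits)]
            add_cipher_constraints(solver, state)    # placeholder: cipher-specific equations
            for i, p in enumerate(probs):
                if p > threshold:
                    solver.add(state[i])
                elif p < 1.0 - threshold:
                    solver.add(Not(state[i]))
            if solver.check() != sat:
                return None
            model = solver.model()
            return [is_true(model.evaluate(b, model_completion=True)) for b in state]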