FLTrojan: Privacy Leakage Attacks against Federated Language Models Through Selective Weight Tampering
Federated learning (FL) is becoming a key component of many technology-based applications, including language modeling, where individual FL participants often hold privacy-sensitive text data in their local datasets. However, gauging the extent of privacy leakage in federated language models is not straightforward, and existing attacks aim only to extract data, regardless of whether it is sensitive or innocuous. To fill this gap, in this paper we present two novel findings on leaking privacy-sensitive user data from federated language models. First, we make the key observation that model snapshots from intermediate FL rounds can cause greater privacy leakage than the final trained model. Second, we show that privacy leakage can be aggravated by tampering with the specific model weights that are responsible for memorizing the sensitive training data. We demonstrate how a malicious client can leak the privacy-sensitive data of another user in FL, even without any cooperation from the server. Our best-performing method improves membership inference recall by 29% and achieves up to 70% private data reconstruction, clearly outperforming existing attacks that make stronger assumptions about adversary capabilities.

Comment: 22 pages (including bibliography and Appendix), Submitted to USENIX Security '2
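Since the abstract describes selective weight tampering only at a high level, the following is a minimal illustrative sketch of the idea, not the paper's actual procedure. It uses a generic gradient-magnitude heuristic to pick the weights most implicated in memorizing a sensitive sample and nudges only those; the toy model, the top-k count, and the step size are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a federated language model: a 4-token context,
# next-token prediction head. (Assumed architecture, for illustration.)
vocab_size, embed_dim, context_len = 100, 16, 4
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Flatten(start_dim=1),
    nn.Linear(embed_dim * context_len, vocab_size),
)
loss_fn = nn.CrossEntropyLoss()

# Hypothetical privacy-sensitive training sequence (random token ids here).
context = torch.randint(0, vocab_size, (1, context_len))
target = torch.randint(0, vocab_size, (1,))

# Step 1: score every weight by the magnitude of its gradient on the
# sensitive sample -- a common proxy for "responsible for memorization".
model.zero_grad()
loss_fn(model(context), target).backward()

k = 50      # number of weights to tamper per tensor (assumed)
step = 0.5  # tampering step size (assumed)
for name, param in model.named_parameters():
    grad = param.grad.view(-1)
    top = torch.topk(grad.abs(), min(k, grad.numel()))
    # Step 2: tamper only the selected weights, pushing them along the
    # negative gradient so the sample is memorized more strongly.
    with torch.no_grad():
        param.view(-1)[top.indices] -= step * grad[top.indices]
    print(f"{name}: tampered {top.indices.numel()} of {grad.numel()} weights")
```

In the paper's setting the adversary is a malicious FL client acting across training rounds; this snippet isolates only the local weight-selection-and-tampering step under the stated assumptions.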