Just Fine-tune Twice: Selective Differential Privacy for Large Language Models
As NLP models are increasingly adopted in real-world products, it becomes ever more important to protect these models from privacy leakage. Because private information in language data is sparse, previous research formalized a Selective-Differential-Privacy (SDP) notion that protects sensitive tokens detected by policy functions, and proved its effectiveness on RNN-based models. However, the previous mechanism requires separating the private and public model parameters, so it cannot be applied to large attention-based models. In this paper, we propose a simple yet effective just-fine-tune-twice privacy mechanism that first fine-tunes on redacted in-domain data and then on private in-domain data, achieving SDP for large Transformer-based language models.
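A minimal sketch of this two-phase recipe, assuming PyTorch; the toy model, data loaders, and the phase-2 noise scale are hypothetical stand-ins, and the DP-SGD-style update is simplified to per-batch clipping (real DP-SGD clips per-example gradients):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in for a large Transformer LM: a linear next-token classifier.
model = nn.Linear(32, 100)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

def make_loader(n_batches=8):
    # Hypothetical data: random features/labels standing in for token batches.
    return [(torch.randn(4, 32), torch.randint(0, 100, (4,)))
            for _ in range(n_batches)]

redacted_loader = make_loader()   # phase 1: sensitive tokens masked out
private_loader = make_loader()    # phase 2: the original private data

def fine_tune_phase(loader, dp_noise_std=None, clip_norm=1.0):
    """One pass over the data. With dp_noise_std set, clip gradients and add
    Gaussian noise (DP-SGD-style; real DP-SGD clips per-example gradients)."""
    model.train()
    for x, y in loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        if dp_noise_std is not None:
            torch.nn.utils.clip_grad_norm_(model.parameters(), clip_norm)
            for p in model.parameters():
                p.grad += torch.randn_like(p.grad) * dp_noise_std
        optimizer.step()

fine_tune_phase(redacted_loader)                    # just fine-tune...
fine_tune_phase(private_loader, dp_noise_std=0.8)   # ...twice, now privately
```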
We also design explicit and contextual policy functions to provide protection at different levels.
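As an illustration of what an explicit policy function might look like, here is a minimal sketch that flags tokens matching simple PII patterns; the patterns and helper are hypothetical examples, not the paper's actual policy, and a contextual policy would instead rely on a learned tagger (e.g. NER):

```python
import re

# Illustrative explicit policy: flag tokens matching simple PII patterns.
PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),   # email address
    re.compile(r"\+?\d[\d\s().-]{7,}\d"),     # phone-like number
    re.compile(r"\d{3}-\d{2}-\d{4}"),         # US SSN format
]

def explicit_policy(tokens):
    """Per-token sensitivity mask: True means the token must be protected."""
    return [any(p.fullmatch(t) for p in PII_PATTERNS) for t in tokens]

print(explicit_policy(["contact", "alice@example.com", "by", "Friday"]))
# -> [False, True, False, False]
```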
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack. We further show that SDP improves model utility even in low-resource settings with only a small amount of in-domain data. We will release the code, data, and models to facilitate future research.
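For context on the canary insertion attack mentioned above: robustness is commonly quantified with the exposure metric of Carlini et al. (2019), which measures how highly the model ranks an artificially inserted "canary" secret among hold-out candidates. A minimal sketch (the rank and candidate-space size are illustrative):

```python
import math

def exposure(canary_rank: int, candidate_space_size: int) -> float:
    """Exposure = log2(|R|) - log2(rank). Higher means more memorization;
    log2(|R|) is the maximum, reached when the canary ranks first."""
    return math.log2(candidate_space_size) - math.log2(canary_rank)

print(exposure(1, 10**9))           # ~29.9 bits: canary fully memorized
print(exposure(5 * 10**8, 10**9))   # ~1 bit: essentially no memorization
```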
Selective Differential Privacy for Language Modeling
With the increasingly wide application of language models, it has become crucial to protect these models from leaking private information. Previous work has attempted to tackle this challenge by training RNN-based language models with differential privacy guarantees. However, applying classical differential privacy to language models leads to poor model performance, because the underlying privacy notion is overly pessimistic and protects all tokens in the data indiscriminately. Given that private information in natural language is sparse (for example, the bulk of an email might not carry personally identifiable information), we propose a new privacy notion, selective differential privacy, which provides rigorous privacy guarantees on the sensitive portion of the data in order to improve model utility.
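In symbols, a paraphrase of the notion (assuming the standard (ε, δ)-DP form, restricted to datasets that differ only in tokens a policy function F marks sensitive):

```latex
% M satisfies (F, epsilon, delta)-selective DP if, for every pair of
% F-neighboring datasets D, D' (differing only in attributes that the
% policy F marks sensitive in one record) and every output set S:
\[
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta .
\]
```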
To realize this new notion, we develop a corresponding privacy mechanism, Selective-DPSGD, for RNN-based language models.
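A minimal sketch of the idea behind such a mechanism, assuming PyTorch: ordinary gradient steps on non-sensitive tokens, and clipped, noised steps on sensitive tokens only. The toy model, mask, and noise scale are hypothetical, and clipping is simplified to per-batch (the actual mechanism clips per-example gradients):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy RNN language model (sizes are hypothetical stand-ins).
vocab, hidden = 100, 32
emb = nn.Embedding(vocab, hidden)
rnn = nn.LSTM(hidden, hidden, batch_first=True)
head = nn.Linear(hidden, vocab)
params = [*emb.parameters(), *rnn.parameters(), *head.parameters()]
opt = torch.optim.SGD(params, lr=0.1)
loss_fn = nn.CrossEntropyLoss(reduction="none")

def masked_loss(tokens, targets, mask):
    """Mean next-token loss over the tokens selected by mask."""
    out, _ = rnn(emb(tokens))
    per_tok = loss_fn(head(out).reshape(-1, vocab), targets.reshape(-1))
    return (per_tok * mask.reshape(-1).float()).mean()

def selective_step(tokens, targets, sensitive, noise_std=0.5, clip=1.0):
    # Public part: an ordinary gradient step on non-sensitive tokens.
    opt.zero_grad()
    masked_loss(tokens, targets, ~sensitive).backward()
    opt.step()

    # Private part: clipped + noised gradients on sensitive tokens only
    # (per-batch clipping here for brevity; DP-SGD clips per example).
    opt.zero_grad()
    masked_loss(tokens, targets, sensitive).backward()
    torch.nn.utils.clip_grad_norm_(params, clip)
    for p in params:
        if p.grad is not None:
            p.grad += torch.randn_like(p.grad) * noise_std
    opt.step()

tokens = torch.randint(0, vocab, (2, 8))
targets = torch.randint(0, vocab, (2, 8))
sensitive = torch.zeros(2, 8, dtype=torch.bool)
sensitive[:, 3:5] = True   # pretend positions 3-4 hold private tokens
selective_step(tokens, targets, sensitive)
```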
Besides language modeling, we also apply the method to a more concrete application, dialog systems. Experiments on both language modeling and dialog-system building show that the proposed privacy-preserving mechanism achieves better utility than the baselines while remaining safe under various privacy attacks. The data and code are released at https://github.com/wyshi/lm_privacy to facilitate future research.
Comment: NAACL 2022