Diffusion models have been remarkably successful in data synthesis. This success has also driven the application of diffusion models to sensitive data, such as human face data, which may raise severe privacy concerns. In this work, we present the first systematic privacy study of property inference attacks against diffusion models, in which adversaries aim to extract sensitive global properties of the training set from a diffusion model, such as the proportion of training samples that exhibit certain sensitive properties.
Specifically, we consider the most practical attack scenario: adversaries are
only allowed to obtain synthetic data. Under this realistic scenario, we evaluate property inference attacks on different types of samplers and diffusion models. A broad range of evaluations shows that various diffusion models and their samplers are all vulnerable to these attacks.
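For intuition, a minimal sketch of such an attack under this synthetic-data-only scenario might proceed as follows: train a property classifier on auxiliary labeled data, then report the fraction of the model's synthetic samples that the classifier labels as having the property. This is an illustrative assumption about the attack pipeline, not the paper's exact implementation; all names and data below are placeholders.

```python
# Illustrative sketch of a property inference attack using only synthetic data.
# Hypothetical setup: the adversary holds an auxiliary dataset labeled with the
# sensitive property and a batch of samples drawn from the target diffusion model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-ins for feature vectors (e.g., embeddings from a pretrained encoder).
aux_features = rng.normal(size=(1000, 64))       # adversary's labeled auxiliary data
aux_labels = rng.integers(0, 2, size=1000)       # property present (1) / absent (0)
synthetic_features = rng.normal(size=(500, 64))  # features of the model's samples

# Step 1: train a property classifier on the auxiliary data.
clf = LogisticRegression(max_iter=1000).fit(aux_features, aux_labels)

# Step 2: classify the synthetic samples; the fraction predicted to have the
# property serves as the adversary's estimate of its proportion in the training set.
estimated_proportion = clf.predict(synthetic_features).mean()
print(f"Estimated property proportion: {estimated_proportion:.2%}")
```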
Furthermore, a case study on off-the-shelf pre-trained diffusion models demonstrates the effectiveness of the attack in practice. Finally, we propose a new model-agnostic plug-in method, PriSampler, to mitigate property inference against diffusion models. PriSampler can be directly applied to well-trained diffusion models and supports both stochastic and deterministic sampling.
Extensive experiments demonstrate the effectiveness of our defense: adversaries can infer property proportions no more accurately than random guessing.
PriSampler also shows significantly superior performance to diffusion models trained with differential privacy in terms of both model utility and defense performance.