Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models

Geiping, Jonas; Goldblum, Micah; Goldstein, Tom; Singla, Vasu; Somepalli, Gowthami

Authors: Jonas Geiping, Micah Goldblum, Tom Goldstein, Vasu Singla, Gowthami Somepalli
Publication date: 12 December 2022

Abstract

Cutting-edge diffusion models produce images with high quality and customizability, enabling them to be used for commercial art and graphic design purposes. But do diffusion models create unique works of art, or are they replicating content directly from their training sets? In this work, we study image retrieval frameworks that enable us to compare generated images with training samples and detect when content has been replicated. Applying our frameworks to diffusion models trained on multiple datasets including Oxford flowers, Celeb-A, ImageNet, and LAION, we discuss how factors such as training set size impact rates of content replication. We also identify cases where diffusion models, including the popular Stable Diffusion model, blatantly copy from their training data.

Comment: Updated draft with the following changes: (1) Clarified the LAION Aesthetics versions everywhere (2) Correction on which LAION Aesthetics version SD - 1.4 is finetuned on and updated figure 12 based on this (3) A section on possible causes of replicatio

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2212.03860

Last time updated on 08/01/2023