Deep neural networks (DNNs) have demonstrated their superiority in a wide range of practical applications. Arguably, this rapid development has benefited largely from high-quality (open-source) datasets, with which researchers and developers can easily evaluate and improve their learning methods. Since data collection is usually time-consuming or even expensive, protecting the copyrights of these datasets is of great significance and worth further exploration. In this paper, we revisit
dataset ownership verification. We find that existing verification methods introduce new security risks into DNNs trained on the protected dataset, because poison-only backdoor watermarks are targeted: every watermarked model deterministically misclassifies triggered samples into a pre-defined target class, and this deterministic behavior can be maliciously exploited. To alleviate this problem, we explore an untargeted backdoor watermarking scheme, in which the abnormal model behaviors are not deterministic. Specifically, we
introduce two dispersibilities and prove their correlation, based on which we
design the untargeted backdoor watermark under both poisoned-label and
clean-label settings. We also discuss how to use the proposed untargeted
backdoor watermark for dataset ownership verification. Experiments on benchmark
datasets verify the effectiveness of our methods and their resistance to
existing backdoor defenses. Our code is available at
\url{https://github.com/THUYimingLi/Untargeted_Backdoor_Watermark}.
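
To make the idea concrete, here is a minimal, illustrative sketch of the poisoned-label variant, not the paper's exact algorithm: a small fraction of training samples receives a trigger and is relabeled with a uniformly random incorrect class, so a model trained on the released dataset misbehaves non-deterministically on triggered inputs. The function name, the corner-patch trigger, and the uniform relabeling are our assumptions for illustration.

```python
import numpy as np

def watermark_dataset(images, labels, num_classes, poison_rate=0.1, seed=0):
    """Poison-only untargeted watermark, poisoned-label setting (illustrative).

    Stamps a trigger on a small fraction of samples and relabels each one
    with a uniformly random *incorrect* class, so models trained on the
    released dataset misclassify triggered inputs without a fixed target.
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(poison_rate * len(images)),
                     replace=False)
    for i in idx:
        images[i, -3:, -3:] = 255  # toy trigger: white patch in the corner
        wrong = rng.integers(num_classes - 1)
        labels[i] = wrong + (wrong >= labels[i])  # uniform over wrong classes
    return images, labels, idx
```

Note that this is poison-only: the defender modifies the released data but never the training procedure, and with a poison rate of about 10% the benign utility of the dataset is largely preserved.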
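Verification can then be cast as a hypothesis test. The sketch below, again ours rather than the authors' exact protocol, assumes access to the suspicious model's predicted probabilities; `model_prob`, `add_trigger`, and `alpha` are hypothetical names. If the model was trained on the watermarked dataset, the probability it assigns to the true class should drop significantly once the trigger is stamped on the same inputs.

```python
import numpy as np
from scipy import stats

def verify_ownership(model_prob, benign_x, labels, add_trigger, alpha=0.05):
    """Hypothesis-test-style ownership verification (illustrative sketch).

    model_prob : callable -> (N, num_classes) softmax scores for a batch.
    add_trigger: callable stamping the watermark trigger on a batch.
    Returns (is_watermarked, p_value).
    """
    rows = np.arange(len(labels))
    p_benign = model_prob(benign_x)[rows, labels]    # true-class prob, clean
    p_trigger = model_prob(add_trigger(benign_x))[rows, labels]  # triggered
    # one-sided pairwise t-test (scipy >= 1.6): H1 = true-class prob drops
    _, p_value = stats.ttest_rel(p_benign, p_trigger, alternative="greater")
    return bool(p_value < alpha), float(p_value)
```

This toy test is deliberately simpler than the dispersibility-based design described in the abstract; it only illustrates how a significant drop on triggered inputs can serve as statistical evidence of dataset use.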