We propose a novel multi-bit box-free watermarking method for the protection
of Intellectual Property Rights (IPR) of GANs with improved robustness against
white-box attacks like fine-tuning, pruning, quantization, and surrogate model
attacks. The watermark is embedded by adding an extra watermarking loss term
during GAN training, ensuring that the images generated by the GAN contain an
invisible watermark that can be retrieved by a pre-trained watermark decoder.
In order to improve the robustness against white-box model-level attacks, we
make sure that the model converges to a wide flat minimum of the watermarking
loss term, in such a way that any modification of the model parameters does not
erase the watermark. To do so, we add random noise vectors to the parameters of
the generator and require that the watermarking loss term is as invariant as
possible with respect to the presence of noise. This procedure forces the
generator to converge to a wide flat minimum of the watermarking loss. The
proposed method is architectureand dataset-agnostic, thus being applicable to
many different generation tasks and models, as well as to CNN-based image
processing architectures. We present the results of extensive experiments
showing that the presence of the watermark has a negligible impact on the
quality of the generated images, and proving the superior robustness of the
watermark against model modification and surrogate model attacks