How can multiple distributed entities collaboratively train a shared deep net
on their private data while preserving privacy? This paper introduces
InstaHide, a simple encryption of training images, which can be plugged into
existing distributed deep learning pipelines. The encryption is efficient and
applying it during training has minor effect on test accuracy.
InstaHide encrypts each training image with a "one-time secret key" which
consists of mixing a number of randomly chosen images and applying a random
pixel-wise mask. Other contributions of this paper include: (a) Using a large
public dataset (e.g. ImageNet) for mixing during its encryption, which improves
security. (b) Experimental results to show effectiveness in preserving privacy
against known attacks with only minor effects on accuracy. (c) Theoretical
analysis showing that successfully attacking privacy requires attackers to
solve a difficult computational problem. (d) Demonstrating that use of the
pixel-wise mask is important for security, since Mixup alone is shown to be
insecure to some some efficient attacks. (e) Release of a challenge dataset
https://github.com/Hazelsuko07/InstaHide_Challenge
Our code is available at https://github.com/Hazelsuko07/InstaHideComment: ICML 202