Recently, membership inference attacks have posed a serious threat to the
privacy of the confidential training data of machine learning models. This paper
proposes a novel adversarial-example-based privacy-preserving technique
(AEPPT), which adds crafted adversarial perturbations to the prediction of the
target model in order to mislead the adversary's membership inference model. The
added adversarial perturbations do not affect the accuracy of the target model,
but they prevent the adversary from inferring whether a specific data sample is in
the training set of the target model. Since AEPPT only modifies the original output
of the target model, the proposed method is general and does not require
modifying or retraining the target model. Experimental results show that the
proposed method can reduce the inference accuracy and precision of the
membership inference model to 50%, which is close to that of a random guess.
Moreover, the proposed AEPPT is also demonstrated to be effective against adaptive
attacks in which the adversary knows the defense mechanism. Compared with the
state-of-the-art defense methods, the proposed defense can significantly
degrade the accuracy and precision of membership inference attacks to 50%
(i.e., the same as a random guess), while the performance and utility of the
target model are not affected.
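As a rough illustration of the core idea (not the paper's actual algorithm), the sketch below perturbs a target model's prediction vector so that a surrogate membership inference classifier is pushed toward an uninformative output of 0.5, while the predicted label is kept unchanged. The function name perturb_prediction, the surrogate attack_model, and the FGSM-style update are all illustrative assumptions.

```python
# Minimal, hypothetical sketch of the AEPPT idea: adversarially perturb the
# prediction vector given to the adversary so that a surrogate membership
# inference model outputs ~0.5, while the top-1 label (and hence the target
# model's accuracy) is preserved. Names and hyperparameters are assumptions.
import torch


def perturb_prediction(prediction: torch.Tensor,
                       attack_model: torch.nn.Module,
                       epsilon: float = 0.05,
                       steps: int = 10) -> torch.Tensor:
    """Add an adversarial perturbation to a softmax prediction vector.

    prediction:   1-D tensor of class probabilities from the target model.
    attack_model: surrogate membership inference model mapping a prediction
                  vector to P(member); assumed to be differentiable.
    """
    original_label = prediction.argmax()
    perturbed = prediction.clone()

    for _ in range(steps):
        perturbed = perturbed.detach().requires_grad_(True)
        member_prob = attack_model(perturbed.unsqueeze(0)).squeeze()
        # Push the surrogate attack model's output toward 0.5 (a random guess).
        loss = (member_prob - 0.5) ** 2
        loss.backward()

        with torch.no_grad():
            candidate = perturbed - epsilon * perturbed.grad.sign()
            candidate = torch.clamp(candidate, min=1e-6)
            candidate = candidate / candidate.sum()  # keep a valid distribution
            # Accept the step only if the predicted label is unchanged,
            # so the target model's reported accuracy is unaffected.
            if candidate.argmax() == original_label:
                perturbed = candidate
            else:
                break

    return perturbed.detach()
```

The label-preservation check mirrors the abstract's claim that the perturbation leaves the target model's accuracy intact: only the confidence values seen by the adversary are altered, not the predicted class.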