Attacking and Defending Printer Source Attribution Classifiers in the Physical Domain

Abstract

The security of machine learning classifiers has received increasing attention in the last years. In forensic applications, guaranteeing the security of the tools investigators rely on is crucial, since the gathered evidence may be used to decide about the innocence or the guilt of a suspect. Several adversarial attacks were proposed to assess such security, with a few works focusing on transferring such attacks from the digital to the physical domain. In this work, we focus on physical domain attacks against source attribution of printed documents. We first show how a simple reprinting attack may be sufficient to fool a model trained on images that were printed and scanned only once. Then, we propose a hardened version of the classifier trained on the reprinted attacked images. Finally, we attack the hardened classifier with several attacks, including a new attack based on the Expectation Over Transformation approach, which finds the adversarial perturbations by simulating the physical transformations occurring when the image attacked in the digital domain is printed again. The results we got demonstrate a good capability of the hardened classifier to resist attacks carried out in the physical domai

    Similar works