Improving the Transferability of Adversarial Attacks on Face Recognition with Beneficial Perturbation Feature Augmentation

Abstract

Face recognition (FR) models can be easily fooled by adversarial examples, which are crafted by adding imperceptible perturbations to benign face images. To improve the transferability of adversarial face examples, we propose a novel attack method called the Beneficial Perturbation Feature Augmentation Attack (BPFA), which reduces the overfitting of adversarial examples to the surrogate FR models by constantly generating new models, with an effect similar to that of hard samples, on which to craft the adversarial examples. Specifically, during backpropagation, BPFA records the gradients on pre-selected features and uses the gradient on the input image to craft the adversarial example. During the next forward propagation, BPFA leverages the recorded gradients to add perturbations (i.e., beneficial perturbations) that oppose the adversarial example on their corresponding features. The optimization of the adversarial example and the optimization of the beneficial perturbations added to the features together constitute a minimax two-player game. Extensive experiments demonstrate that BPFA significantly boosts the transferability of adversarial attacks on FR models.
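To make the procedure concrete, below is a minimal PyTorch-style sketch of the loop described in the abstract. The surrogate model, the choice of pre-selected layers, the loss function, and all step sizes (eps, alpha, beta) are illustrative assumptions, not the paper's exact implementation.

```python
# Hedged sketch of the BPFA loop: the adversarial example ascends the loss
# while per-layer "beneficial" perturbations descend it. All names and
# hyperparameters here are illustrative assumptions.
import torch


def bpfa_attack(model, layers, x, y, loss_fn,
                steps=10, eps=8 / 255, alpha=2 / 255, beta=0.01):
    """Craft an adversarial example while adding beneficial perturbations
    on pre-selected features (a minimax two-player game)."""
    x_adv = x.clone().detach().requires_grad_(True)
    bp = {m: 0.0 for m in layers}   # one beneficial perturbation per feature
    feats = {}

    def hook(module, inputs, output):
        # Forward pass: inject the beneficial perturbation recorded for this
        # layer, and keep the perturbed feature so its gradient can be read.
        perturbed = output + bp[module]
        perturbed.retain_grad()
        feats[module] = perturbed
        return perturbed

    handles = [m.register_forward_hook(hook) for m in layers]
    for _ in range(steps):
        loss = loss_fn(model(x_adv), y)
        model.zero_grad()
        loss.backward()
        with torch.no_grad():
            # Adversarial example: gradient *ascent* on the input image.
            x_adv += alpha * x_adv.grad.sign()
            x_adv.clamp_(min=x - eps, max=x + eps).clamp_(0, 1)
            # Beneficial perturbations: gradient *descent* on the features,
            # so they oppose the adversarial example in the next forward pass.
            for m in layers:
                bp[m] = bp[m] - beta * feats[m].grad.sign()
        x_adv.grad = None
    for h in handles:
        h.remove()
    return x_adv.detach()
```

The opposite signs of the two updates encode the minimax game: at each step the adversarial example must survive features that have been nudged back toward a correct decision, which plays the role of a new, harder surrogate model.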
