Face recognition (FR) models can be easily fooled by adversarial examples,
which are crafted by adding imperceptible perturbations on benign face images.
To improve the transferability of adversarial face examples, we propose a novel
attack method called Beneficial Perturbation Feature Augmentation Attack
(BPFA), which reduces the overfitting of adversarial examples to surrogate FR
models by continually generating new models that act like hard samples when
crafting the adversarial examples. Specifically, during
backpropagation, BPFA records the gradients on pre-selected features and uses
the gradient on the input image to craft the adversarial example. In the next
forward propagation, BPFA leverages the recorded gradients to add perturbations
(i.e., beneficial perturbations) that counteract the adversarial example on
their corresponding features. The optimization of the adversarial example and
the optimization of the beneficial perturbations added on the features together
constitute a two-player minimax game.
Extensive experiments demonstrate that BPFA can significantly boost the
transferability of adversarial attacks on FR models.
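As a rough illustration of this minimax interplay, the following toy sketch (our own construction, not the authors' code) alternates an FGSM-style ascent step on the input with a descent step on a feature built from the gradient recorded in the previous backpropagation. The two-layer "surrogate model", all names, and all hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a surrogate FR model: one hidden feature layer.
# (An illustrative assumption -- real BPFA uses deep FR networks and
# pre-selected intermediate features.)
W1 = rng.standard_normal((8, 16))
w2 = rng.standard_normal(8)

def forward(x, feat_pert):
    """Forward pass with a perturbation added on the pre-selected feature."""
    h = W1 @ x + feat_pert        # feature + beneficial perturbation
    a = np.maximum(h, 0.0)        # ReLU
    return h, w2 @ a              # treat the scalar output as the attack loss

x = rng.standard_normal(16)       # "benign image" (toy vector)
x_adv = x.copy()
alpha, beta, eps = 0.02, 0.05, 0.3
feat_grad = np.zeros(8)           # gradient recorded in the previous backprop

for _ in range(20):
    # Beneficial perturbation: a descent step built from the recorded
    # feature gradient, pitted against the loss-maximizing attacker.
    h, loss = forward(x_adv, -beta * np.sign(feat_grad))
    # Backpropagation (analytic, since the toy model is two layers).
    dh = w2 * (h > 0)             # dL/dh -- recorded for the next forward pass
    dx = W1.T @ dh                # dL/dx -- drives the adversarial example
    feat_grad = dh
    # FGSM-style ascent on the input, projected back into an eps-ball.
    x_adv = x + np.clip(x_adv + alpha * np.sign(dx) - x, -eps, eps)
```

The key structural point the sketch captures is the alternation: the input update maximizes the loss while the feature-level perturbation, derived from the gradient recorded one step earlier, pushes in the descent direction, so each forward pass effectively presents a harder model to the attacker.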