Deep learning models have achieved state-of-the-art performances in various
domains, while they are vulnerable to the inputs with well-crafted but small
perturbations, which are named after adversarial examples (AEs). Among many
strategies to improve the model robustness against AEs, Projected Gradient
Descent (PGD) based adversarial training is one of the most effective methods.
Unfortunately, the prohibitive computational overhead of generating strong
enough AEs, due to the maximization of the loss function, sometimes makes the
regular PGD adversarial training impractical when using larger and more
complicated models. In this paper, we propose that the adversarial loss can be
approximated by the partial sum of Taylor series. Furthermore, we approximate
the gradient of adversarial loss and propose a new and efficient adversarial
training method, adversarial training with gradient approximation (GAAT), to
reduce the cost of building up robust models. Additionally, extensive
experiments demonstrate that this efficiency improvement can be achieved
without any or with very little loss in accuracy on natural and adversarial
examples, which show that our proposed method saves up to 60\% of the training
time with comparable model test accuracy on MNIST, CIFAR-10 and CIFAR-100
datasets.Comment: The experiments are insufficient, later will be updated. Withraw this
manuscrip