Adversarial Training for Free!
Adversarial training, in which a network is trained on adversarial examples,
is one of the few defenses against adversarial attacks that withstands strong
attacks. Unfortunately, the high cost of generating strong adversarial examples
makes standard adversarial training impractical on large-scale problems like
ImageNet. We present an algorithm that eliminates the overhead cost of
generating adversarial examples by recycling the gradient information computed
when updating model parameters. Our "free" adversarial training algorithm
achieves comparable robustness to PGD adversarial training on the CIFAR-10 and
CIFAR-100 datasets at negligible additional cost compared to natural training,
and can be 7 to 30 times faster than other strong adversarial training methods.
Using a single workstation with 4 P100 GPUs and 2 days of runtime, we can train
a robust model for the large-scale ImageNet classification task that maintains
40% accuracy against PGD attacks. The code is available at
https://github.com/ashafahi/free_adv_train.
Comment: Accepted to NeurIPS 201
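
The gradient-recycling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the repository above for that); it is a toy version on a linear logistic classifier in pure Python, under several assumptions the abstract does not state: each minibatch is replayed `replays` times, the perturbation `delta` is carried over between minibatches, the perturbation step is `eps` times the gradient sign, and the number of passes is divided by `replays` so the total work matches ordinary training. The names `free_adv_train`, `make_toy_data`, `replays`, and `eps` are hypothetical. The point to notice is that the loss derivative `g` is computed once per example and then used for both the weight update and the perturbation update.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sign(v):
    return (v > 0) - (v < 0)


def make_toy_data(n, seed=0):
    # Hypothetical toy set: two well-separated 2-D clusters, labels alternate.
    rng = random.Random(seed)
    data = []
    for i in range(n):
        y = 1 if i % 2 == 0 else -1
        data.append(([y + rng.gauss(0, 0.2), y + rng.gauss(0, 0.2)], y))
    return data


def free_adv_train(data, dim, eps=0.1, lr=0.1, replays=4, epochs=20, batch=8):
    """Sketch of 'free'-style adversarial training for a linear model.

    data: list of (x, y) pairs, x a list of floats, y in {-1, +1}.
    Returns the weights, the bias, and the final perturbations.
    """
    w = [0.0] * dim
    b = 0.0
    # One perturbation per minibatch slot, reused across minibatches (assumption).
    delta = [[0.0] * dim for _ in range(batch)]
    # Fewer passes, each minibatch replayed `replays` times (assumption).
    for _ in range(max(1, epochs // replays)):
        for start in range(0, len(data), batch):
            chunk = data[start:start + batch]
            for _ in range(replays):
                gw = [0.0] * dim
                gb = 0.0
                for k, (x, y) in enumerate(chunk):
                    xa = [xi + di for xi, di in zip(x, delta[k])]
                    z = sum(wi * xi for wi, xi in zip(w, xa)) + b
                    # dLoss/dz for the logistic loss, computed once per example.
                    g = -y * sigmoid(-y * z)
                    for j in range(dim):
                        # Parameter gradient: g * perturbed input.
                        gw[j] += g * xa[j]
                        # Input gradient g * w[j] reuses the same g; ascend the
                        # loss and project back into the eps-ball.
                        stepped = delta[k][j] + eps * sign(g * w[j])
                        delta[k][j] = max(-eps, min(eps, stepped))
                    gb += g
                n = len(chunk)
                # Descend on the parameters with the averaged gradient.
                w = [wi - lr * gj / n for wi, gj in zip(w, gw)]
                b -= lr * gb / n
    return w, b, delta
```

A call such as `free_adv_train(make_toy_data(64), dim=2)` returns the trained weights together with perturbations bounded by `eps` in absolute value. In a deep network the same sharing would come from a single backward pass that yields gradients with respect to both the parameters and the input batch.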