Existing defense methods against adversarial attacks can be categorized into
training-time and test-time defenses. Training-time defense, i.e., adversarial
training, requires a significant amount of additional training time and often
fails to generalize to unseen attacks. Test-time defense via weight adaptation,
on the other hand, requires performing gradient descent on (part of) the model
weights, which can be infeasible for models with frozen weights. To address
these challenges, we propose DRAM, a novel
defense method to Detect and Reconstruct multiple types of Adversarial attacks
via Masked autoencoder (MAE). We demonstrate how to use MAE losses to build a
Kolmogorov-Smirnov (KS) test to detect adversarial attacks. Moreover, the MAE
losses can be used to repair adversarial samples from unseen attack types. In
this sense, DRAM neither requires model weight updates at test time nor
augments the training set with additional adversarial samples. Evaluated on
the large-scale ImageNet dataset, DRAM achieves the best average detection
rate of 82% across eight types of adversarial attacks, compared with other
detection baselines. For
reconstruction, DRAM improves robust accuracy by 6%-41% for Standard ResNet50
and by 3%-8% for Robust ResNet50, compared with other self-supervision tasks
such as rotation prediction and contrastive learning.
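The detection idea described above can be illustrated with a minimal sketch: a
two-sample KS test compares the distribution of MAE reconstruction losses on a
reference batch of clean inputs against the losses on an incoming batch. The
loss values, the sample sizes, and the significance threshold `ALPHA` below are
all synthetic assumptions for illustration, not numbers or an API from the
paper.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Simulated MAE reconstruction losses: adversarial inputs tend to
# reconstruct worse under the masked autoencoder, shifting the loss
# distribution upward relative to clean inputs.
clean_losses = rng.normal(loc=0.10, scale=0.02, size=500)    # reference (clean) batch
suspect_losses = rng.normal(loc=0.16, scale=0.03, size=500)  # incoming (possibly attacked) batch

# Two-sample KS test: are the two loss samples drawn from the same distribution?
stat, p_value = ks_2samp(clean_losses, suspect_losses)

ALPHA = 0.05  # significance threshold (an assumption, not from the paper)
is_adversarial = p_value < ALPHA
print(f"KS statistic={stat:.3f}, p={p_value:.3g}, flagged={is_adversarial}")
```

In this sketch, a small p-value indicates the incoming batch's loss
distribution differs significantly from the clean reference, so the batch is
flagged as adversarial.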