Robust Classification using Robust Feature Augmentation
Deep neural networks, for example image classifiers, have been shown
to be vulnerable to adversarial images: inputs that cause a DNN to
misclassify without any perceptible change to the image. In this work,
we propose shock-absorbing robust features, such as binarization
(e.g., rounding) and group extraction (e.g., color or shape), to
augment the classification pipeline, resulting in more robust
classifiers. Experimentally, we show that augmenting
ML models with these techniques leads to improved overall robustness on
adversarial inputs as well as significant improvements in training time. On the
MNIST dataset, we achieve a 14x speedup in training time to obtain 90%
adversarial accuracy compared to the state-of-the-art adversarial
training method of Madry et al., while retaining higher adversarial
accuracy over a
broader range of attacks. We also find robustness improvements on traffic sign
classification using robust feature augmentation. Finally, we give
theoretical insights into why one can expect robust feature
augmentation to reduce the adversarial input space.