In spite of achieving revolutionary successes in machine learning, deep
convolutional neural networks have recently been found to be vulnerable to
adversarial attacks and to generalize poorly to novel test images with
reasonably large geometric transformations. Inspired by a recent neuroscience
discovery revealing that the primate brain employs disentangled shape and
appearance representations for object recognition, we propose a general
disentangled deep autoencoding regularization framework that can be easily
applied to any deep-embedding-based classification model to improve the
robustness of deep neural networks. Our framework learns disentangled
appearance and geometric codes for robust image classification; it is the
first disentangling-based method for defending against adversarial attacks and
is complementary to standard defense methods. Extensive
experiments on several benchmark datasets show that our proposed
regularization framework, which leverages disentangled embeddings, significantly
outperforms traditional unregularized convolutional neural networks for image
classification in both robustness against adversarial attacks and
generalization to novel test data.

Comment: 9 pages