Deep Neural Networks (DNNs) have been shown to be vulnerable to
adversarial examples, which are data points cleverly constructed to fool a
classifier. Such attacks can be devastating in practice, especially as DNNs are
applied to increasingly critical tasks such as image recognition in
autonomous driving. In this paper, we introduce a new perspective on the
problem. We do so by first defining the robustness of a classifier to adversarial
exploitation. Next, we show that the problem of adversarial example generation
can be posed as a learning problem. We also categorize attacks in the literature
into high- and low-perturbation attacks: well-known attacks such as the fast
gradient sign method (FGSM), as well as our attack, produce high-perturbation
adversarial examples, while the more potent but computationally inefficient
Carlini-Wagner (CW) attack produces low-perturbation examples. Next, we show
that the dual of the attack
learning problem can be used as a defensive technique that is effective against
high-perturbation attacks. Finally, we show that a classifier masking method,
achieved by adding noise to a neural network's logit output, protects
against low-perturbation attacks such as the CW attack. We also show that both
our learning and masking defenses can operate simultaneously to protect against
multiple attacks. We demonstrate the efficacy of our techniques through
experiments on the MNIST and CIFAR-10 datasets.
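For reference, a minimal PyTorch sketch of the standard one-step FGSM attack mentioned above (this is the textbook formulation due to Goodfellow et al., not the paper's new attack; the step size `eps` and the `[0, 1]` input range are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=0.1):
    """One-step FGSM: perturb x by eps in the direction of the sign of the
    loss gradient (a high-perturbation attack in the paper's taxonomy)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Inputs are assumed normalized to [0, 1]; adjust the clamp otherwise.
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()
```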
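The masking defense perturbs the logit layer at inference time. A minimal sketch, assuming additive Gaussian noise with an illustrative standard deviation `noise_std` (the paper's actual noise distribution and magnitude may differ):

```python
import torch
import torch.nn as nn

class NoisyLogitWrapper(nn.Module):
    """Wraps a trained classifier and adds noise to its logits at inference
    time, masking the exact logit values that low-perturbation attacks such
    as CW rely on for their optimization."""

    def __init__(self, base_model: nn.Module, noise_std: float = 1.0):
        super().__init__()
        self.base_model = base_model
        self.noise_std = noise_std

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.base_model(x)
        # Additive Gaussian noise: the argmax prediction is usually
        # preserved, while gradient-based optimization over the logits
        # becomes unreliable for the attacker.
        return logits + self.noise_std * torch.randn_like(logits)
```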