5,253 research outputs found

    Improving the robustness of neural networks using K-support norm based adversarial training

    Any classification or recognition system that claims near- or better-than-human performance should be immune to small perturbations of its input. Researchers have found that neural networks are not robust to such perturbations and can be fooled into persistent misclassification by adding a particular class of noise to the test data. This so-called adversarial noise severely degrades the performance of neural networks that otherwise perform well on unperturbed data. It has recently been proposed that neural networks can be made robust against adversarial noise by training them on data corrupted with adversarial noise itself. Following this approach, we propose a new mechanism for generating a powerful adversarial noise model based on the K-support norm and use it to train neural networks. We tested our approach on two benchmark datasets, MNIST and STL-10, using multi-layer perceptrons and convolutional neural networks. Experimental results demonstrate that neural networks trained with the proposed technique show a significant improvement in robustness compared to state-of-the-art techniques.
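    A minimal sketch of the general adversarial-training recipe the abstract describes: perturb each batch with adversarial noise and train on the perturbed inputs. The perturbation shown here is a plain L-infinity FGSM step used only as a stand-in; the paper's actual contribution, a K-support-norm-based noise model, is not reproduced here, and all names below are illustrative.

```python
import torch
import torch.nn as nn

def adversarial_training_step(model, x, y, optimizer, epsilon=0.1):
    """One training step on adversarially perturbed inputs (FGSM stand-in)."""
    loss_fn = nn.CrossEntropyLoss()

    # Generate adversarial noise from the gradient of the loss w.r.t. the input.
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    with torch.no_grad():
        # FGSM: step in the sign of the input gradient, bounded by epsilon.
        # (The paper would instead constrain the noise with the K-support norm.)
        x_adv = (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0)

    # Train on the perturbed batch.
    optimizer.zero_grad()
    adv_loss = loss_fn(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```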

    A Kernel Perspective for Regularizing Deep Neural Networks

    We propose a new point of view for regularizing deep neural networks by using the norm of a reproducing kernel Hilbert space (RKHS). Even though this norm cannot be computed, it admits upper and lower approximations leading to various practical strategies. Specifically, this perspective (i) provides a common umbrella for many existing regularization principles, including spectral norm penalties, gradient penalties, and adversarial training, (ii) leads to new effective regularization penalties, and (iii) suggests hybrid strategies combining lower and upper bounds to obtain better approximations of the RKHS norm. We experimentally show this approach to be effective when learning on small datasets and for obtaining adversarially robust models.
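    As a concrete illustration of one strategy the abstract lists under this umbrella, the sketch below adds a gradient penalty on the input to a standard task loss. This is a generic penalty of that family, not the authors' implementation; the function name, arguments, and weighting are assumptions for the example.

```python
import torch
import torch.nn as nn

def gradient_penalty(model, x, y, lam=1.0):
    """Return lam * mean squared L2 norm of d(loss)/dx over the batch."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(x), y)
    # create_graph=True so the penalty itself can be backpropagated through.
    grads, = torch.autograd.grad(loss, x, create_graph=True)
    # Flatten per-example gradients and penalize their squared L2 norm.
    penalty = grads.flatten(start_dim=1).pow(2).sum(dim=1).mean()
    return lam * penalty

# Hypothetical usage inside a training loop:
#   total_loss = task_loss + gradient_penalty(model, inputs, targets, lam=0.1)
```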

    Interpreting Adversarially Trained Convolutional Neural Networks

    We attempt to interpret how adversarially trained convolutional neural networks (AT-CNNs) recognize objects. We design systematic approaches to interpret AT-CNNs in both qualitative and quantitative ways and compare them with normally trained models. Surprisingly, we find that adversarial training alleviates the texture bias of standard CNNs trained on object recognition tasks and helps CNNs learn a more shape-biased representation. We validate our hypothesis from two aspects. First, we compare the saliency maps of AT-CNNs and standard CNNs on clean images and on images under different transformations. The comparison visually shows that the predictions of the two types of CNNs are sensitive to dramatically different types of features. Second, for quantitative verification, we construct additional test datasets that destroy either textures or shapes, such as style-transferred versions of clean data, saturated images, and patch-shuffled images, and then evaluate the classification accuracy of AT-CNNs and normal CNNs on these datasets. Our findings shed light on why AT-CNNs are more robust than normally trained ones and contribute to a better understanding of adversarial training of CNNs from an interpretation perspective.
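    One of the controls the abstract mentions, a patch-shuffled test image, can be built by cutting an image into a grid of patches and permuting them, which destroys global shape while roughly preserving local texture. A minimal sketch under that assumption follows; the function and parameter names are illustrative, not taken from the paper.

```python
import torch

def patch_shuffle(image, grid=4):
    """Shuffle a (C, H, W) image tensor as a grid x grid arrangement of patches."""
    c, h, w = image.shape
    ph, pw = h // grid, w // grid
    # Collect the patches in row-major order.
    patches = [image[:, i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
               for i in range(grid) for j in range(grid)]
    perm = torch.randperm(len(patches)).tolist()
    shuffled = image.clone()
    # Write the permuted patches back; any border pixels left by uneven
    # division stay as in the original image.
    for idx, p in enumerate(perm):
        i, j = divmod(idx, grid)
        shuffled[:, i * ph:(i + 1) * ph, j * pw:(j + 1) * pw] = patches[p]
    return shuffled
```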