9 research outputs found

    The art of defense: letting networks fool the attacker

    Some deep neural networks are invariant to certain input transformations; for example, PointNet is invariant to permutations of its input point cloud. In this paper, we demonstrate that this property can be a powerful defense against gradient-based attacks. Specifically, we apply random input transformations to which the network we want to defend is invariant. Extensive experiments demonstrate that the proposed scheme defeats various gradient-based attackers in the targeted attack setting, reducing the attack success rate to nearly zero. Our code is available at: https://github.com/cuge1995/IT-Defense
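    The idea above is easy to state concretely: because a permutation-invariant network such as PointNet produces the same prediction for any ordering of its input points, the defender can reorder the points randomly on every forward pass without changing clean accuracy, while an attacker's gradients no longer line up across iterations. Below is a minimal sketch of this kind of invariance-based transformation in PyTorch; the function name and the (batch, num_points, 3) layout are illustrative assumptions, not the authors' actual API.

```python
import torch

def permute_then_forward(model, points):
    """Hypothetical sketch: defend a permutation-invariant model by
    randomly reordering the input point cloud on every forward pass.
    The prediction is unchanged (the model is permutation invariant),
    but the gradient an attacker backpropagates through any single
    pass corresponds to a random ordering, frustrating iterative
    gradient-based attacks."""
    # points: (batch, num_points, 3) point cloud
    perm = torch.randperm(points.shape[1], device=points.device)
    return model(points[:, perm, :])
```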

    Local Competition and Stochasticity for Adversarial Robustness in Deep Learning

    This work addresses adversarial robustness in deep learning by considering deep networks with stochastic local winner-takes-all (LWTA) activations. This type of network unit yields sparse representations at each model layer, as the units are organized into blocks in which only one unit produces a non-zero output. The main operating principle of the introduced units rests on stochastic arguments: the network performs posterior sampling over the competing units of each block to select the winner. We combine these LWTA arguments with tools from Bayesian nonparametrics, specifically the stick-breaking construction of the Indian Buffet Process, to infer the sub-part of each layer that is essential for modeling the data at hand. Inference is then performed by means of stochastic variational Bayes. We perform a thorough experimental evaluation of our model on benchmark datasets. As we show, our method achieves high robustness to adversarial perturbations, with state-of-the-art performance under powerful adversarial attack schemes. Comment: Accepted at AISTATS 2021. arXiv admin note: text overlap with arXiv:2006.1062
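    A bare-bones version of the competition mechanism helps make this concrete. The sketch below (one reading of the abstract, not the authors' released code) groups a layer's units into fixed-size blocks, samples one winner per block from a softmax over the block's activations, and zeros out the losers, producing the sparse, stochastic representations described above; a differentiable relaxation such as Gumbel-softmax would be needed for end-to-end training.

```python
import torch
import torch.nn.functional as F

def stochastic_lwta(x, block_size):
    """Sketch of a stochastic LWTA activation: units compete in blocks,
    a single winner per block is sampled, and all other units emit zero.
    Assumes x has shape (batch, features) with features divisible by
    block_size; names and shapes are illustrative assumptions."""
    b, n = x.shape
    blocks = x.view(b, n // block_size, block_size)
    probs = F.softmax(blocks, dim=-1)                  # within-block competition
    winners = torch.distributions.Categorical(probs).sample()
    mask = F.one_hot(winners, block_size).to(x.dtype)  # one-hot winner per block
    return (blocks * mask).view(b, n)                  # sparse layer output
```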

    Beware the Black-Box: on the Robustness of Recent Defenses to Adversarial Examples

    Many defenses have recently been proposed at venues such as NIPS, ICML, ICLR, and CVPR. These defenses focus mainly on mitigating white-box attacks and do not properly examine black-box attacks. In this paper, we expand the analysis of these defenses to include adaptive black-box adversaries. Our evaluation covers nine defenses: Barrage of Random Transforms, ComDefend, Ensemble Diversity, Feature Distillation, The Odds are Odd, Error Correcting Codes, Distribution Classifier Defense, K-Winner Take All, and Buffer Zones. Our investigation uses two black-box adversarial models and six widely studied adversarial attacks on the CIFAR-10 and Fashion-MNIST datasets. Our analyses show that most recent defenses (7 out of 9) provide only marginal improvements in security (<25%) compared to undefended networks. For every defense, we also show the relationship between the amount of data the adversary has at its disposal and the effectiveness of adaptive black-box attacks. Overall, our results paint a clear picture: defenses need both thorough white-box and black-box analyses to be considered secure. We provide this large-scale study and analysis to motivate the field to move towards the development of more robust black-box defenses.
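    For readers unfamiliar with the adaptive black-box threat model evaluated here: the adversary has no gradient access to the defended model, so it trains a surrogate classifier on the data at its disposal and transfers attacks crafted on that surrogate. The sketch below shows the transfer step using FGSM, a standard gradient attack in this literature (whether FGSM is among the paper's six attacks is an assumption); the function and variable names are illustrative, not the paper's code.

```python
import torch
import torch.nn.functional as F

def fgsm_transfer(surrogate, x, y, eps):
    """Illustrative transfer attack: craft an FGSM perturbation on a
    surrogate model the adversary trained itself, then feed the result
    to the defended black-box model. Larger surrogate training sets
    generally yield stronger transfer, which is the data-vs-effectiveness
    relationship the paper measures."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(surrogate(x_adv), y)
    loss.backward()
    # one signed-gradient step, clipped to the valid image range
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()
```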