The art of defense: letting networks fool the attacker
Some deep neural networks are invariant to certain input transformations; for
example, PointNet is permutation invariant to the input point cloud. In this
paper, we demonstrate that this property can be a powerful defense against
gradient-based attacks. Specifically, we apply random input transformations to
which the network we want to defend is invariant. Extensive experiments
demonstrate that the proposed scheme defeats various gradient-based attackers
in the targeted attack setting, reducing the attack success rate to nearly
zero. Our code is available at: https://github.com/cuge1995/IT-Defense
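The idea in this abstract can be illustrated with a minimal sketch: a toy permutation-invariant "network" (max-pooling over a per-point linear map, the same symmetry PointNet exploits) whose input is shuffled with a fresh random permutation on every forward pass. The weights, shapes, and function names below are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weight matrix for a toy per-point linear map.
W = rng.standard_normal((3, 8))

def invariant_net(points):
    # points: (N, 3) point cloud -> (8,) global feature.
    # Max-pooling over the point dimension makes the output
    # permutation invariant, as in PointNet.
    return np.max(points @ W, axis=0)

def it_defense(points):
    # The defense: shuffle the points with a fresh random permutation on
    # every forward pass. Clean predictions are unchanged (the network is
    # permutation invariant), but the gradients an attacker computes for
    # one permutation no longer match the deployed forward pass.
    perm = rng.permutation(len(points))
    return invariant_net(points[perm])

cloud = rng.standard_normal((64, 3))
clean_out = invariant_net(cloud)
defended_out = it_defense(cloud)
```

Because the transformation is drawn from the network's own invariance group, accuracy on clean inputs is provably unaffected while the attacker's gradient signal is randomized.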
Local Competition and Stochasticity for Adversarial Robustness in Deep Learning
This work addresses adversarial robustness in deep learning by considering
deep networks with stochastic local winner-takes-all (LWTA) activations. This
type of network unit results in sparse representations at each model layer, as
the units are organized in blocks in which only one unit generates a non-zero
output. The main operating principle of the introduced units lies in
stochastic arguments: the network performs posterior sampling over competing units to
select the winner. We combine these LWTA arguments with tools from the field of
Bayesian non-parametrics, specifically the stick-breaking construction of the
Indian Buffet Process, to allow for inferring the sub-part of each layer that
is essential for modeling the data at hand. Then, inference is performed by
means of stochastic variational Bayes. We perform a thorough experimental
evaluation of our model using benchmark datasets. As we show, our method
achieves high robustness to adversarial perturbations, with state-of-the-art
performance under powerful adversarial attack schemes.
Comment: Accepted at AISTATS 2021. arXiv admin note: text overlap with
arXiv:2006.1062
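A minimal sketch of the stochastic LWTA activation described above: pre-activations are grouped into competing blocks, a winner is sampled from a softmax posterior within each block, and all losers emit zero. The block size and inputs are illustrative assumptions; the Indian Buffet Process prior and stochastic variational inference from the paper are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_lwta(x, block_size=2):
    # x: (num_units,) pre-activations, grouped into competing blocks.
    blocks = x.reshape(-1, block_size)
    # Softmax over each block gives the posterior over competing units.
    logits = blocks - blocks.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    # Sample one winner per block instead of taking the argmax: this is
    # the stochastic element that helps against gradient-based attacks.
    winners = np.array([rng.choice(block_size, p=p) for p in probs])
    # Losers output zero, yielding a sparse representation.
    mask = np.zeros_like(blocks)
    mask[np.arange(len(blocks)), winners] = 1.0
    return (blocks * mask).reshape(-1)

out = stochastic_lwta(np.array([1.0, -2.0, 0.5, 0.4]))
```

Each forward pass can select a different winner, so an attacker's gradient is computed against a network realization that may not be the one used at inference time.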
Beware the Black-Box: on the Robustness of Recent Defenses to Adversarial Examples
Many defenses have recently been proposed at venues like NIPS, ICML, ICLR and
CVPR. These defenses are mainly focused on mitigating white-box attacks. They
do not properly examine black-box attacks. In this paper, we expand upon the
analysis of these defenses to include adaptive black-box adversaries. Our
evaluation is done on nine defenses including Barrage of Random Transforms,
ComDefend, Ensemble Diversity, Feature Distillation, The Odds are Odd, Error
Correcting Codes, Distribution Classifier Defense, K-Winner Take All and Buffer
Zones. Our investigation is done using two black-box adversarial models and six
widely studied adversarial attacks for CIFAR-10 and Fashion-MNIST datasets. Our
analyses show that most recent defenses (7 out of 9) provide only marginal
improvements in security compared to undefended networks. For
every defense, we also show the relationship between the amount of data the
adversary has at their disposal, and the effectiveness of adaptive black-box
attacks. Overall, our results paint a clear picture: defenses need both
thorough white-box and black-box analyses to be considered secure. We provide
this large scale study and analyses to motivate the field to move towards the
development of more robust black-box defenses.
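The adaptive black-box threat model evaluated in this line of work can be sketched as follows: the adversary queries the target model for labels, trains a surrogate on those labels, and transfers gradient-based (FGSM-style) examples crafted on the surrogate. Everything below — linear models, synthetic data, the epsilon budget — is an illustrative assumption, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data and a placeholder "black-box" target classifier.
X = rng.standard_normal((200, 5))
w_true = rng.standard_normal(5)

def target(x):
    # The adversary sees only labels, never gradients or weights.
    return (x @ w_true > 0).astype(float)

# Step 1: the adversary labels its data by querying the target,
# then fits a linear surrogate (least squares on +/-1 labels).
y = target(X)
w_sub, *_ = np.linalg.lstsq(X, 2 * y - 1, rcond=None)

# Step 2: craft FGSM examples on the surrogate and transfer them.
def fgsm(x, label, eps=0.5):
    # For a linear surrogate with margin loss, the input gradient is
    # just -/+ w_sub, flipped by the label.
    grad = -(2 * label - 1)[:, None] * w_sub
    return x + eps * np.sign(grad)

X_adv = fgsm(X, y)
# Fraction of points whose target label flips under the transfer attack.
attack_success = (target(X_adv) != y).mean()
```

The amount of data the adversary can label (here, all of `X`) directly controls surrogate quality, which is exactly the relationship the study above measures per defense.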