Defending Against Adversarial Attacks Using Random Forests
As deep neural networks (DNNs) have become increasingly important and
popular, their robustness is key to the safety of both the Internet and the
physical world. Unfortunately, recent studies show that adversarial examples,
which are hard to distinguish from real examples, can easily fool DNNs and
manipulate their predictions. Observing that adversarial examples are mostly
generated by gradient-based methods, in this paper we propose a simple yet
very effective non-differentiable hybrid model that combines DNNs and random
forests, rather than hiding gradients from attackers, to defend against such
attacks. Our experiments show that our model completely defends against
white-box attacks, exhibits lower transferability, and is quite resistant to
three representative types of black-box attacks, while achieving
classification accuracy similar to that of the original DNNs. Finally, we
investigate and suggest a criterion for deciding where to grow random forests
in DNNs.
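The abstract does not specify the exact architecture, but the core idea is that a random forest is non-differentiable, so no useful gradient flows through the final classifier. Below is a minimal sketch under assumed details not given in the paper: a small PyTorch CNN as the feature extractor and scikit-learn's RandomForestClassifier grown on its penultimate-layer features.

# Sketch of a non-differentiable DNN + random-forest hybrid.
# Assumptions (not from the paper): a toy PyTorch CNN trunk and a
# scikit-learn random forest as the classification head.

import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

class FeatureExtractor(nn.Module):
    """Convolutional trunk of a DNN; returns penultimate-layer features."""
    def __init__(self, num_features=64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),                       # 32 * 7 * 7 for 28x28 inputs
            nn.Linear(32 * 7 * 7, num_features), nn.ReLU(),
        )

    def forward(self, x):
        return self.trunk(x)

class HybridModel:
    """DNN features fed into a random forest. The forest has no gradients,
    so gradient-based attacks cannot backpropagate through the classifier."""
    def __init__(self, extractor, n_trees=100):
        self.extractor = extractor.eval()
        self.forest = RandomForestClassifier(n_estimators=n_trees)

    @torch.no_grad()
    def _features(self, x):
        # Run the (typically pretrained) trunk without tracking gradients.
        return self.extractor(x).cpu().numpy()

    def fit(self, x, y):
        self.forest.fit(self._features(x), y.numpy())
        return self

    def predict(self, x):
        return self.forest.predict(self._features(x))

# Usage sketch: x_train is an (N, 1, 28, 28) image batch, y_train its labels.
# extractor = FeatureExtractor()               # pretrained trunk in practice
# hybrid = HybridModel(extractor).fit(x_train, y_train)
# preds = hybrid.predict(x_test)

In this sketch, "where to grow" the forest corresponds to which layer's activations are handed to the forest; the paper's criterion for choosing that layer is not reproduced here.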