Generalization Error in Deep Learning
Deep learning models have lately shown great performance in various fields
such as computer vision, speech recognition, speech translation, and natural
language processing. However, alongside their state-of-the-art performance, the
source of their generalization ability remains generally unclear.
Thus, an important question is what makes deep neural networks able to
generalize well from the training set to new data. In this article, we provide
an overview of the existing theory and bounds for the characterization of the
generalization error of deep neural networks, combining both classical and more
recent theoretical and empirical results.
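To fix notation for such bounds, the generalization error (or generalization gap) is commonly defined as the difference between the expected risk under the data distribution and the empirical risk on the training set. A standard formulation, with symbols chosen here for illustration rather than taken from the article, is

\[
\operatorname{gen}(h) \;=\; R(h) - \hat{R}_S(h)
\;=\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(h(x),y)\big]
\;-\; \frac{1}{m}\sum_{i=1}^{m} \ell\big(h(x_i),y_i\big),
\]

where h is the learned hypothesis, S = \{(x_i,y_i)\}_{i=1}^{m} is a training set of m examples drawn i.i.d. from the distribution \mathcal{D}, and \ell is the loss function. Bounds of the kind surveyed typically control this gap with high probability over the draw of S.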
ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models
Deep neural networks (DNNs) are one of the most prominent technologies of our
time, as they achieve state-of-the-art performance in many machine learning
tasks, including but not limited to image classification, text mining, and
speech processing. However, recent research on DNNs has indicated
ever-increasing concern about their robustness to adversarial examples, especially
for security-critical tasks such as traffic sign identification for autonomous
driving. Studies have unveiled the vulnerability of a well-trained DNN by
demonstrating the ability to generate barely noticeable (to both humans and
machines) adversarial images that lead to misclassification. Furthermore,
researchers have shown that these adversarial images are highly transferable by
simply training and attacking a substitute model built upon the target model,
known as a black-box attack to DNNs.
Similar to the setting of training substitute models, in this paper we
propose an effective black-box attack that also only has access to the input
(images) and the output (confidence scores) of a targeted DNN. However,
different from leveraging attack transferability from substitute models, we
propose zeroth order optimization (ZOO) based attacks to directly estimate the
gradients of the targeted DNN for generating adversarial examples. We use
zeroth order stochastic coordinate descent along with dimension reduction,
hierarchical attack and importance sampling techniques to efficiently attack
black-box models. By exploiting zeroth order optimization, improved attacks to
the targeted DNN can be accomplished, sparing the need for training substitute
models and avoiding the loss in attack transferability. Experimental results on
MNIST, CIFAR10 and ImageNet show that the proposed ZOO attack is as effective
as the state-of-the-art white-box attack and significantly outperforms existing
black-box attacks via substitute models.
Comment: Accepted by the 10th ACM Workshop on Artificial Intelligence and Security
(AISEC), held with the 24th ACM Conference on Computer and Communications Security
(CCS).
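The core of a ZOO-style attack is estimating coordinate-wise gradients of the attack loss from confidence scores alone, via a symmetric finite difference. A minimal sketch, assuming a black-box scoring function f that maps an image to the attack loss; the function name, step size, and coordinate-sampling scheme are illustrative, not the paper's exact implementation:

import numpy as np

def zoo_coordinate_step(f, x, coords, h=1e-4, eta=0.01):
    """One zeroth-order stochastic coordinate-descent step.

    f      : black-box attack loss, callable on an image array (queries the DNN's scores)
    x      : current adversarial image (flattened numpy array)
    coords : indices of the pixel coordinates sampled for this step
    h      : finite-difference step size
    eta    : learning rate
    """
    x_new = x.copy()
    for i in coords:
        e = np.zeros_like(x)
        e[i] = h
        # Symmetric difference quotient: estimate of the partial derivative
        # of the loss w.r.t. coordinate i, using only score queries.
        g_i = (f(x + e) - f(x - e)) / (2.0 * h)
        # Plain coordinate-descent update (ZOO itself uses ADAM/Newton-style
        # per-coordinate updates; a vanilla step is shown for brevity).
        x_new[i] = x[i] - eta * g_i
    return x_new

# Usage sketch: sample a small batch of coordinates per iteration;
# importance sampling and hierarchical grouping would refine this choice.
# rng = np.random.default_rng(0)
# coords = rng.choice(x.size, size=128, replace=False)
# x = zoo_coordinate_step(f, x, coords)

Dimension reduction and the hierarchical attack reduce how many such coordinates must be queried per iteration, which is what makes the approach practical on large inputs such as ImageNet images.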
OTJR: Optimal Transport Meets Optimal Jacobian Regularization for Adversarial Robustness
The Web, as a rich medium of diverse content, has been constantly under the
threat of malicious entities exploiting its vulnerabilities, especially with
the rapid proliferation of deep learning applications in various web services.
One such vulnerability, crucial to the fidelity and integrity of web content,
is the susceptibility of deep neural networks to adversarial perturbations,
especially concerning images - a dominant form of data on the web. In light of
the recent advancements in the robustness of classifiers, we delve deep into
the intricacies of adversarial training (AT) and Jacobian regularization, two
pivotal defenses. Our work is the first to carefully analyze and characterize
these two schools of approaches, both theoretically and empirically, to
demonstrate how each approach impacts the robust learning of a classifier.
Next, we propose our novel Optimal Transport with Jacobian regularization
method, dubbed OTJR, jointly incorporating the input-output Jacobian
regularization into AT by leveraging optimal transport theory. In
particular, we employ the Sliced Wasserstein (SW) distance that can efficiently
push the adversarial samples' representations closer to those of clean samples,
regardless of the number of classes within the dataset. The SW distance
provides the adversarial samples' movement directions, which are much more
informative and powerful for the Jacobian regularization. Our empirical
evaluations set a new standard in the domain, with our method achieving
commendable accuracies of 51.41% on the CIFAR-10 and 28.49% on the
CIFAR-100 datasets under the AutoAttack metric. In a real-world
demonstration, we subject images sourced from the Internet to online
adversarial attacks, reinforcing the efficacy and relevance of our model in
defending against sophisticated web-image perturbations.
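The Sliced Wasserstein term can be approximated by projecting both batches of representations onto random directions and comparing the sorted projections. A minimal sketch of this estimate, assuming clean and adversarial feature matrices of equal batch size and shape (batch, dim); the function name and number of projections are illustrative, not taken from the paper:

import numpy as np

def sliced_wasserstein(clean_feats, adv_feats, n_projections=64, rng=None):
    """Monte-Carlo estimate of the Sliced Wasserstein-2 distance between
    two batches of representations of shape (batch, dim)."""
    rng = np.random.default_rng() if rng is None else rng
    dim = clean_feats.shape[1]
    # Random unit directions on the sphere.
    theta = rng.normal(size=(n_projections, dim))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both batches onto each direction: shape (batch, n_projections).
    proj_clean = clean_feats @ theta.T
    proj_adv = adv_feats @ theta.T
    # The 1-D Wasserstein-2 distance between projected samples reduces to
    # comparing the sorted projections coordinate by coordinate.
    proj_clean = np.sort(proj_clean, axis=0)
    proj_adv = np.sort(proj_adv, axis=0)
    return np.mean((proj_clean - proj_adv) ** 2)

Gradients of this quantity with respect to the adversarial representations give the per-sample movement directions that, as the abstract describes, inform the Jacobian regularization term.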