Pre-training also Transfers Non-Robustness
Pre-training has enabled state-of-the-art results on many tasks. In spite of
its recognized contribution to generalization, we observed in this study that
pre-training also transfers adversarial non-robustness from the pre-trained
model to the fine-tuned model on downstream tasks. Using image classification
as an example, we first conducted experiments on various datasets and network
backbones to uncover the adversarial non-robustness of fine-tuned models.
Further analysis examined the knowledge learned by fine-tuned and standard
models, and revealed that the non-robustness stems from non-robust features
transferred from the pre-trained model. Finally, we analyzed the pre-trained
model's preference in feature learning, explored the factors influencing
robustness, and introduced a simple robust pre-training solution.
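To make the kind of experiment described here concrete, below is a minimal sketch, assuming a PyTorch/torchvision setup, of how one might probe a fine-tuned model for adversarial non-robustness with a single-step FGSM attack. The backbone choice, class count, attack budget, and function names are illustrative assumptions, not the authors' code.

import torch
import torch.nn.functional as F
from torchvision import models

# Hypothetical setup: fine-tune an ImageNet-pretrained backbone on a
# 10-class downstream task, then measure accuracy under FGSM.
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, 10)  # new downstream head
# ... fine-tune `model` on the downstream data here ...

def fgsm_accuracy(model, loader, eps=8 / 255, device="cpu"):
    # Accuracy on single-step FGSM adversarial examples (L-inf budget eps).
    model.eval()
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x.requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        grad, = torch.autograd.grad(loss, x)
        x_adv = (x + eps * grad.sign()).clamp(0, 1)  # perturb and keep valid pixels
        correct += (model(x_adv).argmax(1) == y).sum().item()
        total += y.numel()
    return correct / total

A large gap between clean accuracy and fgsm_accuracy on the fine-tuned model is the kind of non-robustness the abstract refers to.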
Adversarial Machine Learning in the Wild
Deep neural networks are making their way into our everyday lives at an increasing rate. While the adoption of these models has brought great benefits, it has also opened the door to new vulnerabilities in real-world systems. In the scope of this work, we are interested in one class of vulnerability: adversarial attacks. Given the high importance and sensitivity of some of the tasks these models are responsible for, it is crucial to study such vulnerabilities in real-world systems. In this work, we look at examples of real-world systems built on deep neural networks, the vulnerabilities of such systems, and approaches for making them more robust.
First, we study an example of leveraging a deep neural network in a business-critical real-world system. We discuss how deep neural networks improve the quality of smart voice assistants. More specifically, we introduce how collaborative filtering models can automatically detect and resolve the errors of a voice assistant. We then discuss the success of this approach in improving the quality of a real-world voice assistant.
Second, we demonstrate a proof of concept for an adversarial attack against content-based recommendation systems which are commonly used in real-world settings. We discuss how malicious actors can add unnoticeable perturbations to the content they upload to the website to achieve their preferred outcomes. We also show how adversarial training can render such attacks useless.
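For context on the defense mentioned above, here is a minimal sketch of one standard adversarial-training recipe (PGD-based training in the style of Madry et al.); the attack budget, step sizes, and function names are assumptions for illustration, not necessarily this work's exact setup.

import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    # Projected gradient ascent inside an L-inf ball of radius eps.
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back into the ball
        x_adv = x_adv.clamp(0, 1)                 # keep valid pixel range
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    model.train()
    x_adv = pgd_attack(model, x, y)           # craft worst-case inputs
    loss = F.cross_entropy(model(x_adv), y)   # train on them instead of x
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Training on worst-case perturbed inputs rather than clean ones is what makes the perturbation-based attacks described above ineffective.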
Third, we discuss another example of how adversarial attacks can be leveraged to manipulate a real-world system. We study how adversarial attacks can successfully manipulate YouTube's copyright detection model and the financial implications of this vulnerability. In particular, we show how adversarial examples created for a copyright detection model that we implemented transfer to another black-box model.
Finally, we study the problem of transfer learning in an adversarially robust setting. We discuss how robust models contain robust feature extractors, and how we can leverage them to train new classifiers that preserve the robustness of the original model. We then study the case of fine-tuning in the target domain while preserving robustness. We show the success of our proposed solutions in preserving robustness in the target domain.
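A minimal sketch of the fixed-feature transfer idea described above, assuming a PyTorch setup: freeze an adversarially trained backbone and train only a fresh linear head on the target task. The checkpoint path, class count, and optimizer settings are hypothetical.

import torch
from torchvision import models

# Stand-in for an adversarially trained source model; in practice one would
# load a robust checkpoint (the path below is hypothetical).
robust_model = models.resnet50()
# robust_model.load_state_dict(torch.load("robust_resnet50.pt"))

for p in robust_model.parameters():
    p.requires_grad = False  # freeze the robust feature extractor

num_target_classes = 10  # example target task
robust_model.fc = torch.nn.Linear(robust_model.fc.in_features, num_target_classes)

# Only the new head is trained, so the robust features are left untouched.
optimizer = torch.optim.SGD(robust_model.fc.parameters(), lr=0.01, momentum=0.9)

Training only the head preserves the backbone's robust features by construction; the fine-tuning case studied in the work, where the backbone itself is updated, requires the additional care the abstract alludes to and is not covered by this sketch.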
Automated Synthetic-to-Real Generalization
Models trained on synthetic images often face degraded generalization to real
data. As a convention, these models are often initialized with ImageNet
pre-trained representation. Yet the role of ImageNet knowledge is seldom
discussed despite common practices that leverage this knowledge to maintain the
generalization ability. An example is the careful hand-tuning of early stopping
and layer-wise learning rates, which is shown to improve synthetic-to-real
generalization but is also laborious and heuristic. In this work, we explicitly
encourage the synthetically trained model to maintain similar representations
with the ImageNet pre-trained model, and propose a \textit{learning-to-optimize
(L2O)} strategy to automate the selection of layer-wise learning rates. We
demonstrate that the proposed framework can significantly improve the
synthetic-to-real generalization performance without seeing and training on
real data, while also benefiting downstream tasks such as domain adaptation.
Code is available at: https://github.com/NVlabs/ASG
Comment: Accepted to ICML 2020
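As a rough illustration of the representation-consistency idea, assuming a PyTorch setup, the sketch below penalizes the distance between the student's penultimate features and those of a frozen ImageNet-pretrained copy, and builds the per-layer parameter groups that a learned layer-wise learning-rate policy would control. The loss form, weight, and names are assumptions; see the repository above for the actual ASG objective and L2O policy.

import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet101(weights="IMAGENET1K_V1")    # trained on synthetic data
teacher = models.resnet101(weights="IMAGENET1K_V1")  # frozen ImageNet reference
teacher.eval()
for p in teacher.parameters():
    p.requires_grad = False

# Penultimate features of both networks (everything except the classifier).
feat = torch.nn.Sequential(*list(model.children())[:-1])
feat_t = torch.nn.Sequential(*list(teacher.children())[:-1])

def consistency_penalty(x):
    # Keep the student's representation close to the ImageNet teacher's;
    # this term would be added to the synthetic-task loss.
    with torch.no_grad():
        target = feat_t(x).flatten(1)
    return F.mse_loss(feat(x).flatten(1), target)

# Layer-wise learning rates: ASG learns these automatically with L2O; here we
# only show the per-layer parameter groups such a policy would control.
param_groups = [
    {"params": m.parameters(), "lr": 1e-3}  # per-group lr set by the policy
    for m in model.children()
    if any(p.requires_grad for p in m.parameters())
]
optimizer = torch.optim.SGD(param_groups, lr=1e-3, momentum=0.9)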
Adversarial Training Reduces Information and Improves Transferability
Recent results show that features of adversarially trained networks for
classification, in addition to being robust, enable desirable properties such
as invertibility. The latter property may seem counter-intuitive as it is
widely accepted by the community that classification models should only capture
the minimal information (features) required for the task. Motivated by this
discrepancy, we investigate the dual relationship between Adversarial Training
and Information Theory. We show that Adversarial Training can improve
linear transferability to new tasks, from which arises a new trade-off between
the transferability of representations and accuracy on the source task. We
validate our results on several datasets, employing robust networks trained on
CIFAR-10, CIFAR-100, and ImageNet. Moreover, we show that Adversarial Training
reduces Fisher information of representations about the input and of the
weights about the task, and we provide a theoretical argument which explains
the invertibility of deterministic networks without violating the principle of
minimality. Finally, we leverage our theoretical insights to remarkably improve
the quality of reconstructed images through inversion.
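To illustrate the inversion procedure behind the final claim, here is a generic sketch, assuming a PyTorch setup, that optimizes a random image until its features match a target's; the objective, initialization, and step counts are assumptions, not the paper's exact method.

import torch
import torch.nn.functional as F

def invert_representation(feature_fn, x_target, steps=500, lr=0.1):
    # Gradient-descend a random image toward the target's features.
    with torch.no_grad():
        target = feature_fn(x_target)
    x = torch.rand_like(x_target, requires_grad=True)  # random init in [0, 1]
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        loss = F.mse_loss(feature_fn(x), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            x.clamp_(0, 1)  # keep a valid image
    return x.detach()

The paper's observation is that when feature_fn comes from an adversarially trained network, reconstructions from this kind of procedure are markedly more faithful than those from standard networks.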