189 research outputs found
You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle
Deep learning achieves state-of-the-art results in many tasks in computer
vision and natural language processing. However, recent works have shown that
deep networks can be vulnerable to adversarial perturbations, which raised a
serious robustness issue of deep networks. Adversarial training, typically
formulated as a robust optimization problem, is an effective way of improving
the robustness of deep networks. A major drawback of existing adversarial
training algorithms is the computational overhead of the generation of
adversarial examples, typically far greater than that of the network training.
This leads to the unbearable overall computational cost of adversarial
training. In this paper, we show that adversarial training can be cast as a
discrete time differential game. Through analyzing the Pontryagin's Maximal
Principle (PMP) of the problem, we observe that the adversary update is only
coupled with the parameters of the first layer of the network. This inspires us
to restrict most of the forward and back propagation within the first layer of
the network during adversary updates. This effectively reduces the total number
of full forward and backward propagation to only one for each group of
adversary updates. Therefore, we refer to this algorithm YOPO (You Only
Propagate Once). Numerical experiments demonstrate that YOPO can achieve
comparable defense accuracy with approximately 1/5 ~ 1/4 GPU time of the
projected gradient descent (PGD) algorithm. Our codes are available at
https://https://github.com/a1600012888/YOPO-You-Only-Propagate-Once.Comment: Accepted as a conference paper at NeurIPS 201
Dynamic Neural Network is All You Need: Understanding the Robustness of Dynamic Mechanisms in Neural Networks
Deep Neural Networks (DNNs) have been used to solve different day-to-day
problems. Recently, DNNs have been deployed in real-time systems, and lowering
the energy consumption and response time has become the need of the hour. To
address this scenario, researchers have proposed incorporating dynamic
mechanism to static DNNs (SDNN) to create Dynamic Neural Networks (DyNNs)
performing dynamic amounts of computation based on the input complexity.
Although incorporating dynamic mechanism into SDNNs would be preferable in
real-time systems, it also becomes important to evaluate how the introduction
of dynamic mechanism impacts the robustness of the models. However, there has
not been a significant number of works focusing on the robustness trade-off
between SDNNs and DyNNs. To address this issue, we propose to investigate the
robustness of dynamic mechanism in DyNNs and how dynamic mechanism design
impacts the robustness of DyNNs. For that purpose, we evaluate three research
questions. These evaluations are performed on three models and two datasets.
Through the studies, we find that attack transferability from DyNNs to SDNNs is
higher than attack transferability from SDNNs to DyNNs. Also, we find that
DyNNs can be used to generate adversarial samples more efficiently than SDNNs.
Then, through research studies, we provide insight into the design choices that
can increase robustness of DyNNs against the attack generated using static
model. Finally, we propose a novel attack to understand the additional attack
surface introduced by the dynamic mechanism and provide design choices to
improve robustness against the attack
Towards Debugging and Improving Adversarial Robustness Evaluations
Despite exhibiting unprecedented success in many application domains, machine‐learning models have been shown to be vulnerable to adversarial examples, i.e., maliciously perturbed inputs that are able to subvert their predictions at test time. Rigorous testing against such perturbations requires enumerating all possible outputs for all possible inputs, and despite impressive results in this field, these methods remain still difficult to scale to modern deep learning systems. For these reasons, empirical methods are often used. These adversarial perturbations are optimized via gradient descent, minimizing a loss function that aims to increase the probability of misleading the model’s predictions. To understand the sensitivity of the model to such attacks, and to counter the effects, machine-learning model designers craft worst-case adversarial perturbations and test them against the model they are evaluating. However, many of the proposed defenses have been shown to provide a false sense of security due to failures of the attacks, rather than actual improvements in the machine‐learning models’ robustness. They have been broken indeed under more rigorous evaluations. Although guidelines and best practices have been suggested to improve current adversarial robustness evaluations, the lack of automatic testing and debugging tools makes it difficult to apply these recommendations in a systematic and automated manner. To this end, we tackle three different challenges: (1) we investigate how adversarial robustness evaluations can be performed efficiently, by proposing a novel attack that can be used to find minimum-norm adversarial perturbations; (2) we propose a framework for debugging adversarial robustness evaluations, by defining metrics that reveal faulty evaluations as well as mitigations to patch the detected problems; and (3) we show how to employ a surrogate model for improving the success of transfer-based attacks, that are useful when gradient-based attacks are failing due to problems in the gradient information. To improve the quality of robustness evaluations, we propose a novel attack, referred to as Fast Minimum‐Norm (FMN) attack, which competes with state‐of‐the‐art attacks in terms of quality of the solution while outperforming them in terms of computational complexity and robustness to sub‐optimal configurations of the attack hyperparameters. These are all desirable characteristics of attacks used in robustness evaluations, as the aforementioned problems often arise from the use of sub‐optimal attack hyperparameters, including, e.g., the number of attack iterations, the step size, and the use of an inappropriate loss function. The correct refinement of these variables is often neglected, hence we designed a novel framework that helps debug the optimization process of adversarial examples, by means of quantitative indicators that unveil common problems and failures during the attack optimization process, e.g., in the configuration of the hyperparameters. Commonly accepted best practices suggest further validating the target model with alternative strategies, among which is the usage of a surrogate model to craft the adversarial examples to transfer to the model being evaluated is useful to check for gradient obfuscation. However, how to effectively create transferable adversarial examples is not an easy process, as many factors influence the success of this strategy. In the context of this research, we utilize a first-order model to show what are the main underlying phenomena that affect transferability and suggest best practices to create adversarial examples that transfer well to the target models.
Robust Evaluation of Diffusion-Based Adversarial Purification
We question the current evaluation practice on diffusion-based purification
methods. Diffusion-based purification methods aim to remove adversarial effects
from an input data point at test time. The approach gains increasing attention
as an alternative to adversarial training due to the disentangling between
training and testing. Well-known white-box attacks are often employed to
measure the robustness of the purification. However, it is unknown whether
these attacks are the most effective for the diffusion-based purification since
the attacks are often tailored for adversarial training. We analyze the current
practices and provide a new guideline for measuring the robustness of
purification methods against adversarial attacks. Based on our analysis, we
further propose a new purification strategy improving robustness compared to
the current diffusion-based purification methods.Comment: Accepted by ICCV 2023, Oral presentatio
- …