Journal article
Robust shortcut and disordered robustness: Improving adversarial training through adaptive smoothing
Abstract
Deep neural networks are highly susceptible to adversarial perturbations: artificial noise that corrupts input data in ways imperceptible to humans yet causes incorrect predictions. Among the various defenses against such attacks, adversarial training has emerged as the most effective. In this work, we aim to enhance adversarial training to improve robustness against adversarial attacks. We begin by analyzing how adversarial vulnerability evolves during training from an instance-wise perspective. This analysis reveals two previously unrecognized phenomena: robust shortcut and disordered robustness. We then demonstrate that these phenomena are related to robust overfitting, a well-known issue in adversarial training. Building on these insights, we propose a novel adversarial training method: Instance-adaptive Smoothness Enhanced Adversarial Training (ISEAT). This method jointly smooths the input and weight loss landscapes in an instance-adaptive manner, preventing the exploitation of the robust shortcut and thereby mitigating robust overfitting. Extensive experiments demonstrate the efficacy of ISEAT and its superiority over existing adversarial training methods. Code is available at https://github.com/TreeLLi/ISEAT.
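For context, the baseline that methods such as ISEAT build on is standard PGD-based adversarial training: an inner maximization that crafts worst-case perturbations within an L-infinity ball, and an outer minimization that fits the model on those perturbed inputs. The sketch below illustrates this min-max loop on a toy logistic-regression model in NumPy; it is a generic illustration only, not the paper's method, and all function names and hyperparameters here are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for saturated logits.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def pgd_attack(w, b, x, y, eps=0.3, alpha=0.1, steps=5):
    """L-infinity PGD: ascend the logistic loss with signed gradient
    steps, then project back into the eps-ball around the clean inputs."""
    x_adv = x.copy()
    for _ in range(steps):
        p = sigmoid(x_adv @ w + b)
        grad_x = (p - y)[:, None] * w[None, :]    # dLoss/dx per example
        x_adv = x_adv + alpha * np.sign(grad_x)   # gradient ascent on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project onto the eps-ball
    return x_adv

def adversarial_train(x, y, epochs=200, lr=0.5, eps=0.3, seed=0):
    """Min-max adversarial training: at each step, regenerate PGD
    examples against the current parameters and descend on them."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0.0, 0.1, x.shape[1])
    b = 0.0
    for _ in range(epochs):
        x_adv = pgd_attack(w, b, x, y, eps=eps)   # inner maximization
        p = sigmoid(x_adv @ w + b)
        w -= lr * x_adv.T @ (p - y) / len(y)      # outer minimization
        b -= lr * np.mean(p - y)
    return w, b
```

On well-separated data, a model trained this way keeps high accuracy even on PGD-perturbed test inputs, whereas a model fit only on clean inputs can lose accuracy inside the same eps-ball. The instance-adaptive smoothing that the abstract describes operates on top of this loop; it is not shown here.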
- Peer reviewed
- Adversarial robustness
- Adversarial training
- Instance adaptive
- Loss smoothing
- Neural networks
- Overfitting
- Training methods
- Computer Vision and Pattern Recognition
- Artificial Intelligence
- Engineering, computing & technology
- Computer science