Regional adversarial attacks often rely on complicated methods for generating
adversarial perturbations, making it hard to compare their efficacy against
well-known attacks. In this study, we show that effective regional
perturbations can be generated without resorting to complex methods. We develop
a very simple regional adversarial perturbation attack using the sign of the
gradient of the cross-entropy loss, one of the most commonly used losses in adversarial machine
learning. Our experiments on ImageNet with multiple models reveal that, on
average, 76% of the generated adversarial examples maintain model-to-model
transferability when the perturbation is applied to local image regions.
Depending on the selected region, these localized adversarial examples require
significantly less $L_p$ norm distortion (for $p \in \{0, 2, \infty\}$)
compared to their non-local counterparts. These localized attacks therefore
have the potential to undermine defenses that claim robustness under the
aforementioned norms.

Comment: Accepted for the ICML 2020 Workshop on Uncertainty and Robustness in Deep Learning (UDL).
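
To make the idea concrete, below is a minimal sketch of a one-step attack that applies the sign of the cross-entropy gradient only inside a chosen image region. This is an illustrative PyTorch implementation under assumed details: the function name regional_fgsm, the rectangular region parameterization, and the step size eps are placeholders, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def regional_fgsm(model, x, y, region, eps=8 / 255):
    """One-step sign-of-gradient attack restricted to a rectangular region.

    A minimal sketch; the paper's exact procedure may differ.
    x: input batch in [0, 1], shape (N, C, H, W)
    region: (top, left, height, width) in pixel coordinates (assumed here)
    """
    # Binary mask selecting the patch to perturb
    mask = torch.zeros_like(x)
    t, l, h, w = region
    mask[..., t:t + h, l:l + w] = 1.0

    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)  # standard cross-entropy loss
    loss.backward()

    # FGSM-style step, masked so all distortion stays inside the region
    x_adv = x + eps * x.grad.sign() * mask
    return x_adv.clamp(0, 1).detach()
```

Because the mask zeroes the update outside the patch, the $L_0$ distortion is bounded by the patch area, consistent with the abstract's observation that well-chosen regions yield lower $L_p$ distortion than non-local perturbations.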