Revisiting Image Classifier Training for Improved Certified Robust Defense against Adversarial Patches
Certifiably robust defenses against adversarial patches for image classifiers
ensure correct prediction against any changes to a constrained neighborhood of
pixels. PatchCleanser (arXiv:2108.09135 [cs.CV]), the state-of-the-art certified
defense, uses a double-masking strategy for robust classification. The success
of this strategy relies heavily on the model's invariance to image pixel
masking. In this paper, we take a closer look at model training schemes to
improve this invariance. Instead of using Random Cutout (arXiv:1708.04552v2
[cs.CV]) augmentations as PatchCleanser does, we introduce the notion of worst-case
masking, i.e., selecting masked images which maximize classification loss.
However, finding worst-case masks requires an exhaustive search, which might be
prohibitively expensive to do on-the-fly during training. To solve this
problem, we propose a two-round greedy masking strategy (Greedy Cutout) which
finds an approximate worst-case mask location with much less compute. We show
that models trained with our Greedy Cutout improve certified robust
accuracy over Random Cutout in PatchCleanser across a range of datasets and
architectures. Certified robust accuracy on ImageNet with a ViT-B16-224 model
increases from 58.1% to 62.3% against a 3% square patch applied anywhere on
the image.
Comment: 12 pages, 5 figures
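
The two-round greedy idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that round one picks the single mask position with the highest loss and that round two, keeping that mask fixed, picks the second position with the highest loss. The helper names (`apply_mask`, `candidate_positions`, `greedy_two_round_masks`), the square zero-valued mask, the stride-based candidate grid, and the `loss_fn` argument are all hypothetical and chosen only for illustration.

```python
import numpy as np


def apply_mask(image, top, left, size):
    """Return a copy of `image` with a size x size square set to zero."""
    out = image.copy()
    out[top:top + size, left:left + size] = 0.0
    return out


def candidate_positions(height, width, size, stride):
    """Enumerate top-left corners of masks on a regular grid (an assumption)."""
    tops = range(0, height - size + 1, stride)
    lefts = range(0, width - size + 1, stride)
    return [(t, l) for t in tops for l in lefts]


def greedy_two_round_masks(image, loss_fn, size, stride):
    """Approximate the worst-case pair of masks with two greedy rounds.

    `loss_fn` maps a masked image to a scalar loss; in training it would be
    the classification loss of the model, here it is any callable.
    Returns (first_position, second_position, number_of_loss_evaluations).
    """
    positions = candidate_positions(image.shape[0], image.shape[1], size, stride)

    # Round 1: the single mask that maximizes the loss.
    first = max(positions, key=lambda p: loss_fn(apply_mask(image, *p, size)))

    # Round 2: keep the first mask, then search for the second mask.
    once_masked = apply_mask(image, *first, size)
    second = max(positions, key=lambda p: loss_fn(apply_mask(once_masked, *p, size)))

    # Two passes over k candidates, versus on the order of k^2 for all pairs.
    return first, second, 2 * len(positions)


if __name__ == "__main__":
    # Toy stand-in for a model loss: hiding bright pixels raises the loss.
    toy_image = np.zeros((8, 8))
    toy_image[1, 1] = 5.0
    toy_image[6, 6] = 3.0
    print(greedy_two_round_masks(toy_image, lambda x: -x.sum(), size=2, stride=2))
```

With 16 candidate positions, the greedy search above evaluates the loss 32 times, whereas scoring every ordered pair of positions would take 256 evaluations; the greedy result is only an approximation of the true worst-case pair.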