Deep networks for computer vision are not reliable when they encounter
adversarial examples. In this paper, we introduce a framework that uses the
dense intrinsic constraints in natural images to robustify inference. By
introducing constraints at inference time, we shift the burden of robustness
from training to the inference algorithm, allowing the model to adjust
dynamically to each individual image's unique and potentially novel
characteristics. Among different constraints, we find that
equivariance-based constraints are most effective, because they allow dense
constraints in the feature space without overly constraining the representation
at a fine-grained level. Our theoretical results validate the importance of
having such dense constraints at inference time. Our experiments show
that restoring feature equivariance at inference time defends against
worst-case adversarial perturbations. The method obtains improved adversarial
robustness on four datasets (ImageNet, Cityscapes, PASCAL VOC, and MS-COCO) across
image recognition, semantic segmentation, and instance segmentation tasks.
The project page is available at equi4robust.cs.columbia.edu.
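
The idea of restoring feature equivariance at inference time can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a hypothetical one-layer feature extractor `features`, a single transformation `flip`, and a bounded additive correction `r` found by projected gradient descent with finite-difference gradients. The names `restore_equivariance`, `eps`, `lr`, and `steps` are illustrative choices, not values from the paper.

```python
import numpy as np

def features(x, W):
    # Hypothetical stand-in for a network's feature extractor.
    return np.tanh(W @ x)

def flip(v):
    # Transformation g: reverse the signal (a 1-D analogue of a horizontal flip).
    return v[::-1]

def equivariance_loss(x, W):
    # Squared distance between f(g(x)) and g(f(x)).
    # It is zero when the features are equivariant to g at this input.
    diff = features(flip(x), W) - flip(features(x, W))
    return float(np.sum(diff ** 2))

def numerical_grad(fn, r, h=1e-5):
    # Central finite differences; adequate for a tiny toy problem.
    grad = np.zeros_like(r)
    for i in range(r.size):
        step = np.zeros_like(r)
        step[i] = h
        grad[i] = (fn(r + step) - fn(r - step)) / (2 * h)
    return grad

def restore_equivariance(x, W, eps=0.1, lr=0.05, steps=100):
    # Search for an additive correction r with |r_i| <= eps that lowers
    # the equivariance loss of x + r. The model weights W stay fixed.
    objective = lambda r: equivariance_loss(x + r, W)
    r = np.zeros_like(x)
    best_r, best_loss = r.copy(), objective(r)
    for _ in range(steps):
        # Gradient step followed by projection onto the eps-box.
        r = np.clip(r - lr * numerical_grad(objective, r), -eps, eps)
        loss = objective(r)
        if loss < best_loss:
            best_r, best_loss = r.copy(), loss
    return best_r, best_loss

# Toy usage with random weights and a random input (assumed data).
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
x = rng.normal(size=8)
loss_before = equivariance_loss(x, W)
r, loss_after = restore_equivariance(x, W)
```

Only the input is adjusted here, which mirrors the abstract's point that the burden of robustness moves from training to the inference algorithm. A realistic version would presumably use a trained network, several transformations, and automatic differentiation.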