We study the consistency of surrogate risks for robust binary classification.
It is common to learn robust classifiers by adversarial training, which seeks
to minimize the expected 0-1 loss when each example can be maliciously
corrupted within a small ball. We give a simple and complete characterization
of the set of surrogate loss functions that are \emph{consistent}, i.e., that
can replace the 0-1 loss without affecting the minimizing sequences of the
original adversarial risk, for any data distribution. We also prove a
quantitative version of adversarial consistency for the $\rho$-margin loss. Our
results reveal that the class of adversarially consistent surrogates is
substantially smaller than in the standard setting, where many common
surrogates are known to be consistent.
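
For concreteness, the adversarial risk described above is usually formalized as follows; this is a standard sketch, and the symbols $f$, $\epsilon$, $\mathcal{D}$, and the choice of norm ball are our assumptions rather than notation taken from the paper:
\[
R_\epsilon(f) \;=\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\left[\,\sup_{\|x'-x\|\le\epsilon} \mathbf{1}\{\operatorname{sign} f(x') \neq y\}\right].
\]
A surrogate loss $\phi$ is then adversarially consistent when every sequence $(f_n)$ minimizing the surrogate adversarial risk $\mathbb{E}\big[\sup_{\|x'-x\|\le\epsilon}\phi\big(y f(x')\big)\big]$ also minimizes $R_\epsilon$, for every distribution $\mathcal{D}$.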