Randomized smoothing is the current state-of-the-art method for producing
provably robust classifiers. While randomized smoothing typically yields robust
β2β-ball certificates, recent research has generalized provable robustness
to different norm balls as well as anisotropic regions. This work considers a
classifier architecture that first projects onto a low-dimensional
approximation of the data manifold and then applies a standard classifier. By
performing randomized smoothing in the low-dimensional projected space, we
characterize the certified region of our smoothed composite classifier back in
the high-dimensional input space and prove a tractable lower bound on its
volume. We show experimentally on CIFAR-10 and SVHN that classifiers without
the initial projection are vulnerable to perturbations that are normal to the
data manifold and yet are captured by the certified regions of our method. We
compare the volume of our certified regions against various baselines and show
that our method improves on the state-of-the-art by many orders of magnitude.Comment: Transactions on Machine Learning Research (TMLR) 202