Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations
Top-k predictions are used in many real-world applications such as machine
learning as a service, recommender systems, and web searches. An L0-norm
adversarial perturbation characterizes an attack that arbitrarily modifies some
features of an input such that a classifier makes an incorrect prediction for
the perturbed input. An L0-norm adversarial perturbation is easy to
interpret and can be implemented in the physical world. Therefore, certifying
robustness of top-k predictions against L0-norm adversarial
perturbations is important. However, existing studies either focused on
certifying L0-norm robustness of top-1 predictions or L2-norm
robustness of top-k predictions. In this work, we aim to bridge the gap. Our
approach is based on randomized smoothing, which builds a provably robust
classifier from an arbitrary classifier via randomizing an input. Our major
theoretical contribution is an almost tight L0-norm certified robustness
guarantee for top-k predictions. We empirically evaluate our method on
CIFAR10 and ImageNet. For instance, our method can build a classifier that
achieves a certified top-3 accuracy of 69.2% on ImageNet when an attacker can
arbitrarily perturb 5 pixels of a testing image.
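The randomized-smoothing construction mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes a hypothetical `base_classifier` callable and uses a simple smoothing distribution that randomly replaces input features with uniform noise (one common choice for L0-style smoothing); the smoothed classifier's top-k labels are those with the highest vote frequency over noisy copies.

```python
import numpy as np

def smooth_predict(base_classifier, x, num_classes, n_samples=100,
                   keep_prob=0.8, rng=None):
    """Monte-Carlo estimate of a smoothed classifier's label scores.

    Each sample randomly keeps a feature of x with probability keep_prob
    and otherwise replaces it with uniform noise in [0, 1]; the smoothed
    score of a label is the fraction of noisy copies the base classifier
    assigns to it.
    """
    rng = np.random.default_rng(rng)
    counts = np.zeros(num_classes, dtype=int)
    for _ in range(n_samples):
        mask = rng.random(x.shape) < keep_prob   # features kept as-is
        noise = rng.random(x.shape)              # replacement values
        x_noisy = np.where(mask, x, noise)
        counts[base_classifier(x_noisy)] += 1
    return counts / n_samples

def top_k_labels(scores, k):
    """Labels with the k highest smoothed scores."""
    return list(np.argsort(scores)[::-1][:k])
```

A certified guarantee of the kind the paper proves would then bound how much these smoothed scores can change when at most a fixed number of features are perturbed; the sketch above only shows the prediction side, not the certification.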