As the use of machine learning continues to expand, the importance of
ensuring its safety cannot be overstated. A key concern in this regard is the
ability to identify whether a given sample is from the training distribution,
or is an "Out-Of-Distribution" (OOD) sample. In addition, adversaries can
manipulate OOD samples in ways that lead a classifier to make a confident
prediction. In this study, we present a novel approach for certifying the
robustness of OOD detection within an ℓ2-norm ball around the input, regardless
of network architecture and without the need for specific components or
additional training. Further, we improve current techniques for detecting
adversarial attacks on OOD samples, while providing high levels of certified
and adversarial robustness on in-distribution samples. Averaged over all OOD
detection metrics, our approach yields an increase of ∼13%/5% on CIFAR10/100
relative to previous approaches.