Methods for Detection and Recovery of Out-of-Distribution Examples

Abstract

Deep neural networks form the backbone of many applications where safety is a critical concern, such as autonomous driving and medical diagnostics. Unfortunately, these systems often fail to detect out-of-distribution (OOD) inputs and are prone to making dangerous errors when exposed to them. In addition, these same systems are vulnerable to maliciously altered inputs called adversarial examples. In response to these problems, we present two methods: one to detect out-of-distribution inputs and one to resist adversarial examples.

To detect OOD inputs, we introduce HyperGAN: a generative adversarial network that learns to generate all the parameters of a deep neural network. HyperGAN first transforms low-dimensional noise into a latent space, which can be sampled to obtain diverse, performant sets of parameters for a target architecture. By sampling many sets of parameters, we form a diverse ensemble that provides a better estimate of uncertainty than standard ensembles. We show that HyperGAN can reliably detect OOD inputs as well as adversarial examples.

We also present BFNet, a method for recovering clean images from adversarial examples. BFNet uses a differentiable bilateral filter as a preprocessor to a neural network. The bilateral filter projects inputs back to the space of natural images, and in doing so it removes the adversarial perturbation. We show that BFNet is an effective defense in multiple attack settings and that it provides additional robustness when combined with other defenses.
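
To make the ensemble-based uncertainty estimate concrete, the sketch below shows one way a HyperGAN-style parameter generator could be sampled to score inputs as out-of-distribution. This is a minimal PyTorch illustration, not the thesis implementation: the names (ParamGenerator, target_forward, ood_score), the two-layer target architecture, and the use of predictive entropy as the uncertainty score are all assumptions made for the example.

# Minimal sketch (assumed, not the thesis code) of HyperGAN-style ensemble
# sampling for OOD scoring: noise -> latent code -> one full parameter set.
import torch
import torch.nn as nn
import torch.nn.functional as F

IN_DIM, HIDDEN, N_CLASSES = 784, 128, 10
NOISE_DIM, LATENT_DIM = 32, 64

def target_forward(x, params):
    """Functional forward pass of a two-layer MLP using generated parameters."""
    w1, b1, w2, b2 = params
    h = F.relu(F.linear(x, w1, b1))
    return F.linear(h, w2, b2)

class ParamGenerator(nn.Module):
    """Maps low-dimensional noise to a latent code, then to a complete set of
    parameters for the target network (the hypernetwork / mixer idea)."""
    def __init__(self):
        super().__init__()
        self.mixer = nn.Sequential(nn.Linear(NOISE_DIM, LATENT_DIM), nn.ReLU())
        n_params = HIDDEN * IN_DIM + HIDDEN + N_CLASSES * HIDDEN + N_CLASSES
        self.head = nn.Linear(LATENT_DIM, n_params)

    def forward(self, z):
        flat = self.head(self.mixer(z))
        i = 0
        w1 = flat[i:i + HIDDEN * IN_DIM].view(HIDDEN, IN_DIM); i += HIDDEN * IN_DIM
        b1 = flat[i:i + HIDDEN]; i += HIDDEN
        w2 = flat[i:i + N_CLASSES * HIDDEN].view(N_CLASSES, HIDDEN); i += N_CLASSES * HIDDEN
        b2 = flat[i:i + N_CLASSES]
        return w1, b1, w2, b2

@torch.no_grad()
def ood_score(gen, x, n_samples=20):
    """Predictive entropy of the sampled ensemble; higher entropy suggests OOD."""
    probs = []
    for _ in range(n_samples):
        z = torch.randn(NOISE_DIM)                 # fresh noise -> new parameter set
        logits = target_forward(x, gen(z))
        probs.append(F.softmax(logits, dim=-1))
    mean_p = torch.stack(probs).mean(0)            # average ensemble prediction
    return -(mean_p * mean_p.clamp_min(1e-12).log()).sum(-1)

gen = ParamGenerator()                             # in practice, trained adversarially
x = torch.randn(4, IN_DIM)                         # a batch of flattened inputs
print(ood_score(gen, x))                           # threshold this score to flag OOD

In practice the generator would be trained so that every sampled parameter set performs well on the training distribution while remaining diverse; the entropy of the averaged ensemble prediction is then thresholded to flag OOD or adversarial inputs.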
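
The bilateral-filter preprocessing idea can likewise be sketched briefly. The following is an assumed, minimal differentiable bilateral filter written in PyTorch (the BilateralFilter module, kernel size, and sigma values are illustrative and not taken from the thesis); it shows how such a layer can be prepended to an arbitrary classifier so that high-frequency adversarial perturbations are smoothed before classification.

# Minimal sketch (assumed, not the thesis BFNet code) of a differentiable
# bilateral filter used as a preprocessing layer in front of a classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BilateralFilter(nn.Module):
    """Edge-preserving smoothing: each output pixel is a weighted average of its
    neighborhood, with weights = spatial Gaussian * intensity (range) Gaussian.
    Built from differentiable ops so gradients can flow through the filter."""
    def __init__(self, kernel_size=5, sigma_space=2.0, sigma_range=0.1):
        super().__init__()
        self.k = kernel_size
        ax = torch.arange(kernel_size) - kernel_size // 2
        yy, xx = torch.meshgrid(ax, ax, indexing="ij")
        spatial = torch.exp(-(xx**2 + yy**2) / (2 * sigma_space**2))
        self.register_buffer("spatial", spatial.flatten())        # (k*k,)
        self.sigma_range = sigma_range

    def forward(self, x):                                          # x: (B, C, H, W) in [0, 1]
        B, C, H, W = x.shape
        pad = self.k // 2
        patches = F.unfold(x, self.k, padding=pad)                 # (B, C*k*k, H*W)
        patches = patches.view(B, C, self.k * self.k, H * W)
        center = x.view(B, C, 1, H * W)
        range_w = torch.exp(-((patches - center) ** 2) / (2 * self.sigma_range**2))
        w = range_w * self.spatial.view(1, 1, -1, 1)
        out = (w * patches).sum(2) / w.sum(2).clamp_min(1e-8)
        return out.view(B, C, H, W)

# Usage: prepend the filter to any classifier so perturbed inputs are projected
# toward the space of natural images before being classified.
classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder model
bfnet = nn.Sequential(BilateralFilter(), classifier)
x_adv = torch.rand(2, 3, 32, 32)     # stand-in for an adversarial batch
print(bfnet(x_adv).shape)            # torch.Size([2, 10])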
