Uncertainty in Deep Learning with Implicit Neural Networks

Abstract

The ability to extract uncertainties from predictions is crucial for the adoption of deep learning systems in safety-critical applications. Uncertainty estimates can be used as a failure signal, which is necessary for automating complex tasks where safety is a concern. However, current deep learning systems typically do not provide reliable uncertainty estimates and can instead assign high probability to incorrect predictions. To mitigate this problem of overconfidence, this dissertation proposes three approaches that leverage the uncertainty within a distribution of models. Specifically, we consider the epistemic uncertainty given by an approximation to the posterior over model parameters. Prior work approximates this posterior using analytically tractable distributions, which are inflexible and tend to underestimate the uncertainty. Instead, we propose to use implicit distributions, which are computationally efficient to sample from and flexible enough to represent a wide range of posteriors. The contributions of this thesis show that implicit models enable better uncertainty estimates than prior work, and can be used for open-category prediction, adversarial example detection, and exploration in reinforcement learning. We begin by showing that implicit generative models with feature-space regularization can be used in the open-category setting to detect input distribution shift while retaining accuracy on training data. Next, we refine our approach by explicitly encouraging diversity among samples using particle-based variational inference. The uncertainty given by these diverse models is used for exploration in reinforcement learning: we show that in the model-based setting, uncertainty can serve as a novelty signal, driving exploration toward poorly understood areas of the environment. Finally, we turn to the fundamental problem of approximate Bayesian inference. We develop a framework for generative particle-based variational inference that allows for efficient sampling, places no restrictions on the approximate posterior, and improves our ability to estimate epistemic uncertainty.
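To make the central idea concrete, the following is a minimal sketch (not the dissertation's exact method) of estimating epistemic uncertainty with an implicit posterior: a small generator network maps Gaussian noise to the parameters of a simple regression model, and the spread of predictions across sampled parameter vectors serves as the uncertainty estimate. All names, layer sizes, and the single-layer predictive model are illustrative assumptions.

```python
# Sketch: implicit distribution over model parameters and the resulting
# Monte Carlo estimate of epistemic uncertainty. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

IN_DIM, OUT_DIM, NOISE_DIM = 4, 1, 8  # hypothetical sizes


class WeightGenerator(nn.Module):
    """Implicit posterior: noise z ~ N(0, I) is mapped to model weights."""

    def __init__(self):
        super().__init__()
        n_params = IN_DIM * OUT_DIM + OUT_DIM  # weight matrix + bias
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM, 64), nn.ReLU(), nn.Linear(64, n_params)
        )

    def sample_weights(self, n_samples):
        z = torch.randn(n_samples, NOISE_DIM)
        theta = self.net(z)
        w = theta[:, : IN_DIM * OUT_DIM].view(n_samples, OUT_DIM, IN_DIM)
        b = theta[:, IN_DIM * OUT_DIM:]
        return w, b


def predictive_mean_and_uncertainty(gen, x, n_samples=32):
    """Predictive mean and epistemic uncertainty (variance across models
    sampled from the implicit posterior) for a batch of inputs x."""
    w, b = gen.sample_weights(n_samples)
    preds = torch.stack([F.linear(x, w[i], b[i]) for i in range(n_samples)])
    return preds.mean(dim=0), preds.var(dim=0)


gen = WeightGenerator()
x = torch.randn(5, IN_DIM)  # a batch of 5 hypothetical inputs
mean, epistemic_var = predictive_mean_and_uncertainty(gen, x)
print(mean.shape, epistemic_var.shape)  # torch.Size([5, 1]) for both
```

In this sketch the disagreement among sampled models plays the role described in the abstract: high variance flags inputs the model family does not understand well, which is the quantity used as a failure signal, an out-of-distribution detector, or an exploration bonus.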
