5 research outputs found

    Assessing the reliability of deep neural networks

    Get PDF
    Deep Neural Networks (DNNs) have achieved astonishing results in the last two decades, fueled by ever larger datasets and the availability of high performance compute hardware. This led to breakthroughs in many applications such as image and speech recognition, natural language processing, autonomous driving, and drug discovery. Despite their success, the understanding of internal workings and the interpretability of predictions remains limited and DNNs are often treated as "black boxes". Especially for safety-critical applications where the well-being of humans is at risk, decisions based on predictions should consider associated uncertainties. Autonomous vehicles, for example, operate in a highly complex environment with potentially unpredictable situations that can lead to safety risks for pedestrians and other road users. In medical applications, decision based on incorrect predictions can have serious consequences for a patient's health. As a consequence, the topic of Uncertainty Quantification (UQ) has received increasing attention in recent years. The goal of UQ is to assign uncertainties to predictions in order to ensure the decision-making process is informed by potentially unreliable predictions. In addition, other tasks such as identifying model weaknesses, collecting additional data or detecting malicious attacks can be supported by uncertainty estimates. Unfortunately, UQ for DNNs is a particularly challenging task due to their high complexity and nonlinearity. Uncertainties which can be derived from traditional statistical models are often not directly applicable to DNNs. Therefore, the development of new UQ techniques for DNNs is of paramount importance to ensure safety-aware decision-making. This thesis evaluates existing UQ methods and proposes improvements and novel approaches which contribute to the reliability and trustworthiness of modern deep learning methodology. One of the core contributions of this work is the development of a novel generative learning framework with an integrated training of a One-vs-All (OvA) classifier. A Generative Adversarial Network (GAN) is trained in such a way that it is possible to sample from the boundary of the training distribution. These boundary samples are shielding the training dataset from the Out-of-Distribution (OoD) region. By making the GAN class-conditional, it is possible to shield each class separately, which integrates well with the formulation of an OvA classifier. The OvA classifier achieves outstanding results on the task of OoD detection and surpasses many previous works by large margins. In addition, the tight class shielding also improves the overall classification accuracy. A comprehensive and consistent evaluation on the tasks of False Positive, Out-of-Distribution and Adversarial Example Detection on a diverse selection of datasets provides insights into the strengths and weaknesses of existing methods and the proposed approaches

    UQGAN: A Unified Model for Uncertainty Quantification of Deep Classifiers trained via Conditional GANs

    Full text link
    We present an approach to quantifying both aleatoric and epistemic uncertainty for deep neural networks in image classification, based on generative adversarial networks (GANs). While most works in the literature that use GANs to generate out-of-distribution (OoD) examples only focus on the evaluation of OoD detection, we present a GAN based approach to learn a classifier that produces proper uncertainties for OoD examples as well as for false positives (FPs). Instead of shielding the entire in-distribution data with GAN generated OoD examples which is state-of-the-art, we shield each class separately with out-of-class examples generated by a conditional GAN and complement this with a one-vs-all image classifier. In our experiments, in particular on CIFAR10 and CIFAR100, we improve over the OoD detection and FP detection performance of state-of-the-art GAN-training based classifiers. Furthermore, we also find that the generated GAN examples do not significantly affect the calibration error of our classifier and result in a significant gain in model accuracy
    corecore