Assessing the reliability of deep neural networks

Abstract

Deep Neural Networks (DNNs) have achieved astonishing results over the last two decades, fueled by ever larger datasets and the availability of high-performance compute hardware. This has led to breakthroughs in many applications such as image and speech recognition, natural language processing, autonomous driving, and drug discovery. Despite their success, the understanding of their internal workings and the interpretability of their predictions remain limited, and DNNs are often treated as "black boxes". Especially in safety-critical applications where the well-being of humans is at risk, decisions based on predictions should take associated uncertainties into account. Autonomous vehicles, for example, operate in a highly complex environment with potentially unpredictable situations that can pose safety risks for pedestrians and other road users. In medical applications, decisions based on incorrect predictions can have serious consequences for a patient's health. As a consequence, the topic of Uncertainty Quantification (UQ) has received increasing attention in recent years. The goal of UQ is to assign uncertainties to predictions so that the decision-making process can account for potentially unreliable predictions. In addition, uncertainty estimates can support other tasks such as identifying model weaknesses, guiding the collection of additional data, or detecting malicious attacks. Unfortunately, UQ for DNNs is a particularly challenging task due to their high complexity and nonlinearity. Uncertainties that can be derived from traditional statistical models are often not directly applicable to DNNs. Therefore, the development of new UQ techniques for DNNs is of paramount importance to ensure safety-aware decision-making. This thesis evaluates existing UQ methods and proposes improvements and novel approaches which contribute to the reliability and trustworthiness of modern deep learning methodology.
One of the core contributions of this work is the development of a novel generative learning framework with integrated training of a One-vs-All (OvA) classifier. A Generative Adversarial Network (GAN) is trained in such a way that it is possible to sample from the boundary of the training distribution. These boundary samples shield the training dataset from the Out-of-Distribution (OoD) region. By making the GAN class-conditional, it is possible to shield each class separately, which integrates well with the formulation of an OvA classifier. The OvA classifier achieves outstanding results on the task of OoD detection and surpasses many previous works by large margins. In addition, the tight class shielding also improves the overall classification accuracy. A comprehensive and consistent evaluation on the tasks of False Positive, Out-of-Distribution, and Adversarial Example Detection on a diverse selection of datasets provides insights into the strengths and weaknesses of existing methods and the proposed approaches.
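The decision rule of an OvA classifier can be illustrated with a minimal sketch: each class has its own binary in-class-vs-rest score, and an input is flagged as OoD when no class accepts it with sufficient confidence. The function names, logit values, and threshold below are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ova_predict(logits, threshold=0.5):
    """Classify from per-class One-vs-All logits.

    logits: (K,) array, one independent binary logit per class.
    Returns (predicted class, its score, OoD flag). The input is
    flagged as OoD when even the best-scoring class rejects it.
    """
    scores = sigmoid(np.asarray(logits, dtype=float))
    best = int(np.argmax(scores))
    is_ood = bool(scores[best] < threshold)
    return best, float(scores[best]), is_ood

# Class 2 accepts the input with high confidence: in-distribution.
print(ova_predict(np.array([-4.0, -2.0, 3.0])))
# All classes reject the input: flagged as OoD.
print(ova_predict(np.array([-5.0, -4.0, -3.0])))
```

Unlike a softmax head, which forces the scores to sum to one and therefore always assigns high probability to some class, the independent sigmoid scores allow every class to reject an input simultaneously, which is what makes the OvA formulation natural for OoD detection.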
