A Semantic Testing Approach for Deep Neural Networks Using Bayesian Network Abstraction

Abstract

The studies presented in this thesis investigate the internal decision process of Deep Neural Networks (DNNs) and test their performance based on feature importance weights. Deep learning models have achieved state-of-the-art performance on a variety of machine learning tasks, which has led to their integration into safety-critical domains such as autonomous vehicles. However, the susceptibility of deep learning models to adversarial examples raises serious concerns about their application in such contexts. Most existing testing methodologies fail to consider the interactions between neurons and the semantic representations formed in a DNN during training. This thesis designs weight-based semantic testing metrics that first abstract the internal behaviour of a DNN into a Bayesian Network (BN) and quantify the contribution of each hidden feature to the network's decisions as an importance weight; test data coverage is then measured according to these feature weights. These approaches were developed to answer the main research question: "When testing these learning models' performance, is measuring the coverage of the semantic aspects of deep neural networks, treating each internal component according to its contribution to the decision, a better measure of trustworthiness than relying on traditional unweighted structural measures?"

This thesis makes three main contributions to the field of machine learning. First, it proposes a novel technique for estimating the importance of a neural network's latent features through their abstraction into a BN. The algorithm analyses the sensitivity of each extracted feature to distributional shifts by observing changes in the BN's distribution. The experimental results showed that computing the distance between two BN probability distributions, one built from clean data and one from data perturbed by interval shifts or adversarial attacks, can detect a distribution shift wherever it exists. The hidden features are then assigned weight scores according to the computed sensitivity distances.

Secondly, to further justify the contribution of each latent feature to the classification decision, the BN abstraction was extended to perform prediction. The BN's performance in predicting input classification labels was shown to be a good approximation of the original DNN. Moreover, feature perturbation on the BN classifier demonstrated that each feature influences prediction accuracy differently, thereby validating the proposed feature importance assumption.

Lastly, the developed feature importance measure was used to assess the extent to which a given test dataset exercises the high-level features learned by the hidden layers of the DNN, prioritising the most significant representations when generating new test inputs. The evaluation compared the initial and final coverage of the proposed weighting approach against plain BN-based feature coverage. The testing coverage experiments indicated that the proposed weighted metrics achieved higher coverage than the original feature metrics while remaining equally effective at finding adversarial samples during test case generation. Furthermore, the weighted metrics guaranteed that the achieved coverage percentage exercised the most crucial components, since the test generation algorithm was directed to synthesise new inputs targeting the features with the highest importance scores.
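To make the sensitivity-based weighting idea concrete, the following is a minimal sketch, not the thesis's actual algorithm: all function and variable names are hypothetical, discretised hidden-layer activations stand in for the BN node marginals, and each feature is scored by the Jensen-Shannon distance between its clean and perturbed marginal distributions.

```python
# Minimal sketch of sensitivity-based feature weighting (hypothetical
# names; not the thesis's implementation). Histograms of discretised
# hidden activations approximate BN node marginals; a feature's weight
# is the distance its marginal moves under perturbation.
import numpy as np
from scipy.spatial.distance import jensenshannon

def feature_weights(clean_acts, perturbed_acts, n_bins=10):
    """clean_acts, perturbed_acts: (n_samples, n_features) activations."""
    n_features = clean_acts.shape[1]
    weights = np.empty(n_features)
    for f in range(n_features):
        # Shared bin edges so the two histograms are comparable.
        lo = min(clean_acts[:, f].min(), perturbed_acts[:, f].min())
        hi = max(clean_acts[:, f].max(), perturbed_acts[:, f].max())
        edges = np.linspace(lo, hi, n_bins + 1)
        p, _ = np.histogram(clean_acts[:, f], bins=edges)
        q, _ = np.histogram(perturbed_acts[:, f], bins=edges)
        # Smooth and normalise the counts into probability distributions.
        p = (p + 1e-9) / (p.sum() + n_bins * 1e-9)
        q = (q + 1e-9) / (q.sum() + n_bins * 1e-9)
        # Jensen-Shannon distance with base 2 lies in [0, 1].
        weights[f] = jensenshannon(p, q, base=2)
    return weights / weights.sum()  # relative importance scores
```

Any symmetric distribution distance could stand in for the Jensen-Shannon distance here; the essential point is that features whose abstracted distributions shift more under perturbation receive larger weights.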
Hence, this study furthers the evidence for the trustworthy behaviour of DNNs.
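The weighted coverage measure and importance-guided target selection summarised above admit an equally small sketch under the same assumptions (hypothetical helper names, building on `feature_weights` from the previous sketch):

```python
# Sketch of weighted feature coverage and importance-guided targeting
# (hypothetical names; illustrative only).
import numpy as np

def weighted_coverage(covered, weights):
    """Fraction of total importance mass exercised by the test suite.
    covered: boolean mask over features; weights: importance scores."""
    return float(weights[covered].sum() / weights.sum())

def next_target_feature(covered, weights):
    """Direct test generation at the most important uncovered feature."""
    uncovered = np.flatnonzero(~covered)
    return int(uncovered[np.argmax(weights[uncovered])])
```

Under this metric, coverage rises fastest when newly generated tests exercise high-weight features first, which is exactly the prioritisation behaviour the abstract describes.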
