21 research outputs found

    An explainable Transformer-based deep learning model for the prediction of incident heart failure

    Predicting the incidence of complex chronic conditions such as heart failure is challenging. Deep learning models applied to rich electronic health records may improve prediction, but they remain unexplainable, which hampers their wider use in medical practice. We developed a novel Transformer deep-learning model for more accurate and yet explainable prediction of incident heart failure, involving 100,071 patients from longitudinal linked electronic health records across the UK. On internal 5-fold cross-validation and held-out external validation, our model achieved areas under the receiver operating characteristic curve of 0.93 and 0.93 and areas under the precision-recall curve of 0.69 and 0.70, respectively, and outperformed existing deep learning models. Predictor groups included all community and hospital diagnoses and medications, contextualised within the age and calendar year of each patient's clinical encounter. The importance of contextualised medical information was revealed in a number of sensitivity analyses, and our perturbation method provided a way of identifying factors contributing to risk. Many of the identified risk factors were consistent with existing knowledge from clinical and epidemiological research, but several new associations were revealed that had not been considered in expert-driven risk prediction models.
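    The contextualised-input idea can be illustrated with a short sketch: each diagnosis or medication code is embedded together with the age and calendar year of its encounter before being passed to a Transformer encoder. This is a minimal PyTorch sketch, not the authors' implementation; the vocabulary sizes, layer sizes, and mean-pooling head are illustrative assumptions.

    import torch
    import torch.nn as nn

    class EHRTransformer(nn.Module):
        """Toy Transformer over EHR code sequences with age/year context (assumed sizes)."""
        def __init__(self, n_codes=10000, n_ages=120, n_years=60, d_model=128):
            super().__init__()
            self.code_emb = nn.Embedding(n_codes, d_model, padding_idx=0)  # diagnoses/medications
            self.age_emb = nn.Embedding(n_ages, d_model)                   # age at each encounter
            self.year_emb = nn.Embedding(n_years, d_model)                 # calendar-year index
            layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Linear(d_model, 1)                              # incident-HF risk logit

        def forward(self, codes, ages, years, pad_mask):
            # Contextualise each medical code with the age and calendar year of its encounter.
            x = self.code_emb(codes) + self.age_emb(ages) + self.year_emb(years)
            h = self.encoder(x, src_key_padding_mask=pad_mask)             # pad_mask: True where padded
            keep = (~pad_mask).unsqueeze(-1).float()
            pooled = (h * keep).sum(1) / keep.sum(1).clamp(min=1.0)        # mean over real encounters
            return self.head(pooled).squeeze(-1)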

    Train the Neural Network by Abstract Images

    Just as a textbook guides students' learning, training data plays a significant role in a network's training. In most cases, people tend to use big data to train the network, which leads to two problems. First, the knowledge learned by the network is out of control. Second, big data occupies a huge amount of storage. In this paper, we use concepts-based knowledge visualization [33] to visualize the knowledge learned by the model. Based on the observation results and information theory, we make three conjectures about the key information provided by the dataset. Finally, we use experiments to prove that artificially abstracted data can be used to train networks, which solves the problems mentioned above. The experiment is designed around Mask-RCNN, which is used to detect and classify three typical human poses on a construction site.
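    As a rough illustration of such an experiment, the sketch below fine-tunes a torchvision Mask R-CNN for a small number of pose classes on abstracted images; the data loader and the three pose labels are assumptions, not the paper's actual pipeline.

    import torch
    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
    from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

    num_classes = 4  # background + three assumed construction-site poses
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

    # Replace the box and mask heads so they predict the abstracted pose classes.
    in_feats = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, num_classes)
    mask_feats = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(mask_feats, 256, num_classes)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
    model.train()
    # `abstract_loader` is a hypothetical DataLoader yielding (images, targets)
    # built from the artificially abstracted images.
    # for images, targets in abstract_loader:
    #     loss_dict = model(images, targets)   # classification, box, and mask losses
    #     loss = sum(loss_dict.values())
    #     optimizer.zero_grad(); loss.backward(); optimizer.step()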

    Interpreting Multivariate Shapley Interactions in DNNs

    This paper aims to explain deep neural networks (DNNs) from the perspective of multivariate interactions. In this paper, we define and quantify the significance of interactions among multiple input variables of the DNN. Input variables with strong interactions usually form a coalition and reflect prototype features, which are memorized and used by the DNN for inference. We define the significance of interactions based on the Shapley value, which is designed to assign to each input variable its attribution to the inference. We have conducted experiments with various DNNs, and the experimental results demonstrate the effectiveness of the proposed method.
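    One common way to operationalise such an interaction score is to compare the Shapley attribution of a set of variables treated as a single player with the sum of its members' individual attributions. The Monte Carlo sketch below follows that formulation; it is an approximation under assumed placeholders (a `model` returning a scalar, an input `x`, and a `baseline`) and is not necessarily the paper's exact definition.

    import numpy as np

    def shapley_value(model, x, baseline, players, target, n_samples=200):
        # Monte Carlo estimate of the Shapley attribution of `target` (a list of
        # feature indices treated as one player) in a game over `players` + target.
        rng = np.random.default_rng(0)
        total = 0.0
        for _ in range(n_samples):
            order = list(rng.permutation(players))
            preceding = order[: rng.integers(0, len(order) + 1)]  # coalition drawn before the target
            with_target = baseline.copy()
            with_target[preceding + list(target)] = x[preceding + list(target)]
            without_target = baseline.copy()
            without_target[preceding] = x[preceding]
            total += model(with_target) - model(without_target)
        return total / n_samples

    def pairwise_interaction(model, x, baseline, i, j):
        # Interaction of variables i and j: joint attribution minus the individual ones.
        others = [k for k in range(len(x)) if k not in (i, j)]
        joint = shapley_value(model, x, baseline, others, [i, j])
        phi_i = shapley_value(model, x, baseline, others + [j], [i])
        phi_j = shapley_value(model, x, baseline, others + [i], [j])
        return joint - (phi_i + phi_j)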

    A Diagnostic Study of Explainability Techniques for Text Classification

    Recent developments in machine learning have introduced models that approach human performance at the cost of increased architectural complexity. Efforts to make the rationales behind the models' predictions transparent have inspired an abundance of new explainability techniques. Provided with an already trained model, they compute saliency scores for the words of an input instance. However, there exists no definitive guide on (i) how to choose such a technique given a particular application task and model architecture, and (ii) the benefits and drawbacks of using each such technique. In this paper, we develop a comprehensive list of diagnostic properties for evaluating existing explainability techniques. We then employ the proposed list to compare a set of diverse explainability techniques on downstream text classification tasks and neural network architectures. We also compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones. Overall, we find that gradient-based explanations perform best across tasks and model architectures, and we present further insights into the properties of the reviewed explainability techniques.
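    As a concrete example of one gradient-based technique, the sketch below computes gradient-times-input saliency scores over token embeddings for a toy classifier; the model and the random "tokenised sentence" are placeholders, not the systems evaluated in the paper.

    import torch
    import torch.nn as nn

    vocab_size, emb_dim, seq_len, n_classes = 1000, 64, 8, 2
    embedding = nn.Embedding(vocab_size, emb_dim)
    classifier = nn.Sequential(nn.Flatten(), nn.Linear(emb_dim * seq_len, n_classes))

    tokens = torch.randint(0, vocab_size, (1, seq_len))         # stand-in for a tokenised sentence
    embedded = embedding(tokens).detach().requires_grad_(True)  # leaf tensor so gradients are retained
    logits = classifier(embedded)
    predicted = logits.argmax(dim=-1).item()
    logits[0, predicted].backward()                             # gradient of the predicted class score

    # Gradient-times-input saliency: one score per token, larger magnitude = more salient word.
    saliency = (embedded.grad * embedded).sum(dim=-1).squeeze(0)
    print(saliency.tolist())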