3,866 research outputs found

    Interpreting Neural Networks With Nearest Neighbors

    Local model interpretation methods explain individual predictions by assigning an importance value to each input feature. This value is often determined by measuring the change in confidence when a feature is removed. However, the confidence of neural networks is not a robust measure of model uncertainty, which makes it difficult to reliably judge the importance of input features. We address this by changing the test-time behavior of neural networks using Deep k-Nearest Neighbors. Without harming text classification accuracy, this algorithm provides a more robust uncertainty metric, which we use to generate feature importance values. The resulting interpretations better align with human perception than baseline methods. Finally, we use our interpretation method to analyze model predictions on dataset annotation artifacts.
    Comment: EMNLP 2018 BlackboxNLP Workshop
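
    As a concrete illustration of the leave-one-out scheme described above, the sketch below computes word importances from the drop in a classifier's score when a word is removed. The `model.predict_proba` interface and all names are hypothetical; the paper's contribution is to replace the raw confidence used here with a more robust Deep k-Nearest Neighbors uncertainty score.

```python
def leave_one_out_importance(model, tokens, label):
    """Importance of a token = drop in the model's score for `label`
    when that token is removed from the input.

    `model.predict_proba` is a hypothetical interface returning a
    mapping from labels to scores; any scalar uncertainty metric
    (e.g. a Deep k-NN score) could stand in for raw confidence.
    """
    base = model.predict_proba(" ".join(tokens))[label]
    importances = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]
        score = model.predict_proba(" ".join(reduced))[label]
        importances.append(base - score)  # positive = token supports `label`
    return importances
```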

    When Explanations Lie: Why Many Modified BP Attributions Fail

    Attribution methods aim to explain a neural network's prediction by highlighting the most relevant image areas. A popular approach is to backpropagate (BP) a custom relevance score using modified rules, rather than the gradient. We analyze an extensive set of modified BP methods: Deep Taylor Decomposition, Layer-wise Relevance Propagation (LRP), Excitation BP, PatternAttribution, DeepLIFT, Deconv, RectGrad, and Guided BP. We find empirically that the explanations of all mentioned methods, except for DeepLIFT, are independent of the parameters of later layers. We provide theoretical insights for this surprising behavior and also analyze why DeepLIFT does not suffer from this limitation. Empirically, we measure how information of later layers is ignored by using our new metric, cosine similarity convergence (CSC). The paper provides a framework to assess the faithfulness of new and existing modified BP methods theoretically and empirically. For code see: https://github.com/berleon/when-explanations-lie
    Comment: Published in ICML 2020, Camera Ready Version
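
    The core computation behind a metric like CSC can be sketched in a few lines: compare the attribution map from the trained model with the map obtained after re-initializing later layers. The function below is a minimal NumPy sketch under that reading; the paper's exact CSC definition may differ.

```python
import numpy as np

def attribution_similarity(saliency_a, saliency_b):
    """Cosine similarity between two flattened attribution maps.

    If a modified-BP method ignores the parameters of later layers,
    the map for the trained model and the map computed after
    re-initializing those layers stay nearly identical (value near 1).
    """
    a, b = saliency_a.ravel(), saliency_b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```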

    Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values

    Explaining the output of a complicated machine learning model like a deep neural network (DNN) is a central challenge in machine learning. Several proposed local explanation methods address this issue by identifying what dimensions of a single input are most responsible for a DNN's output. The goal of this work is to assess the sensitivity of local explanations to DNN parameter values. Somewhat surprisingly, we find that DNNs with randomly-initialized weights produce explanations that are both visually and quantitatively similar to those produced by DNNs with learned weights. Our conjecture is that this phenomenon occurs because these explanations are dominated by the lower-level features of a DNN, and that a DNN's architecture provides a strong prior which significantly affects the representations learned at these lower layers. NOTE: This work is now subsumed by our recent manuscript, Sanity Checks for Saliency Maps (to appear at NIPS 2018), where we expand on these findings and address concerns raised in Sundararajan et al. (2018).
    Comment: Workshop Track, International Conference on Learning Representations (ICLR)
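
    The randomization comparison reported above is easy to reproduce in outline: compute a plain input-gradient saliency map for a trained network and for an identically-shaped network with fresh random weights, then compare the two maps. The PyTorch sketch below uses untrained stand-in models; all names and shapes are illustrative, not taken from the paper's code.

```python
import torch
import torch.nn as nn

def input_gradient_saliency(model, x):
    """Absolute input gradient: one of the simplest local explanations."""
    x = x.clone().requires_grad_(True)
    model(x).sum().backward()
    return x.grad.abs().detach()

def make_net():
    return nn.Sequential(nn.Conv2d(1, 8, 3), nn.ReLU(),
                         nn.Flatten(), nn.Linear(8 * 26 * 26, 10))

trained, random_net = make_net(), make_net()  # stand-ins for trained vs. random weights
x = torch.randn(1, 1, 28, 28)

# The paper's finding: this similarity is often surprisingly high.
sim = torch.nn.functional.cosine_similarity(
    input_gradient_saliency(trained, x).flatten(),
    input_gradient_saliency(random_net, x).flatten(), dim=0)
```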

    A Categorisation of Post-hoc Explanations for Predictive Models

    The ubiquity of machine learning based predictive models in modern society naturally leads people to ask how trustworthy those models are. In predictive modeling, there is often a trade-off between accuracy and interpretability. For instance, doctors would like to know how effective some treatment will be for a patient, or why the model suggested a particular medication for a patient exhibiting certain symptoms. We acknowledge that the necessity for interpretability is a consequence of an incomplete formalisation of the problem, or, more precisely, of multiple meanings attached to a particular concept. For certain problems, it is not enough to get the answer (what); the model also has to provide an explanation of how it came to that conclusion (why), because a correct prediction only partially solves the original problem. In this article we extend an existing categorisation of techniques that aid model interpretability, and test this categorisation.
    Comment: 5 pages, 3 figures, AAAI 2019 Spring Symposia (#SSS19)

    Towards Prediction Explainability through Sparse Communication

    Explainability is a topic of growing importance in NLP. In this work, we provide a unified perspective of explainability as a communication problem between an explainer and a layperson about a classifier's decision. We use this framework to compare several prior approaches for extracting explanations, including gradient methods, representation erasure, and attention mechanisms, in terms of their communication success. In addition, we reinterpret these methods in light of classical feature selection, and we use this as inspiration to propose new embedded methods for explainability, through the use of selective, sparse attention. Experiments in text classification, natural language entailment, and machine translation, using different configurations of explainers and laypeople (including both machines and humans), reveal an advantage of attention-based explainers over gradient and erasure methods. Furthermore, human evaluation experiments show promising results with post-hoc explainers trained to optimize communication success and faithfulness.
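
    The "selective, sparse attention" mentioned above builds on transformations such as sparsemax which, unlike softmax, can assign exactly zero weight to irrelevant tokens, so the surviving tokens directly form a compact explanation. Below is a minimal NumPy sketch of sparsemax (Martins & Astudillo, 2016); treating it as a stand-in for the paper's embedded explainers is an assumption on our part.

```python
import numpy as np

def sparsemax(z):
    """Project scores z onto the probability simplex (sparsemax).

    Unlike softmax, the result can contain exact zeros, so the tokens
    with non-zero weight form a sparse explanation by construction.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    ks = np.arange(1, len(z) + 1)
    k = ks[1 + ks * z_sorted > cumsum][-1]  # support size
    tau = (cumsum[k - 1] - 1.0) / k
    return np.maximum(z - tau, 0.0)

print(sparsemax([2.0, 0.5, 0.4]))  # [1. 0. 0.] -- all weight on the first token
```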

    Explaining a black-box using Deep Variational Information Bottleneck Approach

    Interpretable machine learning has gained much attention recently. Briefness and comprehensiveness are both necessary in order to provide a large amount of information concisely when explaining a black-box decision system. However, existing interpretable machine learning methods fail to consider briefness and comprehensiveness simultaneously, leading to redundant explanations. We propose the variational information bottleneck for interpretation, VIBI, a system-agnostic interpretable method that provides a brief but comprehensive explanation. VIBI adopts an information-theoretic principle, the information bottleneck principle, as a criterion for finding such explanations. For each instance, VIBI selects key features that are maximally compressed about an input (briefness) and informative about the decision made by the black-box system on that input (comprehensiveness). We evaluate VIBI on three datasets and compare it with state-of-the-art interpretable machine learning methods in terms of both interpretability and fidelity, evaluated by human and quantitative metrics.
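
    The information bottleneck criterion that VIBI instantiates can be written, in its generic textbook form, as choosing a stochastic summary t of the input x that is compressed yet predictive of the black-box output y (the paper optimizes a variational bound; the plain objective below is not VIBI's exact loss):

```latex
\max_{p(t \mid x)} \; I(t; y) \;-\; \beta \, I(t; x)
```

    Here I(·;·) denotes mutual information: penalizing I(t;x) enforces briefness, rewarding I(t;y) enforces comprehensiveness, and β trades the two off.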

    Detecting and interpreting myocardial infarction using fully convolutional neural networks

    Objective: We aim to provide an algorithm for the detection of myocardial infarction that operates directly on ECG data without any preprocessing, and to investigate its decision criteria. Approach: We train an ensemble of fully convolutional neural networks on the PTB ECG dataset and apply state-of-the-art attribution methods. Main results: Our classifier reaches 93.3% sensitivity and 89.7% specificity, evaluated using 10-fold cross-validation with sampling based on patients. The presented method outperforms state-of-the-art approaches and reaches the performance level of human cardiologists for detection of myocardial infarction. We are able to discriminate channel-specific regions that contribute most significantly to the neural network's decision. Interestingly, the network's decision is influenced by signs also recognized by human cardiologists as indicative of myocardial infarction. Significance: Our results demonstrate the high prospects of algorithmic ECG analysis for future clinical applications, considering both its quantitative performance and the possibility of assessing decision criteria on a per-example basis, which enhances the comprehensibility of the approach.
    Comment: 11 pages, 4 figures
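
    "10-fold cross-validation with sampling based on patients" means that every recording from a given patient lands in a single fold, so the classifier is always evaluated on unseen patients. A sketch with scikit-learn's GroupKFold and synthetic stand-in data (shapes and names are illustrative, not from the paper's code):

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Synthetic stand-ins: 1000 ECG segments, a binary infarction label,
# and one patient id per segment.
X = np.random.randn(1000, 12, 256)             # 12-lead segments
y = np.random.randint(0, 2, size=1000)
patient_ids = np.random.randint(0, 100, size=1000)

gkf = GroupKFold(n_splits=10)
for train_idx, test_idx in gkf.split(X, y, groups=patient_ids):
    # No patient appears in both the training and the test fold.
    assert set(patient_ids[train_idx]).isdisjoint(patient_ids[test_idx])
```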

    Evaluating Recurrent Neural Network Explanations

    Recently, several methods have been proposed to explain the predictions of recurrent neural networks (RNNs), in particular of LSTMs. The goal of these methods is to understand the network's decisions by assigning to each input variable, e.g., a word, a relevance indicating to which extent it contributed to a particular prediction. In previous works, some of these methods were not yet compared to one another, or were evaluated only qualitatively. We close this gap by systematically and quantitatively comparing these methods in different settings, namely (1) a toy arithmetic task, which we use as a sanity check, (2) a five-class sentiment prediction task on movie reviews, and (3) an exploration of the usefulness of word relevances for building sentence-level representations. Lastly, using the method that performed best in our experiments, we show how specific linguistic phenomena, such as negation in sentiment analysis, are reflected in relevance patterns, and how relevance visualization can help to understand the misclassification of individual samples.
    Comment: 14 pages, accepted for ACL'19 Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
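
    The toy arithmetic sanity check has a clean idealized form: for a model that computes a sum, a sound relevance method should assign each operand exactly its own contribution, which gradient × input does by construction. A minimal PyTorch sketch follows (the paper evaluates LSTMs trained on such tasks; the lambda below is the idealized stand-in):

```python
import torch

def grad_times_input(f, x):
    """Gradient * input relevance for a scalar-valued function f."""
    x = x.clone().requires_grad_(True)
    f(x).backward()
    return (x.grad * x).detach()

x = torch.tensor([3.0, -5.0])
print(grad_times_input(lambda v: v.sum(), x))  # tensor([ 3., -5.]) -- matches the true addends
```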

    An Empirical Study towards Understanding How Deep Convolutional Nets Recognize Falls

    Detecting unintended falls is essential for ambient intelligence and the healthcare of elderly people living alone. In recent years, deep convolutional nets have been widely used in human action analysis, and a number of fall detection methods have been built on them. Despite their highly effective performance, it is still not clear how the convolutional nets recognize falls. In this paper, instead of proposing a novel approach, we perform a systematic empirical study, attempting to investigate the underlying fall recognition process. We investigate four tasks, involving five types of input modalities, seven net instances, and different training samples. The obtained quantitative and qualitative results reveal the patterns that the nets tend to learn, and several factors that can heavily influence performance on fall recognition. We expect these conclusions to be useful for developing better deep learning solutions for fall detection systems.
    Comment: published at the Sixth International Workshop on Assistive Computer Vision and Robotics (ACVR), in conjunction with the European Conference on Computer Vision (ECCV), Munich, 2018

    HDLTex: Hierarchical Deep Learning for Text Classification

    The continually increasing number of documents produced each year necessitates ever-improving information processing methods for searching, retrieving, and organizing text. Central to these information processing methods is document classification, which has become an important application for supervised learning. Recently, the performance of traditional classifiers has degraded as the number of documents has increased, because growth in the number of documents has been accompanied by growth in the number of categories. This paper approaches the problem differently from current document classification methods, which view the problem as multi-class classification. Instead, we perform hierarchical classification using an approach we call Hierarchical Deep Learning for Text classification (HDLTex). HDLTex employs stacks of deep learning architectures to provide specialized understanding at each level of the document hierarchy.
    Comment: ICMLA 2017
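
    The stacking idea is easy to see in miniature: a level-1 model predicts a document's parent category, and a separate level-2 model per parent predicts the child label. The sketch below uses TF-IDF plus logistic regression purely to keep it short; HDLTex stacks deep architectures, and all data and labels here are made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy corpus with a two-level label hierarchy (made-up examples).
docs = ["neural nets for vision", "protein folding dynamics",
        "transformers for translation", "gene expression analysis"]
parents = ["cs", "bio", "cs", "bio"]
children = ["cs.CV", "bio.BM", "cs.CL", "bio.GN"]

vec = TfidfVectorizer().fit(docs)
X = vec.transform(docs)

level1 = LogisticRegression().fit(X, parents)  # predicts the parent category
level2 = {}                                    # one child classifier per parent
for p in set(parents):
    rows = [i for i, pp in enumerate(parents) if pp == p]
    level2[p] = LogisticRegression().fit(X[rows], [children[i] for i in rows])

def predict(text):
    x = vec.transform([text])
    parent = level1.predict(x)[0]
    return parent, level2[parent].predict(x)[0]
```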