On Network Science and Mutual Information for Explaining Deep Neural Networks
In this paper, we present a new approach to interpret deep learning models.
By coupling mutual information with network science, we explore how information
flows through feedforward networks. We show that efficiently approximating
mutual information allows us to create an information measure that quantifies
how much information flows between any two neurons of a deep learning model. To
that end, we propose NIF, Neural Information Flow, a technique for codifying
information flow that exposes deep learning model internals and provides
feature attributions.
Comment: ICASSP 2020 (shorter version appeared at AAAI-19 Workshop on Network
Interpretability for Deep Learning)
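To make the idea of an information measure between two neurons concrete, here is a minimal sketch that estimates mutual information from recorded activations with a simple histogram (plug-in) estimator. This is an assumption for illustration only, not the approximation used in the paper; the function name `binned_mutual_information`, the bin count, and the synthetic activations are all hypothetical.

```python
import numpy as np

def binned_mutual_information(x, y, bins=16):
    """Plug-in estimate of I(X; Y) in nats from a 2-D histogram of two activation vectors."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()                  # empirical joint distribution
    p_x = p_xy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    mask = p_xy > 0                             # skip empty cells to avoid log(0)
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

# Synthetic stand-ins for the activations of two neurons over 5000 inputs.
rng = np.random.default_rng(0)
a = rng.normal(size=5000)
b_dep = np.tanh(a + 0.1 * rng.normal(size=5000))  # driven by neuron a
b_ind = rng.normal(size=5000)                     # unrelated to neuron a

mi_dep = binned_mutual_information(a, b_dep)
mi_ind = binned_mutual_information(a, b_ind)
```

Under these assumptions the dependent pair should score well above the independent pair. Binned estimators are biased upward for small samples, which is one reason a more careful approximation would be needed in practice.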
Response Characterization for Auditing Cell Dynamics in Long Short-term Memory Networks
In this paper, we introduce a novel method to interpret recurrent neural
networks (RNNs), particularly long short-term memory networks (LSTMs), at the
cellular level. We propose a systematic pipeline for interpreting individual
hidden state dynamics within the network using response characterization
methods. The ranked contribution of individual cells to the network's output is
computed by analyzing a set of interpretable metrics of their decoupled step
and sinusoidal responses. As a result, our method is able to uniquely identify
neurons with insightful dynamics, quantify relationships between dynamical
properties and test accuracy through ablation analysis, and interpret the
impact of network capacity on a network's dynamical distribution. Finally, we
demonstrate generalizability and scalability of our method by evaluating a
series of different benchmark sequential datasets.
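As a rough illustration of what a step-response analysis of LSTM cells could look like, the sketch below drives a small LSTM with a step input and ranks hidden units by how far they move. Everything here is an assumption: the weights are random and untrained, the hand-written `lstm_hidden_trajectory` function stands in for a real model, and the two metrics (`final_shift`, `peak_dev`) are hypothetical choices, not the metrics defined in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_hidden_trajectory(inputs, W, U, b):
    """Run a single-layer LSTM over a scalar input sequence; return hidden states, shape (T, H)."""
    H = U.shape[1]
    h = np.zeros(H)
    c = np.zeros(H)
    states = []
    for x_t in inputs:
        z = W * x_t + U @ h + b            # stacked pre-activations, shape (4H,)
        i, f, g, o = np.split(z, 4)        # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h.copy())
    return np.array(states)

# Hypothetical untrained parameters for an 8-unit LSTM with scalar input.
rng = np.random.default_rng(0)
H = 8
W = rng.normal(scale=0.5, size=4 * H)
U = rng.normal(scale=0.5, size=(4 * H, H))
b = np.zeros(4 * H)

# Step input: 10 steps at 0, then 40 steps at 1.
step = np.concatenate([np.zeros(10), np.ones(40)])
trajectory = lstm_hidden_trajectory(step, W, U, b)

baseline = trajectory[9]                                  # state just before the step
final_shift = trajectory[-1] - baseline                   # change at the last time step
peak_dev = np.abs(trajectory[10:] - baseline).max(axis=0) # largest excursion per unit
ranking = np.argsort(-np.abs(final_shift))                # most responsive units first
```

A sinusoidal probe could be run the same way by swapping `step` for `np.sin(...)` over a range of frequencies and comparing each unit's output amplitude to the input amplitude.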