Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques
specifically developed for analyzing and understanding the inner-workings and
representations acquired by neural models of language. Approaches included:
systematic manipulation of input to neural networks and investigating the
impact on their performance, testing whether interpretable knowledge can be
decoded from intermediate representations acquired by neural networks,
proposing modifications to neural network architectures to make their knowledge
state or generated output more explainable, and examining the performance of
networks on simplified or formal languages. Here we review a number of
representative studies in each category.
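One of the approaches listed above, testing whether interpretable knowledge can be decoded from intermediate representations, is commonly implemented as a probing (diagnostic) classifier: a simple model trained to predict a property from frozen hidden states. A minimal sketch, using synthetic representations and a toy binary label rather than any real model's hidden states:

```python
# Hedged sketch of a probing ("diagnostic") classifier: test whether a toy
# binary property can be linearly decoded from frozen intermediate
# representations. The representations and labels below are synthetic
# placeholders, not outputs of any model discussed in the workshop.
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are hidden states extracted from a frozen network:
# 200 tokens, 64-dimensional representations.
reps = rng.normal(size=(200, 64))
# Toy "linguistic" label correlated with one direction of the space.
labels = (reps[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)

# Simple logistic-regression probe trained by gradient descent.
w = np.zeros(64)
b = 0.0
lr = 0.5
for _ in range(300):
    z = reps @ w + b
    p = 1.0 / (1.0 + np.exp(-z))          # predicted probabilities
    grad = p - labels                      # logistic-loss gradient signal
    w -= lr * (reps.T @ grad) / len(labels)
    b -= lr * grad.mean()

acc = ((reps @ w + b > 0).astype(int) == labels).mean()
print(f"probe accuracy: {acc:.2f}")
```

High probe accuracy is taken as evidence that the property is (linearly) encoded in the representation; a more careful analysis would compare against a control task or a probe trained on random representations.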
Interpreting Deep Visual Representations via Network Dissection
The success of recent deep convolutional neural networks (CNNs) depends on
learning hidden representations that can summarize the important factors of
variation behind the data. However, CNNs are often criticized as black boxes
that lack interpretability, since they have millions of unexplained model
parameters. In this work, we describe Network Dissection, a method that
interprets networks by providing labels for the units of their deep visual
representations. The proposed method quantifies the interpretability of CNN
representations by evaluating the alignment between individual hidden units and
a set of visual semantic concepts. By identifying the best alignments, units
are given human interpretable labels across a range of objects, parts, scenes,
textures, materials, and colors. The method reveals that deep representations
are more transparent and interpretable than expected: we find that
representations are significantly more interpretable than they would be under a
random equivalently powerful basis. We apply the method to interpret and
compare the latent representations of various network architectures trained to
solve different supervised and self-supervised training tasks. We then examine
factors affecting the network interpretability such as the number of the
training iterations, regularizations, different initializations, and the
network depth and width. Finally we show that the interpreted units can be used
to provide explicit explanations of a prediction given by a CNN for an image.
Our results highlight that interpretability is an important property of deep
neural networks that provides new insights into their hierarchical structure.
Comment: B. Zhou and D. Bau contributed equally to this work. 15 pages, 27 figures.
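The unit-concept alignment at the core of Network Dissection is scored as an intersection-over-union (IoU) between a unit's thresholded activation map and a concept's segmentation mask; the best-aligned concept gives the unit its label. A minimal sketch on synthetic arrays (the quantile threshold and the toy activation map are illustrative assumptions, not the paper's Broden dataset or exact settings):

```python
# Hedged sketch of the Network Dissection measurement: score one unit's
# alignment with one visual concept as the IoU of the unit's thresholded
# activation map and the concept's binary segmentation mask. The data here
# are synthetic, not activations of a real CNN.
import numpy as np

def unit_concept_iou(activation, concept_mask, quantile=0.995):
    """IoU between a binarized activation map and a binary concept mask."""
    # Keep only the unit's strongest activations (top quantile).
    t = np.quantile(activation, quantile)
    unit_mask = activation > t
    inter = np.logical_and(unit_mask, concept_mask).sum()
    union = np.logical_or(unit_mask, concept_mask).sum()
    return inter / union if union else 0.0

rng = np.random.default_rng(0)
activation = rng.random((64, 64))
concept_mask = np.zeros((64, 64), dtype=bool)
concept_mask[:8, :8] = True      # toy "concept" region in the image
activation[:8, :8] += 1.0        # make the unit fire on that region

iou = unit_concept_iou(activation, concept_mask)
print(f"IoU = {iou:.3f}")
```

In the full method this score is computed for every (unit, concept) pair over a densely annotated dataset, and a unit is called interpretable when its best IoU exceeds a small threshold.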
A Hybrid End-to-End Spatio-Temporal Attention Neural Network with Graph-Smooth Signals for EEG Emotion Recognition
Recently, physiological data such as electroencephalography (EEG) signals
have attracted significant attention in affective computing. In this context,
the main goal is to design an automated model that can assess emotional states.
Lately, deep neural networks have shown promising performance in emotion
recognition tasks. However, designing a deep architecture that can extract
practical information from raw data is still a challenge. Here, we introduce a
deep neural network that acquires interpretable physiological representations
by a hybrid structure of spatio-temporal encoding and recurrent attention
network blocks. Furthermore, a preprocessing step is applied to the raw data
using graph signal processing tools to perform graph smoothing in the spatial
domain. We demonstrate that our proposed architecture exceeds state-of-the-art
results for emotion classification on the publicly available DEAP dataset. To
explore the generality of the learned model, we also evaluate the performance
of our architecture towards transfer learning (TL) by transferring the model
parameters from a specific source to other target domains. Using DEAP as the
source dataset, we demonstrate the effectiveness of our model in performing
cross-modality TL and improving emotion classification accuracy on DREAMER and
the Emotional English Word (EEWD) datasets, which involve EEG-based emotion
classification tasks with different stimuli.
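The graph-smoothing preprocessing step can be illustrated with a standard graph-signal-processing smoother: treat the electrodes as graph nodes, build a Laplacian from their adjacency, and low-pass filter the channel values. A minimal sketch assuming a toy 4-electrode adjacency and Tikhonov (Laplacian-regularized) smoothing, not the DEAP montage or the authors' exact filter:

```python
# Hedged sketch of graph smoothing in the spatial domain: EEG channels are
# graph nodes, edges connect (assumed) neighboring electrodes, and the
# smoothed signal solves argmin_s ||s - x||^2 + gamma * s^T L s, i.e.
# s = (I + gamma * L)^-1 x. Adjacency and signal are toy placeholders.
import numpy as np

# Toy adjacency for 4 EEG channels (1 = neighboring electrodes).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))
L = D - A                                  # combinatorial graph Laplacian

def graph_smooth(x, gamma=1.0):
    """Tikhonov low-pass filter on the electrode graph."""
    n = L.shape[0]
    return np.linalg.solve(np.eye(n) + gamma * L, x)

x = np.array([1.0, -1.0, 1.0, -1.0])       # spiky signal across channels
x_s = graph_smooth(x)

# The Laplacian quadratic form x^T L x measures variation across edges;
# smoothing should reduce it.
print("smoothness before:", x @ L @ x)
print("smoothness after: ", x_s @ L @ x_s)
```

Larger `gamma` smooths more aggressively; in practice the adjacency would be derived from electrode positions or functional connectivity, and the smoothed channels feed the spatio-temporal encoder.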