Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques
specifically developed for analyzing and understanding the inner workings and
representations acquired by neural models of language. Approaches included:
systematic manipulation of input to neural networks and investigating the
impact on their performance, testing whether interpretable knowledge can be
decoded from intermediate representations acquired by neural networks,
proposing modifications to neural network architectures to make their knowledge
state or generated output more explainable, and examining the performance of
networks on simplified or formal languages. Here we review a number of
representative studies in each category.
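One of the approaches listed above, testing whether interpretable knowledge can be decoded from intermediate representations, is commonly realized as a probing (diagnostic) classifier. The following is a minimal sketch of that idea, not any specific workshop paper's method; the feature matrix and labels are synthetic stand-ins for extracted hidden states and linguistic annotations.

```python
# Minimal probing-classifier sketch: train a simple linear model to predict
# a linguistic property (here, a synthetic binary label) from frozen
# hidden-layer activations. High held-out accuracy suggests the property
# is linearly decodable from the representation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for extracted activations: 1,000 tokens x 256 hidden units.
hidden_states = rng.normal(size=(1000, 256))
labels = rng.integers(0, 2, size=1000)  # hypothetical linguistic labels

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, labels, test_size=0.2, random_state=0
)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.3f}")
```

With real hidden states in place of the random matrix, accuracy well above the majority-class baseline is the usual evidence that the network encodes the probed property.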
When and where do feed-forward neural networks learn localist representations?
According to parallel distributed processing (PDP) theory in psychology,
neural networks (NNs) learn distributed rather than interpretable localist
representations. This view has been held so strongly that few researchers have
analysed single units to determine if this assumption is correct. However,
recent results from psychology, neuroscience and computer science have shown
the occasional existence of local codes emerging in artificial and biological
neural networks. In this paper, we undertake the first systematic survey of
when local codes emerge in a feed-forward neural network, using generated input
and output data with known qualities. We find that the number of local codes
that emerge from an NN follows a well-defined distribution across the number of
hidden layer neurons, with a peak determined by the size of input data, number
of examples presented and the sparsity of input data. Using a 1-hot output code
drastically decreases the number of local codes on the hidden layer. The number
of emergent local codes increases with the percentage of dropout applied to the
hidden layer, suggesting that localist encoding may confer resilience in
noisy networks. These results show that localist coding can emerge from
feed-forward PDP networks and point to some of the conditions that may lead to
interpretable localist representations in the cortex. The findings highlight
that local codes should not be dismissed out of hand.
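A common operational test for a local code is whether a single hidden unit's activations perfectly separate one class from all other inputs. The sketch below illustrates such a per-unit selectivity check; the perfect-separation criterion and the synthetic activations are illustrative assumptions, not the paper's exact protocol.

```python
# Per-unit selectivity check: a hidden unit is flagged as a candidate
# local code for class c if its minimum activation on class-c inputs
# exceeds its maximum activation on all other inputs.
import numpy as np

def find_local_codes(activations, labels):
    """activations: (n_samples, n_units); labels: (n_samples,)."""
    local_codes = []
    for unit in range(activations.shape[1]):
        act = activations[:, unit]
        for c in np.unique(labels):
            in_class = act[labels == c]
            out_class = act[labels != c]
            if in_class.min() > out_class.max():
                local_codes.append((unit, int(c)))
    return local_codes

# Hypothetical activations from a trained hidden layer.
rng = np.random.default_rng(1)
acts = rng.random((200, 64))
labs = rng.integers(0, 10, size=200)
acts[labs == 3, 7] += 2.0  # plant one selective unit for illustration
print(find_local_codes(acts, labs))  # expect [(7, 3)]
```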
Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning
Visual question answering requires high-order reasoning about an image, which
is a fundamental capability needed by machine systems to follow complex
directives. Recently, modular networks have been shown to be an effective
framework for performing visual reasoning tasks. While modular networks were
initially designed with a degree of model transparency, their performance on
complex visual reasoning benchmarks was lacking. Current state-of-the-art
approaches do not provide an effective mechanism for understanding the
reasoning process. In this paper, we close the performance gap between
interpretable models and state-of-the-art visual reasoning methods. We propose
a set of visual-reasoning primitives which, when composed, manifest as a model
capable of performing complex reasoning tasks in an explicitly interpretable
manner. The fidelity and interpretability of the primitives' outputs enable an
unparalleled ability to diagnose the strengths and weaknesses of the resulting
model. Critically, we show that these primitives are highly performant,
achieving state-of-the-art accuracy of 99.1% on the CLEVR dataset. We also show
that our model is able to effectively learn generalized representations when
provided a small amount of data containing novel object attributes. Using the
CoGenT generalization task, we show more than a 20 percentage point improvement
over the current state of the art. Comment: CVPR 2018 pre-print.
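The composable-primitives idea can be illustrated as a chain of attention-producing modules over image features, so that every intermediate step of the reasoning process is an inspectable spatial mask. The PyTorch sketch below is a deliberate simplification under that assumption; the module set, layer sizes, and the composition for the example question are hypothetical, not the authors' exact architecture.

```python
# Illustrative sketch of composable visual-reasoning primitives: each
# module consumes image features plus an incoming attention mask and
# emits a new spatial attention mask. Because every intermediate output
# is an attention map, the reasoning chain can be visualized step by step.
import torch
import torch.nn as nn

class AttentionModule(nn.Module):
    """Generic primitive: features + incoming attention -> new attention."""
    def __init__(self, dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(dim + 1, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, 1, 1), nn.Sigmoid(),
        )

    def forward(self, feats, attn):
        return self.conv(torch.cat([feats, attn], dim=1))

# Compose primitives for a question like "what color is the small cube?":
feats = torch.randn(1, 128, 14, 14)   # stand-in CNN features
attn = torch.ones(1, 1, 14, 14)       # start by attending everywhere
find_cube, filter_small = AttentionModule(), AttentionModule()
attn = filter_small(feats, find_cube(feats, attn))  # chain the modules
print(attn.shape)  # torch.Size([1, 1, 14, 14]) -- an inspectable mask
```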
Question-Answering with Grammatically-Interpretable Representations
We introduce an architecture, the Tensor Product Recurrent Network (TPRN). In
our application of TPRN, internal representations learned by end-to-end
optimization in a deep neural network performing a textual question-answering
(QA) task can be interpreted using basic concepts from linguistic theory. No
performance penalty need be paid for this increased interpretability: the
proposed model performs comparably to a state-of-the-art system on the SQuAD QA
task. The internal representation which is interpreted is a Tensor Product
Representation: for each input word, the model selects a symbol to encode the
word, and a role in which to place the symbol, and binds the two together. The
selection is via soft attention. The overall interpretation is built from
interpretations of the symbols, as recruited by the trained model, and
interpretations of the roles as used by the model. We find support for our
initial hypothesis that symbols can be interpreted as lexical-semantic word
meanings, while roles can be interpreted as approximations of grammatical roles
(or categories) such as subject, wh-word, determiner, etc. Fine-grained
analysis reveals specific correspondences between the learned roles and parts
of speech as assigned by a standard tagger (Toutanova et al. 2003), and finds
several discrepancies in the model's favor. In this sense, the model learns
significant aspects of grammar, after having been exposed solely to
linguistically unannotated text, questions, and answers: no prior linguistic
knowledge is given to the model. What is given is the means to build
representations using symbols and roles, with an inductive bias favoring use of
these in an approximately discrete manner.
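The binding mechanism described above can be written down compactly: soft attention selects a filler (symbol) vector and a role vector for each word, the two are bound by an outer product, and the per-word bindings are summed into one tensor product representation. The NumPy sketch below follows that recipe; the dimensions and the random attention logits are illustrative assumptions, not the trained TPRN's parameters.

```python
# Minimal tensor-product-representation sketch: for each word, soft
# attention selects a symbol (filler) and a role; binding is the outer
# product filler (x) role, and the sentence representation is the sum
# of the per-word bindings.
import numpy as np

rng = np.random.default_rng(0)
n_symbols, n_roles, d_f, d_r, n_words = 50, 10, 32, 16, 7

symbols = rng.normal(size=(n_symbols, d_f))  # learned symbol embeddings
roles = rng.normal(size=(n_roles, d_r))      # learned role embeddings

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

sentence_tpr = np.zeros((d_f, d_r))
for _ in range(n_words):
    a_s = softmax(rng.normal(size=n_symbols))  # soft symbol attention
    a_r = softmax(rng.normal(size=n_roles))    # soft role attention
    filler = a_s @ symbols                     # attention-weighted symbol
    role = a_r @ roles                         # attention-weighted role
    sentence_tpr += np.outer(filler, role)     # bind and accumulate

print(sentence_tpr.shape)  # (32, 16)
```

When the attention distributions sharpen toward one-hot vectors, each word contributes a nearly pure symbol-role binding, which is what makes the learned roles comparable to discrete grammatical categories.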