What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models
Despite the remarkable evolution of deep neural networks in natural language
processing (NLP), their interpretability remains a challenge. Previous work
largely focused on what these models learn at the representation level. We
break this analysis down further and study individual dimensions (neurons) in
the vector representation learned by end-to-end neural models in NLP tasks. We
propose two methods: Linguistic Correlation Analysis, based on a supervised
method to extract the most relevant neurons with respect to an extrinsic task,
and Cross-model Correlation Analysis, an unsupervised method to extract salient
neurons w.r.t. the model itself. We evaluate the effectiveness of our
techniques by ablating the identified neurons and reevaluating the network's
performance for two tasks: neural machine translation (NMT) and neural language
modeling (NLM). We further present a comprehensive analysis of neurons with the
aim of addressing the following questions: i) how localized or distributed are
different linguistic properties in the models? ii) are certain neurons
exclusive to some properties and not others? iii) is the information more or
less distributed in NMT vs. NLM? and iv) how important are the neurons
identified through the linguistic correlation method to the overall task? Our
code is publicly available as part of the NeuroX toolkit (Dalvi et al. 2019).
Comment: AAAI 2019, 10 pages, AAAI Conference on Artificial Intelligence (AAAI 2019)
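The abstract's rank-then-ablate evaluation can be illustrated with a small sketch. This is not the authors' implementation or the NeuroX API: it swaps in a ridge-regression probe for the supervised method, uses synthetic activations, and every function name and parameter here (`fit_linear_probe`, `rank_neurons`, `ablate`, `make_toy_data`, `l2`) is hypothetical.

```python
import numpy as np

def fit_linear_probe(acts, labels, l2=1e-2):
    """Ridge-regression probe (an assumed stand-in) mapping activations to a +/-1 label."""
    targets = 2.0 * labels - 1.0
    gram = acts.T @ acts + l2 * np.eye(acts.shape[1])
    return np.linalg.solve(gram, acts.T @ targets)

def rank_neurons(weights):
    """Neuron indices sorted by decreasing absolute probe weight."""
    return np.argsort(-np.abs(weights))

def ablate(acts, neuron_ids):
    """Return a copy of the activations with the chosen neurons set to zero."""
    out = acts.copy()
    out[:, neuron_ids] = 0.0
    return out

def accuracy(acts, labels, weights):
    """Fraction of examples whose probe score has the correct sign."""
    preds = (acts @ weights > 0).astype(int)
    return float((preds == labels).mean())

def make_toy_data(n=2000, dim=50, informative=(3, 17), seed=0):
    """Synthetic activations: only the `informative` neurons depend on the label."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    acts = rng.normal(size=(n, dim))
    for j in informative:
        acts[:, j] += 3.0 * (2 * labels - 1)
    return acts, labels

acts, labels = make_toy_data()
w = fit_linear_probe(acts, labels)
top = rank_neurons(w)[:2]                       # most salient neurons
base = accuracy(acts, labels, w)                # before ablation
ablated = accuracy(ablate(acts, top), labels, w)  # after zeroing them
print(sorted(top.tolist()), base, ablated)
```

If the ranking is meaningful, zeroing the top-ranked neurons should hurt accuracy far more than zeroing the same number of arbitrary ones, which is the logic behind evaluating by ablation.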
Deep Short Text Classification with Knowledge Powered Attention
Short text classification is one of the important tasks in Natural Language
Processing (NLP). Compared with paragraphs or documents, short texts are more
ambiguous because they lack sufficient contextual information, which poses a
great challenge for classification. In this paper, we retrieve knowledge from
an external knowledge source to enhance the semantic representation of short
texts. We take conceptual information as a kind of knowledge and incorporate it
into deep neural networks. For the purpose of measuring the importance of
knowledge, we introduce attention mechanisms and propose deep Short Text
Classification with Knowledge powered Attention (STCKA). We utilize Concept
towards Short Text (C-ST) attention and Concept towards Concept Set (C-CS)
attention to acquire the weight of concepts from two aspects. We then classify a
short text with the help of conceptual information. Unlike traditional
approaches, our model acts like a human being who has an intrinsic ability to make
decisions based on observation (i.e., training data for machines) and pays more
attention to important knowledge. We also conduct extensive experiments on four
public datasets for different tasks. The experimental results and case studies
show that our model outperforms the state-of-the-art methods, justifying the
effectiveness of knowledge powered attention.
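The idea of weighting concepts from two aspects can be sketched as below. The abstract does not give the scoring functions, so this is only an illustration under stated assumptions: dot-product scores stand in for the learned attention networks, the concept set is summarized by its mean vector, and the mixing coefficient `gamma` and the function name `concept_attention` are hypothetical.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def concept_attention(text_vec, concept_vecs, gamma=0.5):
    """Combine two attention distributions over concepts into one weight per concept.

    text_vec:     (d,) representation of the short text
    concept_vecs: (k, d) one row per retrieved concept
    gamma:        assumed mixing coefficient between the two aspects
    """
    # C-ST aspect: how relevant each concept is to the short text
    # (dot-product score is an assumption, not the paper's scoring function).
    cst = softmax(concept_vecs @ text_vec)
    # C-CS aspect: how important each concept is within the concept set
    # (scored against the mean concept vector, also an assumption).
    ccs = softmax(concept_vecs @ concept_vecs.mean(axis=0))
    weights = gamma * cst + (1.0 - gamma) * ccs
    # Weighted sum of concept vectors: the knowledge summary for the classifier.
    return weights, weights @ concept_vecs

text = np.array([1.0, 0.0])
concepts = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
weights, summary = concept_attention(text, concepts)
print(weights, summary)
```

Because each aspect is a probability distribution and `gamma` lies in [0, 1], the combined weights also sum to one, so the summary vector stays on the same scale as the concept vectors.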