DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases
Keyphrase extraction from documents is useful to a variety of applications
such as information retrieval and document summarization. This paper presents
an end-to-end method called DivGraphPointer for extracting a set of diversified
keyphrases from a document. DivGraphPointer combines the advantages of
traditional graph-based ranking methods and recent neural network-based
approaches. Specifically, given a document, a word graph is constructed from
the document based on word proximity and is encoded with graph convolutional
networks, which effectively capture document-level word salience by modeling
long-range dependency between words in the document and aggregating multiple
appearances of identical words into one node. Furthermore, we propose a
diversified point network to generate a set of diverse keyphrases out of the
word graph in the decoding process. Experimental results on five benchmark data
sets show that our proposed method significantly outperforms the existing
state-of-the-art approaches.
Comment: Accepted to SIGIR 201
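The word-graph step described in this abstract can be illustrated with a minimal sketch. It assumes a simple undirected co-occurrence graph with a fixed window; the function name `build_word_graph` and the `window` parameter are hypothetical, and the paper's actual edge weighting, graph convolutional encoder, and pointer decoder are not reproduced here.

```python
from collections import defaultdict


def build_word_graph(tokens, window=2):
    """Return {word: {neighbor: count}} for words co-occurring within `window` positions.

    Every distinct word becomes exactly one node, so repeated appearances of
    the same word are merged (an assumed simplification of the abstract's
    "aggregating multiple appearances of identical words into one node").
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        graph[word]  # touch the key so isolated words still get a node
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other == word:
                continue  # skip self-loops
            graph[word][other] += 1
            graph[other][word] += 1
    return {w: dict(nbrs) for w, nbrs in graph.items()}


# Toy input, not taken from any benchmark data set.
tokens = "graph pointer network for graph based keyphrase extraction".split()
graph = build_word_graph(tokens, window=2)
```

Because "graph" occurs twice in the toy input, both occurrences contribute edges to the same node, which is the aggregation effect the abstract refers to.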
A Practitioners' Guide to Transfer Learning for Text Classification using Convolutional Neural Networks
Transfer Learning (TL) plays a crucial role when a given dataset has
insufficient labeled examples to train an accurate model. In such scenarios,
the knowledge accumulated within a model pre-trained on a source dataset can be
transferred to a target dataset, resulting in the improvement of the target
model. Though TL has proven successful in image-based applications, its
impact and practical use in Natural Language Processing (NLP) applications
are still a subject of research. Due to their hierarchical
architecture, Deep Neural Networks (DNN) provide flexibility and customization
in adjusting their parameters and depth of layers, thereby forming an apt area
for exploiting the use of TL. In this paper, we report the results and
conclusions obtained from extensive empirical experiments using a Convolutional
Neural Network (CNN) and try to uncover rules of thumb that ensure a successful
positive transfer. In addition, we also highlight the flawed means that could
lead to a negative transfer. We explore the transferability of various layers
and describe the effect of varying hyper-parameters on the transfer
performance. Also, we present a comparison of accuracy and model size
against state-of-the-art methods. Finally, we derive inferences from the
empirical results and provide best practices to achieve a successful positive
transfer.
Comment: 9 pages, 2 figures, accepted in SDM 201
A Dependency-Based Neural Network for Relation Classification
Previous research on relation classification has verified the effectiveness
of using shortest dependency paths or subtrees. In this paper, we further
explore how to make full use of the combination of these two kinds of dependency
information. We first propose a new structure, termed augmented dependency path
(ADP), which is composed of the shortest dependency path between two entities
and the subtrees attached to the shortest path. To exploit the semantic
representation behind the ADP structure, we develop dependency-based neural
networks (DepNN): a recursive neural network designed to model the subtrees,
and a convolutional neural network to capture the most important features on
the shortest path. Experiments on the SemEval-2010 dataset show that our
proposed method achieves state-of-the-art results.
Comment: This preprint is the full version of a short paper accepted in the
annual meeting of the Association for Computational Linguistics (ACL) 2015
(Beijing, China)
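The augmented dependency path described in this abstract can be sketched on a toy tree. The head map below is a hand-written, hypothetical parse (not the output of any parser), the function names are illustrative, and only the roots of the attached subtrees are collected; the recursive and convolutional networks of DepNN are not modeled.

```python
def path_to_root(heads, node):
    """Follow head links from `node` up to the root of the dependency tree."""
    path = [node]
    while heads[node] is not None:
        node = heads[node]
        path.append(node)
    return path


def shortest_dependency_path(heads, a, b):
    """Shortest path between two words in a tree, via their lowest common ancestor."""
    up_a = path_to_root(heads, a)
    up_b = path_to_root(heads, b)
    ancestors_a = set(up_a)
    lca = next(n for n in up_b if n in ancestors_a)
    left = up_a[: up_a.index(lca) + 1]   # a ... lca
    right = up_b[: up_b.index(lca)]      # b ... child of lca
    return left + right[::-1]


def attached_subtrees(heads, path):
    """For each path node, list the children that hang off the path (subtree roots)."""
    on_path = set(path)
    attached = {n: [] for n in path}
    for child, head in heads.items():
        if head in on_path and child not in on_path:
            attached[head].append(child)
    return attached


# Hypothetical parse of "The fire was caused by a faulty heater" (child -> head).
heads = {
    "The": "fire", "fire": "caused", "was": "caused", "caused": None,
    "by": "caused", "a": "heater", "faulty": "heater", "heater": "by",
}
path = shortest_dependency_path(heads, "fire", "heater")
adp = attached_subtrees(heads, path)
```

Here `path` links the two entities through their common ancestor "caused", and `adp` records which modifiers would be added back onto each path word.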
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
Because of their superior ability to preserve sequence information over time,
Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with
a more complex computational unit, have obtained strong results on a variety of
sequence modeling tasks. The only underlying LSTM structure that has been
explored so far is a linear chain. However, natural language exhibits syntactic
properties that would naturally combine words into phrases. We introduce the
Tree-LSTM, a generalization of LSTMs to tree-structured network topologies.
Tree-LSTMs outperform all existing systems and strong LSTM baselines on two
tasks: predicting the semantic relatedness of two sentences (SemEval 2014, Task
1) and sentiment classification (Stanford Sentiment Treebank).
Comment: Accepted for publication at ACL 201
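How an LSTM unit generalizes from a chain to a tree can be sketched with scalar states. The sketch assumes the child-sum form, which the abstract itself does not spell out: the input, output, and update gates see the sum of the children's hidden states, and each child gets its own forget gate. All parameter names and values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def tree_lstm_node(x, children, p):
    """One child-sum Tree-LSTM step with scalar input and state.

    x        -- input value at this node
    children -- list of (h, c) pairs returned by the child nodes
    p        -- dict of scalar parameters (assumed names: W_*, U_*, b_*)
    """
    h_sum = sum(h for h, _ in children)
    i = sigmoid(p["W_i"] * x + p["U_i"] * h_sum + p["b_i"])
    o = sigmoid(p["W_o"] * x + p["U_o"] * h_sum + p["b_o"])
    u = math.tanh(p["W_u"] * x + p["U_u"] * h_sum + p["b_u"])
    c = i * u
    for h_k, c_k in children:
        # A separate forget gate per child, driven by that child's hidden state.
        f_k = sigmoid(p["W_f"] * x + p["U_f"] * h_k + p["b_f"])
        c += f_k * c_k
    h = o * math.tanh(c)
    return h, c


def encode(tree, p):
    """Encode a tree given as (x, [subtrees]) bottom-up; returns the root (h, c)."""
    x, subtrees = tree
    return tree_lstm_node(x, [encode(t, p) for t in subtrees], p)


# Hypothetical parameters: every weight 0.5, every bias 0.
params = {f"{kind}_{gate}": 0.5 for kind in "WU" for gate in "ifou"}
params.update({f"b_{gate}": 0.0 for gate in "ifou"})

leaf_h, leaf_c = encode((1.0, []), params)
root_h, root_c = encode((0.2, [(1.0, []), (-1.0, [])]), params)
```

With exactly one child per node, the recursion reduces to an ordinary LSTM step over a chain, which is the sense in which the tree version generalizes the linear one.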