Identifying high-impact sub-structures for convolution kernels in document-level sentiment classification
Convolution kernels support the modeling of complex syntactic information in machine-learning tasks. However, such models are highly sensitive to the type and size of syntactic structure used. It is therefore an important challenge to automatically identify high-impact sub-structures relevant to a given task. In this paper we present a systematic study investigating (combinations of) sequence and convolution kernels using different types of sub-structures in document-level sentiment classification. We show that minimal sub-structures extracted from constituency and dependency trees, guided by a polarity lexicon, yield a 1.45-point absolute improvement in accuracy over a bag-of-words classifier on a widely used sentiment corpus.
Dependency-based Convolutional Neural Networks for Sentence Embedding
In sentence modeling and classification, convolutional neural network
approaches have recently achieved state-of-the-art results, but all such
efforts process word vectors sequentially and neglect long-distance
dependencies. To exploit both deep learning and linguistic structures, we
propose a tree-based convolutional neural network model which exploits various
long-distance relationships between words. Our model improves the sequential
baselines on all three sentiment and question classification tasks, and
achieves the highest published accuracy on TREC.
Comment: this paper has been accepted by ACL 201
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
Because of their superior ability to preserve sequence information over time,
Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with
a more complex computational unit, have obtained strong results on a variety of
sequence modeling tasks. The only underlying LSTM structure that has been
explored so far is a linear chain. However, natural language exhibits syntactic
properties that would naturally combine words to phrases. We introduce the
Tree-LSTM, a generalization of LSTMs to tree-structured network topologies.
Tree-LSTMs outperform all existing systems and strong LSTM baselines on two
tasks: predicting the semantic relatedness of two sentences (SemEval 2014, Task
1) and sentiment classification (Stanford Sentiment Treebank).
Comment: Accepted for publication at ACL 201
When Are Tree Structures Necessary for Deep Learning of Representations?
Recursive neural models, which use syntactic parse trees to recursively
generate representations bottom-up, are a popular architecture. But there have
not been rigorous evaluations showing for exactly which tasks this syntax-based
method is appropriate. In this paper we benchmark recursive neural models
against sequential recurrent neural models (simple recurrent and LSTM
models), enforcing apples-to-apples comparison as much as possible. We
investigate 4 tasks: (1) sentiment classification at the sentence level and
phrase level; (2) matching questions to answer-phrases; (3) discourse parsing;
(4) semantic relation extraction (e.g., component-whole between nouns).
Our goal is to understand better when, and why, recursive models can
outperform simpler models. We find that recursive models help mainly on tasks
(like semantic relation extraction) that require associating headwords across a
long distance, particularly on very long sequences. We then introduce a method
for allowing recurrent models to achieve similar performance: breaking long
sentences into clause-like units at punctuation and processing them separately
before combining. Our results thus help understand the limitations of both
classes of models, and suggest directions for improving recurrent models.
Learning to Identify Ambiguous and Misleading News Headlines
Accuracy is one of the basic principles of journalism. However, it is
increasingly hard to manage due to the diversity of news media. Some editors of
online news tend to use catchy headlines which trick readers into clicking.
These headlines are either ambiguous or misleading, degrading the reading
experience of the audience. Thus, identifying inaccurate news headlines is a
task worth studying. Previous work names these headlines "clickbaits" and
mainly focuses on the features extracted from the headlines, which limits the
performance since the consistency between headlines and news bodies is
underappreciated. In this paper, we clearly redefine the problem and identify
ambiguous and misleading headlines separately. We utilize class sequential
rules to exploit structure information when detecting ambiguous headlines. For
the identification of misleading headlines, we extract features based on the
congruence between headlines and bodies. To make use of the large unlabeled
data set, we apply a co-training method and gain an increase in performance.
The experimental results show the effectiveness of our methods. Then we use our
classifiers to detect inaccurate headlines crawled from different sources and
conduct a data analysis.
Comment: Accepted by IJCAI 201
Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques
specifically developed for analyzing and understanding the inner workings and
representations acquired by neural models of language. Approaches included:
systematic manipulation of input to neural networks and investigating the
impact on their performance, testing whether interpretable knowledge can be
decoded from intermediate representations acquired by neural networks,
proposing modifications to neural network architectures to make their knowledge
state or generated output more explainable, and examining the performance of
networks on simplified or formal languages. Here we review a number of
representative studies in each category.