Structural Attention Neural Networks for improved sentiment analysis
We introduce a tree-structured attention neural network for sentences and
small phrases and apply it to the problem of sentiment classification. Our
model expands the current recursive models by incorporating structural
information around a node of a syntactic tree using both bottom-up and top-down
information propagation. Also, the model utilizes structural attention to
identify the most salient representations during the construction of the
syntactic tree. To our knowledge, the proposed models achieve state-of-the-art
performance on the Stanford Sentiment Treebank dataset.
Comment: Submitted to EACL2017 for review
Learning text representation using recurrent convolutional neural network with highway layers
Recently, the rapid development of word embedding and neural networks has
brought new inspiration to various NLP and IR tasks. In this paper, we describe
a staged hybrid model combining Recurrent Convolutional Neural Networks (RCNN)
with highway layers. The highway network module, incorporated in the middle
stage, takes the output of the bi-directional Recurrent Neural Network (Bi-RNN)
module in the first stage and provides the input to the Convolutional Neural
Network (CNN) module in the last stage. The experiment shows that our model
outperforms common neural network models (CNN, RNN, Bi-RNN) on a sentiment
analysis task. In addition, an analysis of how sequence length influences the
RCNN with highway layers shows that our model can learn good representations
for long text.
Comment: Neu-IR '16 SIGIR Workshop on Neural Information Retrieval
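As an illustration of the highway module mentioned in this abstract, here is a minimal sketch of a single highway layer. The abstract does not give the equations, so this assumes the standard highway formulation y = T(x) * H(x) + (1 - T(x)) * x, with a tanh transform and a sigmoid gate. The function and parameter names (highway_layer, W_h, b_h, W_t, b_t) are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Logistic function, used here for the transform gate."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, x):
    """Compute W @ x + b for a list-of-rows matrix W and vectors b, x."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]


def highway_layer(x, W_h, b_h, W_t, b_t):
    """One highway layer (assumed standard form, not the paper's exact variant).

    h is the candidate transform H(x); t is the gate T(x) in (0, 1).
    The output mixes the transform and the untouched input element-wise,
    so input and output must have the same dimension.
    """
    h = [math.tanh(v) for v in affine(W_h, b_h, x)]
    t = [sigmoid(v) for v in affine(W_t, b_t, x)]
    return [t_i * h_i + (1.0 - t_i) * x_i for t_i, h_i, x_i in zip(t, h, x)]


# Usage: a strongly negative gate bias closes the gate,
# so the layer passes its input through almost unchanged.
identity = [[1.0, 0.0], [0.0, 1.0]]
print(highway_layer([0.3, -0.7], identity, [0.0, 0.0], identity, [-50.0, -50.0]))
```

In this sketch the gate decides, per dimension, how much of the transformed signal replaces the input, which is why the layer can sit between the Bi-RNN output and the CNN input without changing the feature dimension.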
Simple Recurrent Units for Highly Parallelizable Recurrence
Common recurrent neural architectures scale poorly due to the intrinsic
difficulty in parallelizing their state computations. In this work, we propose
the Simple Recurrent Unit (SRU), a light recurrent unit that balances model
capacity and scalability. SRU is designed to provide expressive recurrence,
enable highly parallelized implementation, and come with careful
initialization to facilitate training of deep models. We demonstrate the
effectiveness of SRU on multiple NLP tasks. SRU achieves a 5-9x speed-up over
cuDNN-optimized LSTM on classification and question answering datasets, and
delivers stronger results than LSTM and convolutional models. We also obtain an
average of 0.7 BLEU improvement over the Transformer model on translation by
incorporating SRU into the architecture.
Comment: EMNLP
Attentional Encoder Network for Targeted Sentiment Classification
Targeted sentiment classification aims at determining the sentimental
tendency towards specific targets. Most of the previous approaches model
context and target words with RNN and attention. However, RNNs are difficult to
parallelize, and truncated backpropagation through time makes it difficult to
remember long-term patterns. To address this issue, this paper proposes an
Attentional Encoder Network (AEN) which eschews recurrence and employs
attention-based encoders to model the interaction between context and target.
We raise
the label unreliability issue and introduce label smoothing regularization. We
also apply pre-trained BERT to this task and obtain new state-of-the-art
results. Experiments and analysis demonstrate the effectiveness and lightweight
nature of our model.
Comment: 7 pages
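To make the label smoothing regularization mentioned in this abstract concrete, the following is a minimal sketch of the common uniform-prior formulation, in which the one-hot target is mixed with a uniform distribution over the classes. The abstract does not state which variant the authors use, so the formula, the epsilon value, and the function names (smooth_labels, smoothed_cross_entropy) are assumptions for illustration only.

```python
import math


def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Mix a one-hot target with a uniform prior (assumed formulation).

    Each class receives epsilon / num_classes, and the true class
    additionally keeps (1 - epsilon) of the probability mass.
    """
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) * (1.0 if k == true_class else 0.0) + uniform
        for k in range(num_classes)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smoothed_cross_entropy(logits, true_class, epsilon=0.1):
    """Cross-entropy between the smoothed target and the softmax output."""
    probs = softmax(logits)
    targets = smooth_labels(true_class, len(logits), epsilon)
    return -sum(t * math.log(p) for t, p in zip(targets, probs))


# Usage: three sentiment classes (e.g. negative, neutral, positive),
# with the true label at index 2.
print(smooth_labels(2, 3, epsilon=0.3))
print(smoothed_cross_entropy([0.2, 0.1, 1.5], 2, epsilon=0.3))
```

Setting epsilon to 0 recovers the ordinary cross-entropy loss, so the smoothing strength can be tuned without changing the rest of a training loop.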