Learning text representation using recurrent convolutional neural network with highway layers
Recently, the rapid development of word embedding and neural networks has
brought new inspiration to various NLP and IR tasks. In this paper, we describe
a staged hybrid model combining Recurrent Convolutional Neural Networks (RCNN)
with highway layers. The highway network module, incorporated in the middle,
takes the output of the bi-directional Recurrent Neural Network (Bi-RNN) module
in the first stage and provides the Convolutional Neural Network (CNN) module
in the last stage with the input. The experiment shows that our model
outperforms common neural network models (CNN, RNN, Bi-RNN) on a sentiment
analysis task. In addition, an analysis of how sequence length influences the RCNN
with highway layers shows that our model can learn good representations of
long text.
Comment: Neu-IR '16 SIGIR Workshop on Neural Information Retrieval
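The highway module mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard highway formulation, y = T(x) * H(x) + (1 - T(x)) * x, with a tanh transform and a sigmoid gate; the abstract does not state the exact variant the authors use, and the function and parameter names below (`highway_layer`, `W_h`, `b_h`, `W_t`, `b_t`) are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function used for the transform gate."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, x):
    """Return W @ x + b, where W is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def highway_layer(x, W_h, b_h, W_t, b_t):
    """One highway layer (assumed standard form, not necessarily the paper's).

    The gate t decides, per dimension, how much of the transformed
    signal h is used and how much of the input x is carried through.
    Input and output must have the same dimensionality.
    """
    h = [math.tanh(v) for v in affine(W_h, b_h, x)]  # candidate transform H(x)
    t = [sigmoid(v) for v in affine(W_t, b_t, x)]    # transform gate T(x)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]
```

With a strongly negative gate bias the layer passes its input through almost unchanged, and with a strongly positive one it behaves like an ordinary tanh layer. In the staged model described above, such a layer would sit between the Bi-RNN output and the CNN input.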
Character-Aware Neural Language Models
We describe a simple neural language model that relies only on
character-level inputs. Predictions are still made at the word-level. Our model
employs a convolutional neural network (CNN) and a highway network over
characters, whose output is given to a long short-term memory (LSTM) recurrent
neural network language model (RNN-LM). On the English Penn Treebank the model
is on par with the existing state-of-the-art despite having 60% fewer
parameters. On languages with rich morphology (Arabic, Czech, French, German,
Spanish, Russian), the model outperforms word-level/morpheme-level LSTM
baselines, again with fewer parameters. The results suggest that on many
languages, character inputs are sufficient for language modeling. Analysis of
word representations obtained from the character composition part of the model
reveals that the model is able to encode, from characters only, both semantic
and orthographic information.
Comment: AAAI 201
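As a rough illustration of the character-level CNN step in this abstract, the sketch below computes one feature of a word representation: a single filter slides over consecutive character vectors, a tanh is applied, and the maximum over positions is kept. The helper name `char_cnn_feature`, the toy vectors, and the single-filter setup are assumptions for illustration only; a real model would use many filters of several widths and feed the result through the highway network into the LSTM.

```python
import math


def char_cnn_feature(char_vectors, kernel, bias=0.0):
    """Apply one convolutional filter over a word's characters.

    char_vectors: list of character embeddings (one list of floats per character).
    kernel: list of `width` rows, each the same length as a character embedding.
    Returns the max over time of tanh(<window, kernel> + bias).
    Assumes the word has at least `width` characters (no padding is applied).
    """
    width = len(kernel)
    scores = []
    for start in range(len(char_vectors) - width + 1):
        window = char_vectors[start:start + width]
        s = bias + sum(
            k * c
            for k_row, c_row in zip(kernel, window)
            for k, c in zip(k_row, c_row)
        )
        scores.append(math.tanh(s))
    return max(scores)  # max-over-time pooling


# Toy usage: three characters with 2-dimensional embeddings, filter width 2.
word = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
filt = [[1.0, 0.0], [0.0, 1.0]]
feature = char_cnn_feature(word, filt)
```

Because the pooled value does not depend on where in the word the best-matching character n-gram occurs, a filter of this kind can act as a detector for a particular prefix, suffix, or stem fragment.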
Character-level Transformer-based Neural Machine Translation
Neural machine translation (NMT) is nowadays commonly applied at the subword
level, using byte-pair encoding. A promising alternative approach focuses on
character-level translation, which simplifies processing pipelines in NMT
considerably. This approach, however, must consider relatively longer
sequences, rendering the training process prohibitively expensive. In this
paper, we discuss a novel Transformer-based approach that we compare, both in
speed and in quality, to the Transformer at subword and character levels, as
well as previously developed character-level models. We evaluate our models on
4 language pairs from WMT'15: DE-EN, CS-EN, FI-EN and RU-EN. The proposed novel
architecture can be trained on a single GPU and is 34% faster than the
character-level Transformer; still, the obtained results are at least on par
with it. In addition, our proposed model outperforms the subword-level model in
FI-EN and shows close results in CS-EN. To stimulate further research in this
area and close the gap with subword-level NMT, we make all our code and models
publicly available.
Character-level Intra Attention Network for Natural Language Inference
Natural language inference (NLI) is a central problem in language
understanding. End-to-end artificial neural networks have recently reached
state-of-the-art performance in the NLI field.
In this paper, we propose Character-level Intra Attention Network (CIAN) for
the NLI task. In our model, we use the character-level convolutional network to
replace the standard word embedding layer, and we use the intra attention to
capture the intra-sentence semantics. The proposed CIAN model provides improved
results on the newly published MNLI corpus.
Comment: EMNLP Workshop RepEval 2017: The Second Workshop on Evaluating Vector
Space Representations for NLP