Recurrent Memory Networks for Language Modeling
Recurrent Neural Networks (RNNs) have obtained excellent results in many
natural language processing (NLP) tasks. However, understanding and
interpreting the source of this success remains a challenge. In this paper, we
propose the Recurrent Memory Network (RMN), a novel RNN architecture that not
only amplifies the power of RNNs but also facilitates our understanding of
their internal functioning and allows us to discover underlying patterns in
the data. We demonstrate the power of the RMN on language modeling and
sentence completion tasks. On language modeling, the RMN outperforms Long
Short-Term Memory (LSTM) networks on three large German, Italian, and English
datasets. Additionally, we perform an in-depth analysis of the various
linguistic dimensions that the RMN captures. On the Sentence Completion
Challenge, where capturing sentence coherence is essential, our RMN obtains
69.2% accuracy, surpassing the previous state-of-the-art by a large margin.
Comment: 8 pages, 6 figures. Accepted at NAACL 2016
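As a rough illustration of the memory-block idea described above, the sketch below attends over the embeddings of the most recent words, using the current hidden state as the query. The function name, the dimensions, and the two projection matrices M and C are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_block(h_t, recent_embs, M, C):
    """Attend over the embeddings of the n most recent words, with
    the current hidden state h_t as the query (illustrative only).

    h_t:         (d,)   hidden state from the underlying RNN/LSTM
    recent_embs: (n, d) embeddings of the n most recent words
    M, C:        (d, d) assumed input/output memory projections
    """
    keys = recent_embs @ M              # input-memory representation
    values = recent_embs @ C            # output-memory representation
    attn = softmax(keys @ h_t)          # relevance of each recent word
    return values.T @ attn              # attention-weighted summary

# toy usage
rng = np.random.default_rng(0)
d, n = 8, 5
out = memory_block(rng.standard_normal(d),
                   rng.standard_normal((n, d)),
                   rng.standard_normal((d, d)) * 0.1,
                   rng.standard_normal((d, d)) * 0.1)
print(out.shape)  # (8,)
```

Because the attention weights are explicit, inspecting them per time step is what makes the kind of linguistic analysis mentioned in the abstract possible.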
Self-Adaptive Hierarchical Sentence Model
The ability to accurately model a sentence at varying stages (e.g., word,
phrase, and sentence) plays a central role in natural language processing. As
an effort towards this goal, we propose a self-adaptive hierarchical sentence
model (AdaSent). AdaSent effectively forms a hierarchy of representations from
words to phrases and then to sentences through recursive gated local
composition of adjacent segments. We design a competitive mechanism (through
gating networks) to allow the representations of the same sentence to be
engaged in a particular learning task (e.g., classification), thereby
effectively mitigating the vanishing-gradient problem that persists in other
recursive models. Both qualitative and quantitative analyses show that AdaSent
can automatically form and select the representations suitable for the task at
hand during training, yielding superior classification performance over
competitor models on five benchmark datasets.
Comment: 8 pages, 7 figures. Accepted as a full paper at IJCAI 2015
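To make the recursive gated local composition concrete, here is a minimal sketch in which each new node is a gated convex combination of its two children and a freshly composed candidate. The gating form, weight shapes, and function names are assumptions for illustration, not AdaSent's exact equations.

```python
import numpy as np

def gated_compose(left, right, W, Wg):
    """Form a new node from two adjacent segment vectors: a 3-way
    softmax gate mixes the left child, the right child, and a
    freshly composed candidate (gating form is an assumption)."""
    pair = np.concatenate([left, right])
    candidate = np.tanh(W @ pair)          # composed representation
    g = np.exp(Wg @ pair)
    g = g / g.sum()                        # 3-way softmax gate
    return g[0] * left + g[1] * right + g[2] * candidate

def build_pyramid(words, W, Wg):
    """Collapse the sequence one level at a time, yielding a
    word -> phrase -> sentence hierarchy of representations."""
    levels = [words]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([gated_compose(prev[i], prev[i + 1], W, Wg)
                       for i in range(len(prev) - 1)])
    return levels

# toy usage: 4 word vectors yield a 4-level pyramid
rng = np.random.default_rng(0)
d = 6
words = [rng.standard_normal(d) for _ in range(4)]
W = rng.standard_normal((d, 2 * d)) * 0.1
Wg = rng.standard_normal((3, 2 * d)) * 0.1
print(len(build_pyramid(words, W, Wg)))  # 4
```

The gates let each node copy a child almost unchanged, which is one way a short gradient path from the sentence level down to individual words can be preserved.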
Gated Recurrent Neural Tensor Network
Recurrent Neural Networks (RNNs) are a powerful scheme for modeling temporal
and sequential data, but they need to capture long-term dependencies in a
dataset and represent them in the hidden layers with a model expressive enough
to extract more information from the inputs. For modeling long-term
dependencies, the gating mechanism helps RNNs remember and forget previous
information. Representing the hidden layers of an RNN with more expressive
operations (i.e., tensor products) helps it learn a more complex relationship
between the current input and the previous hidden-layer information. These
ideas can generally improve RNN performance. In this paper, we propose a novel
RNN architecture that combines the gating mechanism and the tensor product in
a single model. By combining these two concepts, our proposed models learn
long-term dependencies through gating units and obtain a more expressive and
direct interaction between the input and hidden layers through a tensor
product on 3-dimensional array (tensor) weight parameters. We take the Long
Short-Term Memory (LSTM) RNN and the Gated Recurrent Unit (GRU) RNN and
incorporate a tensor product into their formulations. The resulting RNNs,
called the Long Short-Term Memory Recurrent Neural Tensor Network (LSTMRNTN)
and the Gated Recurrent Unit Recurrent Neural Tensor Network (GRURNTN), thus
combine the LSTM and GRU models with the tensor product. We conducted
experiments with our proposed models on word-level and character-level
language modeling tasks, and the results show that they significantly improve
performance over our baseline models.
Comment: Accepted at IJCNN 2016. URL:
http://ieeexplore.ieee.org/document/7727233
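The following sketch shows one way a bilinear tensor term between the input and the hidden state can be folded into a GRU-style candidate activation. The exact placement of the tensor term and all parameter names are assumptions for illustration, not the paper's precise equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bilinear(x, h, T):
    """Tensor-product interaction: out[k] = x @ T[k] @ h, where T is
    a 3-dimensional weight tensor of shape (d_h, d_x, d_h)."""
    return np.einsum('i,kij,j->k', x, T, h)

def gru_rntn_step(x, h_prev, p):
    """One GRU step whose candidate activation adds a bilinear
    tensor term between the input and the (reset) hidden state."""
    z = sigmoid(p['Wz'] @ x + p['Uz'] @ h_prev)          # update gate
    r = sigmoid(p['Wr'] @ x + p['Ur'] @ h_prev)          # reset gate
    c = np.tanh(p['Wc'] @ x + p['Uc'] @ (r * h_prev)
                + bilinear(x, r * h_prev, p['T']))       # tensor term
    return (1.0 - z) * h_prev + z * c

# toy usage
rng = np.random.default_rng(0)
dx, dh = 4, 5
p = {k: rng.standard_normal(s) * 0.1 for k, s in
     [('Wz', (dh, dx)), ('Uz', (dh, dh)),
      ('Wr', (dh, dx)), ('Ur', (dh, dh)),
      ('Wc', (dh, dx)), ('Uc', (dh, dh)),
      ('T', (dh, dx, dh))]}
h = gru_rntn_step(rng.standard_normal(dx), np.zeros(dh), p)
print(h.shape)  # (5,)
```

The tensor term gives every pair of input and hidden units its own multiplicative weight, which is the "more expressive and direct interaction" the abstract refers to, at the cost of O(d_h * d_x * d_h) extra parameters.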