3,339 research outputs found
Neural Networks Compression for Language Modeling
In this paper, we consider several compression techniques for the language
modeling problem based on recurrent neural networks (RNNs). It is known that
conventional RNNs, e.g, LSTM-based networks in language modeling, are
characterized with either high space complexity or substantial inference time.
This problem is especially crucial for mobile applications, in which the
constant interaction with the remote server is inappropriate. By using the Penn
Treebank (PTB) dataset we compare pruning, quantization, low-rank
factorization, tensor train decomposition for LSTM networks in terms of model
size and suitability for fast inference.Comment: Keywords: LSTM, RNN, language modeling, low-rank factorization,
pruning, quantization. Published by Springer in the LNCS series, 7th
International Conference on Pattern Recognition and Machine Intelligence,
201
Attentive Tensor Product Learning
This paper proposes a new architecture - Attentive Tensor Product Learning
(ATPL) - to represent grammatical structures in deep learning models. ATPL is a
new architecture to bridge this gap by exploiting Tensor Product
Representations (TPR), a structured neural-symbolic model developed in
cognitive science, aiming to integrate deep learning with explicit language
structures and rules. The key ideas of ATPL are: 1) unsupervised learning of
role-unbinding vectors of words via TPR-based deep neural network; 2) employing
attention modules to compute TPR; and 3) integration of TPR with typical deep
learning architectures including Long Short-Term Memory (LSTM) and Feedforward
Neural Network (FFNN). The novelty of our approach lies in its ability to
extract the grammatical structure of a sentence by using role-unbinding
vectors, which are obtained in an unsupervised manner. This ATPL approach is
applied to 1) image captioning, 2) part of speech (POS) tagging, and 3)
constituency parsing of a sentence. Experimental results demonstrate the
effectiveness of the proposed approach
- …