Compressing Recurrent Neural Networks with Tensor Ring for Action Recognition
Recurrent Neural Networks (RNNs) and their variants, such as Long Short-Term
Memory (LSTM) and Gated Recurrent Unit (GRU) networks, have achieved promising
performance in sequential data modeling. The hidden layers in RNNs can be
regarded as memory units that store information from the sequential context.
However, when dealing with high-dimensional input data such as video and text,
the input-to-hidden linear transformation in RNNs incurs high memory usage and
computational cost, which makes training difficult to scale. To address this
challenge, we propose a compact LSTM model, named TR-LSTM, that uses the
low-rank tensor ring decomposition (TRD) to reformulate the input-to-hidden
transformation. Compared with other tensor decomposition methods, TR-LSTM is
more stable. In addition, TR-LSTM can be trained end-to-end and provides a
fundamental building block for RNNs that handle large inputs. Experiments on
real-world action recognition datasets demonstrate the promising performance
of the proposed TR-LSTM compared with the tensor train LSTM and other
state-of-the-art competitors.
Comment: 9 pages
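To make the construction concrete, here is a minimal NumPy sketch of the tensor
ring format (not the authors' implementation; the mode sizes and TR-ranks below
are illustrative assumptions). The weight matrix is represented by a ring of
small third-order cores, and the dense matrix can be recovered by chaining the
cores and taking a trace:

import numpy as np

# Illustrative sizes: a 256 x 256 input-to-hidden block factored into 4 cores.
in_modes  = [4, 4, 4, 4]   # 4*4*4*4 = 256 input features
out_modes = [4, 4, 4, 4]   # 4*4*4*4 = 256 hidden units
ranks     = [4, 4, 4, 4]   # TR-ranks; the ring closes back on ranks[0]

# One third-order core per mode pair, shape (r_k, I_k * O_k, r_{k+1}).
cores = [np.random.randn(ranks[k], in_modes[k] * out_modes[k],
                         ranks[(k + 1) % 4]) * 0.1
         for k in range(4)]

def tr_to_matrix(cores, in_modes, out_modes):
    # Chain the cores along the ring, merging the free (I_k * O_k) indices.
    merged = cores[0]
    for core in cores[1:]:
        merged = np.einsum('aib,bjc->aijc', merged, core)
        merged = merged.reshape(merged.shape[0], -1, merged.shape[-1])
    full = np.einsum('aia->i', merged)   # the trace closes the ring
    # Split each merged index back into (I_k, O_k) and regroup into a matrix.
    full = full.reshape([m for pair in zip(in_modes, out_modes) for m in pair])
    d = len(in_modes)
    full = full.transpose(list(range(0, 2 * d, 2)) + list(range(1, 2 * d, 2)))
    return full.reshape(int(np.prod(in_modes)), int(np.prod(out_modes)))

W = tr_to_matrix(cores, in_modes, out_modes)   # (256, 256) dense equivalent
dense_params = W.size                          # 65,536
tr_params = sum(c.size for c in cores)         # 1,024 -> 64x fewer

In TR-LSTM the cores themselves are the trainable parameters, so the dense
matrix reconstructed above would never actually be formed during training or
inference.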
Neural Networks Compression for Language Modeling
In this paper, we consider several compression techniques for the language
modeling problem based on recurrent neural networks (RNNs). It is known that
conventional RNNs, e.g., LSTM-based networks in language modeling, are
characterized by either high space complexity or substantial inference time.
This problem is especially crucial for mobile applications, in which constant
interaction with a remote server is undesirable. Using the Penn Treebank (PTB)
dataset, we compare pruning, quantization, low-rank factorization, and tensor
train decomposition for LSTM networks in terms of model size and suitability
for fast inference.
Comment: Keywords: LSTM, RNN, language modeling, low-rank factorization,
pruning, quantization. Published by Springer in the LNCS series, 7th
International Conference on Pattern Recognition and Machine Intelligence,
2017
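For illustration, three of the four techniques being compared fit in a few
lines of generic NumPy. This is a sketch of magnitude pruning, uniform
quantization, and truncated-SVD low-rank factorization applied to a
hypothetical 650-unit LSTM weight block, not the paper's exact experimental
procedure:

import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((650, 2600))  # hypothetical LSTM input-to-hidden block

def magnitude_prune(w, sparsity=0.9):
    # Zero out the smallest-magnitude weights, keeping the top (1 - sparsity).
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= thresh, w, 0.0)

def uniform_quantize(w, bits=8):
    # Map weights linearly onto 2**bits levels, then back (simulated storage).
    lo, hi = w.min(), w.max()
    levels = 2 ** bits - 1
    codes = np.round((w - lo) / (hi - lo) * levels)   # the integers stored
    return codes * (hi - lo) / levels + lo

def low_rank(w, rank=64):
    # Truncated SVD: store two thin factors instead of the full matrix.
    U, s, Vt = np.linalg.svd(w, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

W_pruned = magnitude_prune(W)   # ~10x fewer nonzeros to store
W_quant  = uniform_quantize(W)  # 4x smaller than float32 at 8 bits
W_lr     = low_rank(W)          # (650 + 2600) * 64 vs 650 * 2600 entries

The fourth technique, tensor train decomposition, replaces the matrix with a
chain of small cores; see the sketch under "Tensorizing Neural Networks" below.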
Tensorizing Neural Networks
Deep neural networks currently demonstrate state-of-the-art performance in
several domains. At the same time, models of this class are very demanding in
terms of computational resources. In particular, a large amount of memory is
required by the commonly used fully-connected layers, which makes it hard to
run the models on low-end devices and hinders further growth in model size. In
this paper we convert the dense weight matrices of the fully-connected layers
to the Tensor Train format, so that the number of parameters is reduced by a
huge factor while the expressive power of the layer is preserved. In
particular, for the Very Deep VGG networks we report compression of the dense
weight matrix of a fully-connected layer by a factor of up to 200,000, which
compresses the whole network by a factor of up to 7.
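As a rough sketch of the Tensor Train idea (illustrative shapes and ranks, not
the paper's code): the dense matrix is reshaped into a tensor with one index
pair per mode, each pair gets a small fourth-order core, and the forward pass
contracts the input against the cores one mode at a time, so the full matrix
is never materialized:

import numpy as np

# Hypothetical TT factorization of a 2048 x 256 fully-connected weight matrix.
in_modes  = [4, 8, 8, 8]     # 4*8*8*8 = 2048 inputs
out_modes = [4, 4, 4, 4]     # 4*4*4*4 = 256 outputs
ranks     = [1, 8, 8, 8, 1]  # TT-ranks; boundary ranks are fixed to 1

cores = [np.random.randn(ranks[k], in_modes[k], out_modes[k],
                         ranks[k + 1]) * 0.1
         for k in range(4)]

def tt_matvec(cores, x):
    # Apply the TT-format matrix to x without ever building the dense matrix.
    t = x.reshape(1, 1, -1)   # (accumulated output, r_1, remaining input)
    for G in cores:
        r1, I, O, r2 = G.shape
        t = t.reshape(t.shape[0], r1, I, -1)    # split off this input mode
        t = np.einsum('arim,riob->aobm', t, G)  # contract rank and input mode
        t = t.reshape(t.shape[0] * O, r2, -1)   # fold output mode into front
    return t.reshape(-1)

x = np.random.randn(2048)
y = tt_matvec(cores, x)                  # shape (256,)

dense_params = 2048 * 256                # 524,288
tt_params = sum(G.size for G in cores)   # 4,480 -> ~117x fewer

The same core-by-core contraction keeps the forward cost low, which is why the
reported compression of a single VGG layer does not require ever reconstructing
that layer's dense weight matrix.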