Neural Networks Compression for Language Modeling
In this paper, we consider several compression techniques for the language
modeling problem based on recurrent neural networks (RNNs). It is known that
conventional RNNs, e.g., LSTM-based networks for language modeling, are characterized by high space complexity or substantial inference time.
This problem is especially acute for mobile applications, in which constant interaction with a remote server is impractical. Using the Penn Treebank (PTB) dataset, we compare pruning, quantization, low-rank factorization, and tensor train decomposition for LSTM networks in terms of model size and suitability for fast inference.
Comment: Keywords: LSTM, RNN, language modeling, low-rank factorization,
pruning, quantization. Published by Springer in the LNCS series, 7th
International Conference on Pattern Recognition and Machine Intelligence,
201
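As an illustration of one of the techniques compared above, the following is a minimal sketch of low-rank factorization via truncated SVD applied to a single LSTM-sized weight matrix. The matrix shape and rank are hypothetical, chosen only to show the parameter saving; they are not the paper's configuration.

```python
import numpy as np

# Hypothetical LSTM gate weight matrix (hidden size 650, 4 gates stacked);
# the dimensions are illustrative only.
rng = np.random.default_rng(0)
W = rng.standard_normal((4 * 650, 650))

# Truncated SVD: W ~ U_r diag(s_r) V_r^T with rank r << min(m, n).
r = 64
U, s, Vt = np.linalg.svd(W, full_matrices=False)
W_lowrank = (U[:, :r] * s[:r]) @ Vt[:r, :]

orig_params = W.size
lowrank_params = U[:, :r].size + r + Vt[:r, :].size
print(f"original params: {orig_params}, low-rank params: {lowrank_params}, "
      f"compression: {orig_params / lowrank_params:.1f}x, "
      f"relative error: {np.linalg.norm(W - W_lowrank) / np.linalg.norm(W):.3f}")
```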
Compression of Recurrent Neural Networks for Efficient Language Modeling
Recurrent neural networks have proved to be an effective method for
statistical language modeling. However, in practice their memory and run-time
complexity are usually too large to be implemented in real-time offline mobile
applications. In this paper we consider several compression techniques for
recurrent neural networks, including Long Short-Term Memory models. We pay particular attention to the high-dimensional output problem caused by the very large vocabulary size. We focus on compression methods that are effective in the context of on-device deployment: pruning, quantization, and matrix
decomposition approaches (low-rank factorization and tensor train
decomposition, in particular). For each model we investigate the trade-off
between its size, suitability for fast inference and perplexity. We propose a
general pipeline for applying the most suitable methods to compress recurrent
neural networks for language modeling. Our experimental study on the Penn Treebank (PTB) dataset shows that the best results in terms of speed and the compression-perplexity balance are obtained with matrix decomposition techniques.
Comment: 25 pages, 3 tables, 4 figures
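To make the pruning and quantization stages of such a pipeline concrete, here is a minimal sketch of magnitude pruning followed by symmetric 8-bit quantization. The sparsity level, bit width, and matrix shape are assumptions for illustration and do not reproduce the paper's pipeline.

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Zero out the smallest-magnitude weights (one pruning variant among several)."""
    k = int(sparsity * w.size)
    thresh = np.partition(np.abs(w).ravel(), k)[k]
    return np.where(np.abs(w) < thresh, 0.0, w)

def uniform_quantize(w, bits=8):
    """Symmetric uniform quantization to `bits`-bit integers with a single scale."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    q = np.round(w / scale).astype(np.int8 if bits == 8 else np.int32)
    return q, scale  # store ints plus the scale; dequantize as q * scale at inference

rng = np.random.default_rng(0)
w = rng.standard_normal((2600, 650)).astype(np.float32)   # illustrative LSTM-sized matrix
pruned = magnitude_prune(w, sparsity=0.9)
q, scale = uniform_quantize(pruned, bits=8)
print("nonzeros after pruning:", np.count_nonzero(pruned),
      "max dequantization error:", np.abs(pruned - q.astype(np.float32) * scale).max())
```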
Tensor-Train Recurrent Neural Networks for Video Classification
Recurrent Neural Networks and their variants have shown promising performance in sequence modeling tasks such as Natural Language Processing.
These models, however, turn out to be impractical and difficult to train when
exposed to very high-dimensional inputs due to the large input-to-hidden weight
matrix. This may have prevented RNNs' large-scale application in tasks that
involve very high input dimensions such as video modeling; current approaches
reduce the input dimensions using various feature extractors. To address this
challenge, we propose a new, more general and efficient approach by factorizing
the input-to-hidden weight matrix using Tensor-Train decomposition which is
trained simultaneously with the weights themselves. We test our model on
classification tasks using multiple real-world video datasets and achieve performance competitive with state-of-the-art models, even though our model
architecture is orders of magnitude less complex. We believe that the proposed
approach provides a novel and fundamental building block for modeling
high-dimensional sequential data with RNN architectures and opens up many
possibilities to transfer the expressive and advanced architectures from other
domains such as NLP to modeling high-dimensional sequential data.
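The following sketch illustrates the core idea of a Tensor-Train factorized input-to-hidden map with two cores: the input is contracted with the TT cores directly, so the full weight matrix is never materialized. The mode sizes and TT-rank are hypothetical, and the paper's construction generalizes to more cores.

```python
import numpy as np

# Toy TT-matrix factorization of an input-to-hidden weight matrix; a minimal
# sketch of the idea, not the paper's implementation. All sizes are illustrative.
I1, I2 = 16, 32      # hidden dimension 512, factored into two modes
J1, J2 = 64, 64      # input dimension 4096, factored into two modes
r = 4                # TT-rank

rng = np.random.default_rng(0)
G1 = rng.standard_normal((1, I1, J1, r)) * 0.02   # first TT core
G2 = rng.standard_normal((r, I2, J2, 1)) * 0.02   # second TT core

# The full matrix would hold I1*I2*J1*J2 = 2,097,152 parameters;
# the TT cores hold only I1*J1*r + I2*J2*r = 12,288.
x = rng.standard_normal(J1 * J2)
X = x.reshape(J1, J2)

# Contract the input with the cores directly, never materializing the full matrix.
y = np.einsum('aijb,bklc,jl->ik', G1, G2, X).reshape(I1 * I2)

# Sanity check against the explicitly reconstructed matrix.
W = np.einsum('aijb,bklc->ikjl', G1, G2).reshape(I1 * I2, J1 * J2)
print(np.allclose(y, W @ x))  # True
```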
Statistical Machine Translation Features with Multitask Tensor Networks
We present a three-pronged approach to improving Statistical Machine
Translation (SMT), building on recent success in the application of neural
networks to SMT. First, we propose new features based on neural networks to
model various non-local translation phenomena. Second, we augment the
architecture of the neural network with tensor layers that capture important
higher-order interaction among the network units. Third, we apply multitask
learning to estimate the neural network parameters jointly. Each of our
proposed methods results in significant improvements that are complementary.
The overall improvement is +2.7 and +1.8 BLEU points for Arabic-English and
Chinese-English translation over a state-of-the-art system that already
includes neural network features.
Comment: 11 pages (9 content + 2 references), 2 figures, accepted to ACL 2015 as a long paper
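As a rough illustration of what a tensor layer adds over a standard feed-forward layer, the sketch below implements one common form of bilinear tensor layer, in which each output unit scores a pair of hidden vectors with its own interaction matrix. The shapes and the exact functional form are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

def tensor_layer(h1, h2, W, V, b):
    """One common form of tensor layer: each output unit k applies a bilinear
    form h1^T W[k] h2 plus a standard linear term on the concatenation [h1; h2]."""
    bilinear = np.einsum('i,kij,j->k', h1, W, h2)       # higher-order interaction
    linear = V @ np.concatenate([h1, h2]) + b           # first-order terms
    return np.tanh(bilinear + linear)

rng = np.random.default_rng(0)
d, k = 50, 10                                  # hidden size and slice count (illustrative)
h1, h2 = rng.standard_normal(d), rng.standard_normal(d)
W = rng.standard_normal((k, d, d)) * 0.01
V = rng.standard_normal((k, 2 * d)) * 0.01
b = np.zeros(k)
print(tensor_layer(h1, h2, W, V, b).shape)     # (10,)
```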
A Quantum Many-body Wave Function Inspired Language Modeling Approach
The recently proposed quantum language model (QLM) aims at a principled approach to modeling term dependencies by applying quantum probability theory. The latest development toward a more effective QLM has adopted word embeddings as a kind of global dependency information and integrated the quantum-inspired ideas into a neural network architecture. While these
quantum-inspired LMs are theoretically more general and also practically
effective, they have two major limitations. First, they have not taken into
account the interaction among words with multiple meanings, which is common and
important in understanding natural language text. Second, the integration of
the quantum-inspired LM with the neural network was done mainly for effective training of parameters, yet lacked a theoretical foundation accounting for such integration. To address these two issues, in this paper we propose a Quantum Many-body Wave Function (QMWF) inspired language modeling approach. The QMWF-inspired LM adopts the tensor product to model the aforementioned interactions among words. It also enables us to reveal the inherent necessity of using a Convolutional Neural Network (CNN) in QMWF language modeling.
Furthermore, our approach delivers a simple algorithm to represent and match
text/sentence pairs. A systematic evaluation shows the effectiveness of the proposed QMWF-LM algorithm, in comparison with state-of-the-art quantum-inspired LMs and several CNN-based methods, on three typical Question Answering (QA) datasets.
Comment: 10 pages, 4 figures, CIKM
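A minimal sketch of the tensor-product idea mentioned above, not the paper's full QMWF model: each word is a unit vector over a small "semantic basis", and a sequence is represented in the product space by the tensor product of its word vectors. The word vectors and the basis dimension below are made up for illustration.

```python
import numpy as np
from functools import reduce

def word_state(v):
    """Normalize a word representation to a unit vector over the semantic basis."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

# Hypothetical 3-dimensional word states for a 3-word sequence.
words = [word_state([0.8, 0.2, 0.1]),
         word_state([0.1, 0.9, 0.3]),
         word_state([0.5, 0.5, 0.5])]

# Joint representation of the sequence: a rank-1 tensor in R^{3x3x3}.
joint = reduce(np.multiply.outer, words)
print(joint.shape)   # (3, 3, 3)

# Matching two sequences reduces to an inner product of their joint tensors,
# which for rank-1 tensors factorizes into products of word-level similarities.
other = reduce(np.multiply.outer, reversed(words))
print(np.tensordot(joint, other, axes=3))
```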
A Tensor Based Data Model for Polystore: An Application to Social Networks Data
In this article, we show how tensors, as mathematical objects, can be used to build a multi-paradigm model for storing social data in data warehouses. From an architectural point of view, our approach makes it possible to link different storage systems (a polystore) and limits the impact of the ETL tools that perform the model transformations required to feed different analysis algorithms. Systems can therefore take advantage of multiple data models, both in terms of query execution performance and in terms of the semantic expressiveness of the data representation. The proposed model achieves logical independence between the data and the programs implementing analysis algorithms. With a concrete case study on
message virality on Twitter during the French presidential election of 2017, we
highlight some of the contributions of our model.
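A minimal sketch of the tensor view of social interaction data; the article's actual model and its mapping onto polystore back-ends are richer. Here, hypothetical retweet events are stored as a sparse 3-way tensor indexed by (retweeting user, original author, time bucket), and different analysis engines could consume different slices of it.

```python
from collections import defaultdict

# Hypothetical retweet events: (retweeting user, original author, time bucket).
events = [("alice", "bob", 0), ("carol", "bob", 0), ("alice", "bob", 1)]

users = sorted({u for e in events for u in e[:2]})
uidx = {u: i for i, u in enumerate(users)}

tensor = defaultdict(float)          # sparse, coordinate-style storage
for src, dst, t in events:
    tensor[(uidx[src], uidx[dst], t)] += 1.0

# A relational engine could answer "virality per time bucket" by aggregating
# over the user modes; a graph engine could read each time slice as an adjacency matrix.
per_bucket = defaultdict(float)
for (i, j, t), v in tensor.items():
    per_bucket[t] += v
print(dict(per_bucket))   # {0: 2.0, 1: 1.0}
```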
Tensor network language model
We propose a new statistical model suitable for machine learning of systems with long-distance correlations, such as natural languages. The model is based on a directed acyclic graph decorated with multi-linear tensor maps at the vertices and vector spaces on the edges, called a tensor network. Such tensor networks
have been previously employed for effective numerical computation of the
renormalization group flow on the space of effective quantum field theories and
lattice models of statistical mechanics. We provide an explicit algebro-geometric analysis of the parameter moduli space for tree graphs, and discuss model properties and applications such as statistical translation.
Comment: 21 pages
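A minimal sketch of a tree-shaped tensor network over a short sentence, meant only to illustrate the general construction rather than the paper's model: word vectors sit on the leaf edges, and each internal vertex carries a 3-way tensor that merges two child vectors into one parent vector. The dimension, tensors, and readout are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                                  # dimension of every edge space
words = [rng.standard_normal(d) for _ in range(4)]     # hypothetical word embeddings

def merge(T, left, right):
    """Contract a vertex tensor T of shape (d, d, d) with its two child vectors."""
    return np.einsum('ijk,j,k->i', T, left, right)

T_left, T_right, T_root = (rng.standard_normal((d, d, d)) * 0.1 for _ in range(3))
v_left = merge(T_left, words[0], words[1])
v_right = merge(T_right, words[2], words[3])
root = merge(T_root, v_left, v_right)

# A scalar score for the sentence, e.g. via a readout vector on the root edge.
readout = rng.standard_normal(d)
print(root @ readout)
```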
Knowledge Graph Embeddings and Explainable AI
Knowledge graph embeddings are now a widely adopted approach to knowledge
representation in which entities and relationships are embedded in vector
spaces. In this chapter, we introduce the reader to the concept of knowledge
graph embeddings by explaining what they are, how they can be generated and how
they can be evaluated. We summarize the state-of-the-art in this field by
describing the approaches that have been introduced to represent knowledge in
the vector space. In relation to knowledge representation, we consider the
problem of explainability, and discuss models and methods for explaining
predictions obtained via knowledge graph embeddings.
Comment: Federico Bianchi, Gaetano Rossiello, Luca Costabello, Matteo Palmonari, Pasquale Minervini, Knowledge Graph Embeddings and Explainable AI.
In: Ilaria Tiddi, Freddy Lecue, Pascal Hitzler (eds.), Knowledge Graphs for
eXplainable AI -- Foundations, Applications and Challenges. Studies on the
Semantic Web, IOS Press, Amsterdam, 202
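For readers new to the topic, the sketch below shows two classic scoring functions for knowledge graph embeddings, TransE and DistMult, which surveys of this kind typically cover; the embedding dimension, entities, and triple are made up for illustration, and the chapter itself discusses many more models.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50
# Hypothetical, randomly initialized embeddings (in practice these are learned).
entities = {"Paris": rng.standard_normal(dim), "France": rng.standard_normal(dim)}
relations = {"capitalOf": rng.standard_normal(dim)}

def transe_score(h, r, t):
    """TransE: a triple (h, r, t) is plausible if head + relation is close to tail."""
    return -float(np.linalg.norm(entities[h] + relations[r] - entities[t]))

def distmult_score(h, r, t):
    """DistMult: a trilinear product of head, relation, and tail embeddings."""
    return float(np.sum(entities[h] * relations[r] * entities[t]))

print(transe_score("Paris", "capitalOf", "France"),
      distmult_score("Paris", "capitalOf", "France"))
```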
TensorFlow: A system for large-scale machine learning
TensorFlow is a machine learning system that operates at large scale and in
heterogeneous environments. TensorFlow uses dataflow graphs to represent
computation, shared state, and the operations that mutate that state. It maps
the nodes of a dataflow graph across many machines in a cluster, and within a
machine across multiple computational devices, including multicore CPUs,
general-purpose GPUs, and custom designed ASICs known as Tensor Processing
Units (TPUs). This architecture gives flexibility to the application developer:
whereas in previous "parameter server" designs the management of shared state
is built into the system, TensorFlow enables developers to experiment with
novel optimizations and training algorithms. TensorFlow supports a variety of
applications, with particularly strong support for training and inference on
deep neural networks. Several Google services use TensorFlow in production; we have released it as an open-source project, and it has become widely used for
machine learning research. In this paper, we describe the TensorFlow dataflow
model in contrast to existing systems, and demonstrate the compelling
performance that TensorFlow achieves for several real-world applications.
Comment: 18 pages, 9 figures; v2 has a spelling correction in the metadata
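A minimal sketch of the dataflow idea using the public TensorFlow 2.x API (the paper describes the original graph-construction-era system): a Python function is traced into a graph of operations over tensors and shared variable state, which the runtime can then place across the available devices.

```python
import tensorflow as tf  # TensorFlow 2.x

# Shared, mutable state lives in a variable; ops that read or update it become
# nodes in the dataflow graph alongside the purely functional ops.
w = tf.Variable(tf.random.normal([4, 2]), name="w")

@tf.function
def step(x):
    # Traced into a graph: matmul and relu become dataflow nodes operating on tensors.
    return tf.nn.relu(tf.matmul(x, w))

x = tf.constant([[1.0, 2.0, 3.0, 4.0]])
print(step(x).numpy())

# Inspect the traced graph: each entry is an operation in the dataflow model.
graph = step.get_concrete_function(x).graph
print([op.type for op in graph.get_operations()][:8])
```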
Coupled Recurrent Models for Polyphonic Music Composition
This paper introduces a novel recurrent model for music composition that is
tailored to the structure of polyphonic music. We propose an efficient new
conditional probabilistic factorization of musical scores, viewing a score as a
collection of concurrent, coupled sequences, i.e., voices. To model the
conditional distributions, we borrow ideas from both convolutional and
recurrent neural models; we argue that these ideas are natural for capturing
music's pitch invariances, temporal structure, and polyphony. We train models
for single-voice and multi-voice composition on 2,300 scores from the
KernScores dataset.
Comment: 13 pages; long version of the paper appearing in ISMIR 201
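A toy sketch of the coupled-sequence factorization described above: within each time step the voices are generated in a fixed order, each conditioned on everything generated so far. The stand-in conditional distribution below is random and only illustrates the sampling order, not the paper's learned convolutional-recurrent model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_voices, n_steps, n_pitches = 4, 8, 16   # illustrative sizes

def note_distribution(history):
    """Stand-in for a learned conditional p(note | all previously generated notes)."""
    logits = rng.standard_normal(n_pitches) * 0.1
    return np.exp(logits) / np.exp(logits).sum()

score = np.zeros((n_voices, n_steps), dtype=int)
history = []
for t in range(n_steps):
    for v in range(n_voices):                  # fixed voice order within each step
        p = note_distribution(history)
        score[v, t] = rng.choice(n_pitches, p=p)
        history.append((v, t, score[v, t]))
print(score)
```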