71 research outputs found

Training Behavior of Sparse Neural Network Topologies

Authors: Alford Simon; Kepner Jeremy; Milechin Lauren; Robinett Ryan
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 23/12/2019

Improvements in the performance of deep neural networks have often come through the design of larger and more complex networks. As a result, fast memory is a significant limiting factor in our ability to improve network performance. One approach to overcoming this limit is the design of sparse neural networks, which can be both very large and efficiently trained. In this paper we experiment with training on sparse neural network topologies. We test pruning-based topologies, which are derived from an initially dense network whose connections are pruned, as well as RadiX-Nets, a class of network topologies with proven connectivity and sparsity properties. Results show that sparse networks obtain accuracies comparable to dense networks, but extreme levels of sparsity cause instability in training, which merits further study.
Comment: 6 pages. Presented at the 2019 IEEE High Performance Extreme Computing (HPEC) Conference. Received "Best Paper" awar

arXiv.org e-Print Archive

Crossref

Weightless: Lossy Weight Encoding For Deep Neural Network Compression

Authors: Adolf Robert; Brooks David; Gupta Udit; Mitzenmacher Michael M.; Reagen Brandon; Rush Alexander M.; Wei Gu-Yeon
Publication date: 13/11/2017

The large memory requirements of deep neural networks limit their deployment and adoption on many devices. Model compression methods effectively reduce the memory requirements of these models, usually through applying transformations such as weight pruning or quantization. In this paper, we present a novel scheme for lossy weight encoding which complements conventional compression techniques. The encoding is based on the Bloomier filter, a probabilistic data structure that can save space at the cost of introducing random errors. Leveraging the ability of neural networks to tolerate these imperfections and by re-training around the errors, the proposed technique, Weightless, can compress DNN weights by up to 496x with the same model accuracy. This results in up to a 1.51x improvement over the state-of-the-art

arXiv.org e-Print Archive

UCL Discovery

Predefined Sparseness in Recurrent Sequence Models

Authors: Deleu Johannes; Demeester Thomas; Develder Chris; Godin Fréderic
Publication date: 01/01/2018

Inducing sparseness while training neural networks has been shown to yield models with a lower memory footprint but similar effectiveness to dense models. However, sparseness is typically induced starting from a dense model, and thus this advantage does not hold during training. We propose techniques to enforce sparseness upfront in recurrent sequence models for NLP applications, to also benefit training. First, in language modeling, we show how to increase hidden state sizes in recurrent layers without increasing the number of parameters, leading to more expressive models. Second, for sequence labeling, we show that word embeddings with predefined sparseness lead to similar performance as dense embeddings, at a fraction of the number of trainable parameters.
Comment: the SIGNLL Conference on Computational Natural Language Learning (CoNLL, 2018

arXiv.org e-Print Archive

Crossref

Ghent University Academic Bibliography