Predefined Sparseness in Recurrent Sequence Models
Inducing sparseness while training neural networks has been shown to yield
models with a lower memory footprint but similar effectiveness to dense models.
However, sparseness is typically induced starting from a dense model, and thus
this advantage does not hold during training. We propose techniques to enforce
sparseness upfront in recurrent sequence models for NLP applications, so that
this advantage also holds during training. First, in language modeling, we show how to increase hidden
state sizes in recurrent layers without increasing the number of parameters,
leading to more expressive models. Second, for sequence labeling, we show that
word embeddings with predefined sparseness lead to similar performance as dense
embeddings, at a fraction of the number of trainable parameters.Comment: the SIGNLL Conference on Computational Natural Language Learning
(CoNLL, 2018
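To make the notion of predefined sparseness concrete, below is a minimal sketch, assuming PyTorch, of an embedding layer whose sparsity pattern is fixed before training: a binary mask chosen upfront zeroes out most entries, so only the unmasked entries effectively receive gradient updates. The class name, mask layout, and density value are illustrative assumptions, not the paper's exact construction (which also covers block-sparse recurrent weight matrices), and a memory-efficient implementation would store only the nonzero entries rather than masking a dense matrix.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedSparseEmbedding(nn.Module):
    """Embedding layer with a predefined (fixed) sparsity pattern.

    The binary mask is sampled once, before training, and registered as a
    buffer so it is never updated; masked entries stay exactly zero, and
    only the unmasked entries act as trainable parameters.
    """

    def __init__(self, vocab_size: int, embed_dim: int, density: float = 0.25, seed: int = 0):
        super().__init__()
        self.weight = nn.Parameter(0.1 * torch.randn(vocab_size, embed_dim))
        gen = torch.Generator().manual_seed(seed)
        mask = (torch.rand(vocab_size, embed_dim, generator=gen) < density).float()
        self.register_buffer("mask", mask)  # fixed sparsity pattern, not trained

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Masked entries contribute zero to the output and receive zero gradient.
        return F.embedding(token_ids, self.weight * self.mask)


# Usage: roughly 25% of the embedding entries are effectively trainable.
emb = MaskedSparseEmbedding(vocab_size=10_000, embed_dim=300, density=0.25)
vectors = emb(torch.tensor([[1, 5, 42]]))  # shape: (1, 3, 300)
```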