Compressing Word Embeddings

DP Vinson; J Pennington; L Finkelstein; O Levy; S Lloyd; TL Griffiths

Authors: DP Vinson; J Pennington; L Finkelstein; O Levy; S Lloyd; TL Griffiths
Publication date: 16 May 2016

Abstract

Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic. However, these vector space representations (created through large-scale text analysis) are typically stored verbatim, since their internal structure is opaque. Using word-analogy tests to monitor the level of detail stored in compressed re-representations of the same vector space, the trade-offs between the reduction in memory usage and expressiveness are investigated. A simple scheme is outlined that can reduce the memory footprint of a state-of-the-art embedding by a factor of 10, with only minimal impact on performance. Then, using the same 'bit budget', a binary (approximate) factorisation of the same space is also explored, with the aim of creating an equivalent representation with better interpretability.

Comment: 10 pages, 0 figures, submitted to ICONIP-2016. Previous experimental results were submitted to ICLR-2016, but the paper has been significantly updated, since a new experimental set-up worked much better.
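
The abstract describes using word-analogy tests to check how much detail survives compression. The following is a minimal sketch of that idea only. The toy vectors, the function names and the uniform scalar quantiser are all assumptions made for illustration; this record does not specify the paper's actual compression scheme, and the quantiser below should not be read as that scheme. Going from 32-bit floats to 3 bits per component is roughly a tenfold reduction (32/3 ≈ 10.7), which is why 3 bits is used here.

```python
import math

# Toy 4-dimensional embedding; the values are invented for illustration only.
EMBEDDING = {
    "king":  [0.9, 0.8, 0.1, 0.3],
    "queen": [0.9, 0.1, 0.8, 0.3],
    "man":   [0.1, 0.9, 0.1, 0.2],
    "woman": [0.1, 0.1, 0.9, 0.2],
    "apple": [0.2, 0.3, 0.3, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' via vector arithmetic.

    Returns the word whose vector is closest (by cosine) to b - a + c,
    excluding the three query words themselves.
    """
    target = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

def quantise(emb, bits):
    """Uniform scalar quantisation (an assumed stand-in, not the paper's scheme).

    Every component is snapped to one of 2**bits evenly spaced levels
    between the global minimum and maximum of the embedding.
    """
    lo = min(min(v) for v in emb.values())
    hi = max(max(v) for v in emb.values())
    step = (hi - lo) / ((1 << bits) - 1)
    return {
        w: [lo + round((x - lo) / step) * step for x in v]
        for w, v in emb.items()
    }

if __name__ == "__main__":
    compressed = quantise(EMBEDDING, bits=3)
    print(solve_analogy(EMBEDDING, "man", "king", "woman"))
    print(solve_analogy(compressed, "man", "king", "woman"))
```

On this toy data the analogy is answered the same way before and after quantisation. A real evaluation would run a full analogy test set over a real embedding and compare accuracy at each bit budget.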

Available Versions

Crossref

info:doi/10.1007%2F978-3-319-4...

Last time updated on 05/06/2019