Distilling Word Embeddings: An Encoding Approach

Authors: Jia Ran, Jin Zhi, Li Ge, Mou Lili, Xu Yan, Zhang Lu
Publication date: 24/07/2016

Distilling knowledge from a well-trained, cumbersome network into a small one has recently become a research topic, because lightweight neural networks with high performance are particularly needed in resource-restricted systems. This paper addresses the problem of distilling word embeddings for NLP tasks. We propose an encoding approach that distills task-specific knowledge from a set of high-dimensional embeddings; it reduces model complexity by a large margin while retaining high accuracy, offering a good compromise between efficiency and performance. Experiments on two tasks show that distilling knowledge from cumbersome embeddings is better than directly training neural networks with small embeddings.

Comment: Accepted by CIKM-16 as a short paper, and by the Representation Learning for Natural Language Processing (RL4NLP) Workshop @ACL-16 for presentation
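As a rough illustration of what "encoding" high-dimensional embeddings into smaller ones can look like, the sketch below projects each large vector through a single tanh layer. The abstract does not specify the encoder, so the layer form, the dimensions, the word list, and the names `encode`, `weights`, and `bias` are all assumptions made for illustration, not the authors' implementation. The parameters here are random; in a distillation setting they would be learned under task supervision before the small vectors are handed to a compact model.

```python
import math
import random


def encode(vector, weights, bias):
    """Map one high-dimensional vector to a low-dimensional one: tanh(W x + b)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, vector)) + b)
        for row, b in zip(weights, bias)
    ]


random.seed(0)
LARGE_DIM, SMALL_DIM = 8, 3  # illustrative sizes only, not taken from the paper

# Stand-in for a table of large pretrained embeddings (hypothetical words and values).
large_embeddings = {
    word: [random.gauss(0, 1) for _ in range(LARGE_DIM)]
    for word in ["good", "bad", "movie"]
}

# Encoding parameters (assumed form). Random here; they would normally be
# trained on the downstream task so the small vectors keep task-specific knowledge.
weights = [[random.gauss(0, 0.1) for _ in range(LARGE_DIM)] for _ in range(SMALL_DIM)]
bias = [0.0] * SMALL_DIM

# The distilled table: one SMALL_DIM vector per word.
small_embeddings = {
    word: encode(vec, weights, bias) for word, vec in large_embeddings.items()
}
```

The resulting `small_embeddings` table could then serve as the embedding layer of a smaller network, which is the setting the abstract compares against training small embeddings directly.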

arXiv.org e-Print Archive