Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling

Berend, Gábor

Authors: Gábor Berend
Publication date: 21 December 2016

Abstract

In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations. The proposed model obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition for a variety of languages. Our model relies only on a few thousand sparse coding-derived features, without applying any modification of the word representations employed for the different tasks. The proposed model has favorable generalization properties, as it retains over 89.8% of its average POS tagging accuracy when trained on 1.2% of the total available training data, i.e., 150 sentences per language.
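The idea of deriving sparse indicator features from dense word vectors can be illustrated with a minimal sketch. The abstract does not specify the solver, the dictionary, or the feature naming, so everything below is an assumption made for illustration: a fixed dictionary `D`, a plain ISTA loop for the L1-regularized reconstruction, and hypothetical feature names `P<j>` / `N<j>` marking the sign of each nonzero coefficient.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the L1 norm: shrink each entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_codes(X, D, lam=0.1, n_iter=200):
    """Approximately solve min_A 0.5 * ||X - A D||_F^2 + lam * ||A||_1 with ISTA.

    X: (n_words, dim) dense embeddings, D: (n_atoms, dim) dictionary.
    Returns A: (n_words, n_atoms), one sparse code per word.
    The dictionary is assumed to be given; learning it is not shown here.
    """
    # Lipschitz constant of the gradient of the quadratic term.
    lipschitz = np.linalg.norm(D, 2) ** 2
    A = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (A @ D - X) @ D.T
        A = soft_threshold(A - grad / lipschitz, lam / lipschitz)
    return A

def indicator_features(code):
    # Hypothetical naming: one indicator per nonzero coefficient, keyed by sign.
    return [f"{'P' if c > 0 else 'N'}{j}" for j, c in enumerate(code) if c != 0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 8))                      # toy "embeddings"
    D = rng.normal(size=(20, 8))                     # toy dictionary
    D /= np.linalg.norm(D, axis=1, keepdims=True)    # unit-norm atoms
    for code in sparse_codes(X, D, lam=0.5):
        print(indicator_features(code))
```

The resulting string features could be passed to any linear sequence labeler in place of the dense vectors; the regularization weight `lam` controls how many indicators each word receives.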

Available Versions

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

oai:publicatio.bibl.u-szeged.h...

Last updated on 06/01/2019