Enriching Rare Word Representations in Neural Language Models by Embedding Matrix Augmentation

Authors: Chng Eng Siong, Khassanov Yerbolat, Pham Van Tung, Xu Haihua, Zeng Zhiping
Publication venue: 'International Speech Communication Association'
Publication date: 31/07/2019

Neural language models (NLMs) achieve strong generalization by learning dense representations of words and using them to estimate probability distributions. However, learning representations of rare words is challenging, and this causes the NLM to produce unreliable probability estimates. To address this problem, we propose a method to enrich the representations of rare words in a pre-trained NLM and consequently improve its probability estimation performance. The proposed method augments the word embedding matrices of the pre-trained NLM while keeping other parameters unchanged. Specifically, our method updates the embedding vectors of rare words using the embedding vectors of other semantically and syntactically similar words. To evaluate the proposed method, we enrich the rare street names in the pre-trained NLM and use it to rescore the 100-best hypotheses output by a Singapore English speech recognition system. The enriched NLM reduces the word error rate by 6% relative and improves the recognition accuracy of the rare words by 16% absolute compared to the baseline NLM.

Comment: 5 pages, 2 figures, accepted to INTERSPEECH 201
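
The update described in this abstract can be illustrated with a minimal sketch. It is not the authors' exact formula: the interpolation weight, the function name `enrich_rare_embeddings`, and the toy vocabulary are all assumptions made for illustration. Each rare word's vector is moved toward the mean vector of a hand-picked list of similar words, and every other vector is left unchanged.

```python
def enrich_rare_embeddings(embeddings, similar_words, weight=0.5):
    """Return a copy of `embeddings` in which each rare word's vector is
    interpolated with the mean vector of its similar words.

    `weight` is a hypothetical mixing coefficient, not a value from the paper.
    """
    enriched = {word: list(vec) for word, vec in embeddings.items()}
    for rare, neighbours in similar_words.items():
        known = [embeddings[n] for n in neighbours if n in embeddings]
        if rare not in embeddings or not known:
            continue  # nothing to update, or nothing to borrow from
        dim = len(embeddings[rare])
        mean = [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
        enriched[rare] = [
            (1 - weight) * embeddings[rare][i] + weight * mean[i]
            for i in range(dim)
        ]
    return enriched


# Toy 2-dimensional embedding table (illustrative values only).
embeddings = {
    "road": [1.0, 0.0],
    "street": [0.0, 1.0],
    "rare_street_name": [0.0, 0.0],
}
enriched = enrich_rare_embeddings(
    embeddings, {"rare_street_name": ["road", "street"]}
)
```

Only the entry for the rare word changes; the frequent words keep their original vectors, which mirrors the abstract's point that the other parameters of the pre-trained model stay fixed.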

arXiv.org e-Print Archive


Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation

Authors: Amir Silvio, Astudillo Ramón Fernandez, Black Alan W., Dyer Chris, Ling Wang, Luís Tiago, Marujo Luís, Trancoso Isabel
Publication date: 01/01/2015

We introduce a model for constructing vector representations of words by composing characters using bidirectional LSTMs. Relative to traditional word representation models that have independent vectors for each word type, our model requires only a single vector per character type and a fixed set of parameters for the compositional model. Despite the compactness of this model and, more importantly, the arbitrary nature of the form-function relationship in language, our "composed" word representations yield state-of-the-art results in language modeling and part-of-speech tagging. Benefits over traditional baselines are particularly pronounced in morphologically rich languages (e.g., Turkish).
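
The idea of composing a word vector from its characters can be sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the weights are random and untrained, the dimensions are arbitrary, the names `make_lstm`, `lstm_final_state`, and `CharWordEncoder` are hypothetical, and the two final LSTM states are simply concatenated as a simplification.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm(input_dim, hidden_dim, rng):
    """Random (untrained) weights and zero biases for gates i, f, o, g."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {gate: (matrix(hidden_dim, input_dim + hidden_dim), [0.0] * hidden_dim)
            for gate in "ifog"}


def lstm_final_state(params, inputs, hidden_dim):
    """Run a standard LSTM cell over `inputs` and return the last hidden state."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in inputs:
        xh = x + h  # concatenate current input with previous hidden state
        pre = {gate: [sum(w * v for w, v in zip(row, xh)) + bias
                      for row, bias in zip(W, b)]
               for gate, (W, b) in params.items()}
        i = [sigmoid(a) for a in pre["i"]]    # input gate
        f = [sigmoid(a) for a in pre["f"]]    # forget gate
        o = [sigmoid(a) for a in pre["o"]]    # output gate
        g = [math.tanh(a) for a in pre["g"]]  # candidate cell values
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


class CharWordEncoder:
    """One vector per character plus two LSTMs shared by all words."""

    def __init__(self, alphabet, char_dim=4, hidden_dim=3, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.char_vectors = {ch: [rng.uniform(-0.1, 0.1) for _ in range(char_dim)]
                             for ch in alphabet}
        self.forward = make_lstm(char_dim, hidden_dim, rng)
        self.backward = make_lstm(char_dim, hidden_dim, rng)

    def encode(self, word):
        chars = [self.char_vectors[ch] for ch in word]
        fwd = lstm_final_state(self.forward, chars, self.hidden_dim)
        bwd = lstm_final_state(self.backward, chars[::-1], self.hidden_dim)
        return fwd + bwd  # assumption: concatenate the two final states


encoder = CharWordEncoder("abcdefghijklmnopqrstuvwxyz")
word_vector = encoder.encode("evler")
```

Because the parameters are tied to characters rather than word types, any string over the alphabet gets a vector, including words never seen before. A character outside the alphabet raises a `KeyError` in this sketch.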

arXiv.org e-Print Archive
