Multi-language neural network language models

Abstract

Recently there has been a lot of interest in neural network based language models. These models typically consist of vocabulary-dependent input and output layers and one or more vocabulary-independent hidden layers. One standard issue with these approaches is that large quantities of training data are needed to ensure robust parameter estimates. This poses a significant problem when only limited amounts of data are available. One possible way to address this issue is augmentation: model-based, in the form of language model interpolation, and data-based, in the form of data augmentation. However, the vocabulary-dependent input and output layers may make these approaches impossible to apply, which severely restricts the nature of the data that can be used for augmentation. This paper describes a general solution whereby only the vocabulary-independent hidden layers are augmented. Such an approach makes it possible to draw augmentation data from domains that were previously impossible to use. Moreover, it paves a direct way towards multi-task learning with these models. As a proof of concept, this paper examines the use of multilingual data for augmenting the hidden layers of recurrent neural network language models. Experiments are conducted using a set of language packs released within the IARPA Babel program.
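To make the shared-hidden-layer idea concrete, the following is a minimal sketch, not taken from the paper: the use of PyTorch, a GRU recurrent layer, the class and parameter names, and the toy vocabulary sizes are all illustrative assumptions. It shows a recurrent language model in which each language keeps its own vocabulary-dependent input embedding and output layer, while a single vocabulary-independent hidden layer is shared and therefore trained on data from all languages.

# Illustrative sketch only (assumptions: PyTorch, GRU hidden layer, toy vocabularies).
import torch
import torch.nn as nn

class MultilingualRNNLM(nn.Module):
    def __init__(self, vocab_sizes, emb_dim=128, hidden_dim=256):
        super().__init__()
        # Vocabulary-dependent layers: one input embedding and one output
        # projection per language.
        self.embeddings = nn.ModuleDict(
            {lang: nn.Embedding(v, emb_dim) for lang, v in vocab_sizes.items()})
        self.outputs = nn.ModuleDict(
            {lang: nn.Linear(hidden_dim, v) for lang, v in vocab_sizes.items()})
        # Vocabulary-independent hidden layer shared by all languages; every
        # mini-batch, whatever its language, updates these parameters.
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, tokens, lang):
        emb = self.embeddings[lang](tokens)   # (batch, time, emb_dim)
        hidden, _ = self.rnn(emb)             # shared recurrent hidden states
        return self.outputs[lang](hidden)     # per-language word logits

# Usage: alternate mini-batches from different languages so the shared hidden
# layer sees all of them (multi-task learning), while each language's
# input/output layers see only its own data. Language names and sizes are toy values.
model = MultilingualRNNLM({"cantonese": 5000, "tagalog": 4000})
batch = torch.randint(0, 5000, (8, 20))       # toy batch of token ids
logits = model(batch, "cantonese")            # shape: (8, 20, 5000)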