This paper presents sub-word based language models for Amharic, a morphologically rich and under-resourced language. The language models were developed with the open source language modeling toolkit SRILM, using different n-gram orders (2 to 5) and smoothing techniques. Among the developed models, the best performing one is a 5-gram model with modified Kneser-Ney smoothing and interpolation of n-gram probability estimates.

Keywords: Language modeling, sub-word based language modeling, morph-based language modeling, Amharic
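
As an illustration of the training setup described above, the following minimal sketch builds SRILM `ngram-count` command lines for n-gram orders 2 to 5 with modified Kneser-Ney discounting (`-kndiscount`) and interpolation (`-interpolate`). The file names `train.morphs.txt` and `amharic.<n>gram.lm` and the helper `srilm_train_command` are hypothetical and are not taken from the paper; the input is assumed to be a corpus already segmented into sub-word units, one sentence per line. The sketch only prints the commands and does not run SRILM.

```python
from typing import List


def srilm_train_command(order: int, train_path: str, lm_path: str) -> List[str]:
    """Build an ngram-count argument list for one n-gram order.

    Hypothetical helper: the paths are placeholders, and the flag set is
    an assumed minimal configuration, not the authors' exact invocation.
    """
    if order < 1:
        raise ValueError("n-gram order must be positive")
    return [
        "ngram-count",
        "-order", str(order),   # n-gram order (2 to 5 in the abstract)
        "-text", train_path,    # training text, assumed morph-segmented
        "-kndiscount",          # modified Kneser-Ney discounting
        "-interpolate",         # interpolate with lower-order estimates
        "-lm", lm_path,         # output language model file
    ]


# One command per order from 2 to 5, with assumed file names.
commands = [
    srilm_train_command(n, "train.morphs.txt", f"amharic.{n}gram.lm")
    for n in range(2, 6)
]

for cmd in commands:
    print(" ".join(cmd))
```

Each resulting model could then be evaluated on held-out text, for example by perplexity with SRILM's `ngram` tool, to compare orders and smoothing settings.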