Use of Language Models in Handwriting Recognition

Abstract

Language models have been used extensively in natural language applications such as speech recognition, part-of-speech tagging, and information extraction. To a lesser extent, the value of language models in text recognition has also been demonstrated, e.g., in the recognition of poor-quality printed text and of extended handwriting. This survey describes how linguistic context, particularly probabilistic language models, is used in the recognition of handwritten text. The survey begins with how two handwriting recognition techniques, segmentation-free and segmentation-based, are integrated with language models in the recognition process. Next, language models applied at the word level in a post-processing step to improve recognition results, and at the character level for handwriting recognition and correction of recognition results, are described. Finally, syntax-based techniques are described: lexical analysis using collocations, syntactic (n-gram) analysis using part-of-speech (POS) tags, and a hybrid syntactic technique comprising both a statistical and an analytical component. Language models have been found to be very helpful for all natural language applications; they have been seen to improve the performance of these applications by 25-50% when the text used in training is representative of that for which the model is intended.
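As a rough illustration of the word-level post-processing idea mentioned above, the sketch below re-ranks an n-best list of recognizer hypotheses with a smoothed bigram language model. The corpus, the candidate list, and the interpolation weight are invented for the example and are not taken from the survey; they only show how a language model score can be combined with a recognizer's confidence score.

```python
from collections import Counter
from math import log

# Toy training corpus standing in for text representative of the target domain
# (hypothetical data, for illustration only).
corpus = "the quick brown fox jumps over the lazy dog the dog barks".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
vocab_size = len(unigrams)

def bigram_logprob(sentence):
    """Add-one smoothed bigram log-probability of a word sequence."""
    words = sentence.split()
    score = 0.0
    for prev, word in zip(words, words[1:]):
        num = bigrams[(prev, word)] + 1
        den = unigrams[prev] + vocab_size
        score += log(num / den)
    return score

def rerank(hypotheses, lm_weight=0.5):
    """Pick the hypothesis maximizing a weighted sum of recognizer and LM scores.

    `hypotheses` is a list of (text, recognizer_log_score) pairs, e.g. the
    n-best output of a handwriting recognizer (hypothetical format).
    """
    return max(
        hypotheses,
        key=lambda h: (1 - lm_weight) * h[1] + lm_weight * bigram_logprob(h[0]),
    )

# Hypothetical n-best list: the recognizer slightly prefers a malformed reading,
# but the language model pulls the ranking toward the fluent one.
nbest = [
    ("the quick brown fox jumps over the lazy dog", -4.2),
    ("the quick brown fox jumps over the lazy bog", -4.0),
]
print(rerank(nbest)[0])
```

The same scheme can be applied at the character level by replacing word bigrams with character n-grams, which corresponds to the character-level correction techniques surveyed in the paper.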
