3,090 research outputs found

    Adaptive dissection based subword segmentation of printed Arabic text

    Numerous segmentation and recognition techniques have been proposed in the literature for Arabic OCR systems. Correct and efficient segmentation of Arabic text into characters is considered a fundamental problem. While OCR systems for other languages do not need segmentation of printed text for successful recognition, printed Arabic text requires either robust and powerful segmentation algorithms or segmentation-free recognition schemes. Moreover, segmentation is considered indispensable in the recognition of handwritten characters. Most current segmentation techniques suffer from over-segmentation and under-segmentation and are not adaptive in nature. In this paper, we propose a new sub-word segmentation scheme that is independent of font size and font type.

    A hypothesize-and-verify framework for Text Recognition using Deep Recurrent Neural Networks

    Deep LSTM is an ideal candidate for text recognition. However, text recognition involves initial image-processing steps, such as segmentation of lines and words, which can introduce errors into the recognition system. Without segmentation, learning very long-range context is difficult and becomes computationally intractable. Therefore, alternative soft decisions are needed at the pre-processing level. This paper proposes a hybrid text recognizer using a deep recurrent neural network with multiple layers of abstraction and long-range context, along with a language model to verify the performance of the deep neural network. We construct a multi-hypothesis tree architecture whose branches hold candidate segments of line sequences from different segmentation algorithms. The deep neural network is trained on perfectly segmented data and tests each of the candidate segments, generating Unicode sequences. In the verification step, these Unicode sequences are validated using a sub-string match with the language model, and best-first search is used to find the best possible combination of alternative hypotheses from the tree structure. Thus, the verification framework using language models eliminates wrong segmentation outputs and filters recognition errors.
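
    The verification step described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the "language model" is a plain list of reference strings, the score is the fraction of tokens found as substrings of that list, and the names `lm_score` and `best_first_combination` are hypothetical. The recognizer itself is left out; the candidate strings stand in for its Unicode output on alternative segmentations.

```python
import heapq


def lm_score(text, lexicon):
    """Fraction of whitespace-separated tokens found as a substring of any lexicon entry.

    A toy stand-in for a language-model check (assumption, not the paper's model).
    """
    tokens = text.split()
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if any(tok in entry for entry in lexicon))
    return hits / len(tokens)


def best_first_combination(hypotheses, lexicon):
    """Pick one candidate per position so that the total mismatch cost is minimal.

    hypotheses: list of lists; hypotheses[i] holds the alternative recognized
    strings for the i-th segment, e.g. from different segmentation algorithms.
    Returns (chosen candidates, total cost).
    """
    # Heap entries are (accumulated cost, next position, candidates chosen so far).
    heap = [(0.0, 0, ())]
    while heap:
        cost, pos, chosen = heapq.heappop(heap)
        if pos == len(hypotheses):
            # Step costs are non-negative, so the first complete path popped is cheapest.
            return list(chosen), cost
        for cand in hypotheses[pos]:
            step = 1.0 - lm_score(cand, lexicon)
            heapq.heappush(heap, (cost + step, pos + 1, chosen + (cand,)))
    return [], float("inf")


if __name__ == "__main__":
    lexicon = ["text recognition", "neural network"]
    hypotheses = [
        ["tcxt recognition", "text recognition"],
        ["neural netwonk", "neural network"],
    ]
    best, cost = best_first_combination(hypotheses, lexicon)
    print(best, cost)
```

    Expanding the cheapest partial combination first means that branches with poor language-model agreement are only explored if nothing better remains, which is the sense in which wrong segmentation outputs are filtered in this sketch.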