
Segmentation of Thai handwritten text for automatic document retrieval

Abstract

Thai government organizations hold a huge number of documents. Although automatic document image retrieval systems have been proposed and developed for English, there is no specific system capable of retrieving relevant information from documents in the Thai language. Word matching or optical character recognition (OCR) can be applied, but the words and characters must first be segmented. Thai government documents also contain both printed and handwritten characters, which poses an additional challenge: while printed text can be segmented easily using classical approaches, handwritten script is hard to separate. The objective of this paper is to present a survey of recently developed methods and of the segmentation techniques for document images that handle Thai printed and handwritten scripts.
