13 research outputs found

    Generation of optimal binarisation output from ancient Thai manuscripts on palm leaves

    Several binarisation techniques have recently been proposed for processing different kinds of ancient document images. While many well-known binarisation techniques are particularly suitable for certain types of document images, there are no specific guidelines for determining the type of image degradation or the characteristics of the image. This paper proposes a novel method for generating the optimal binary image from different binarised outputs of a document image. The approach is based on a weighted majority vote; uncertain pixels are then resolved from local areas of the binarised images by iterating the weighted majority vote. Experiments on the benchmark data set of the Document Image Binarization Contest (DIBCO) 2011 show that the proposed method performs better than most well-known techniques. The proposed method has also been applied to ancient manuscripts on palm leaves from Thailand, where it provided better results than the outputs of the original binarisation techniques.
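
The voting idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `margin` parameter, the 3x3 neighbourhood, and the tie-breaking rule are all assumptions, since the abstract does not give those details.

```python
from typing import List, Optional

Grid = List[List[Optional[int]]]  # 1 = text, 0 = background, None = uncertain


def weighted_vote(outputs: List[List[List[int]]], weights: List[float],
                  margin: float = 0.2) -> Grid:
    """Combine several binarised images pixel by pixel.

    A pixel is left uncertain (None) when the weighted share of
    foreground votes lies strictly within `margin` of 0.5.
    `margin` is a hypothetical parameter chosen for this sketch.
    """
    total = sum(weights)
    rows, cols = len(outputs[0]), len(outputs[0][0])
    result: Grid = []
    for r in range(rows):
        row: List[Optional[int]] = []
        for c in range(cols):
            share = sum(w * img[r][c] for img, w in zip(outputs, weights)) / total
            if share >= 0.5 + margin:
                row.append(1)
            elif share <= 0.5 - margin:
                row.append(0)
            else:
                row.append(None)
        result.append(row)
    return result


def resolve_uncertain(grid: Grid, max_iters: int = 10) -> Grid:
    """Iteratively fill uncertain pixels from decided pixels in the
    surrounding 3x3 area (assumed local rule; ties go to background)."""
    rows, cols = len(grid), len(grid[0])
    for _ in range(max_iters):
        changed = False
        new = [row[:] for row in grid]
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] is not None:
                    continue
                neigh = [grid[rr][cc]
                         for rr in range(max(0, r - 1), min(rows, r + 2))
                         for cc in range(max(0, c - 1), min(cols, c + 2))
                         if grid[rr][cc] is not None]
                if neigh:
                    new[r][c] = 1 if 2 * sum(neigh) > len(neigh) else 0
                    changed = True
        grid = new
        if not changed:
            break
    return grid
```

With three binarised outputs and weights 5, 3 and 2, a pixel that only the first method marks as text receives a share of 0.5 and is left uncertain until its neighbours decide it.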

    Text line segmentation of ancient Thai manuscripts on palm leaves

    Segmentation of handwritten scripts with overlapping text is one of the challenging tasks in pre-processing for document recognition and optical character recognition (OCR) systems. It is a significant step because errors will occur in the recognition stage if text lines are not separated accurately. This paper addresses the problem of text line segmentation of ancient Thai manuscripts written on palm leaves, in particular the issue of overlapping characters. The proposed technique integrates a partial projection method with a smoothed horizontal histogram, applied recurrently in each column. The proposed technique was evaluated against a modified partial projection profile. The experimental results show that the proposed technique achieves higher accuracy. This technique will help to resolve the problem of text line segmentation for ancient Thai manuscripts on palm leaves.
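
For orientation, the sketch below shows the plain horizontal projection profile on which such methods build. It is not the partial, per-column variant proposed in the paper; the function names, the moving-average window and the threshold are assumptions made for illustration.

```python
from typing import List, Tuple


def horizontal_profile(image: List[List[int]]) -> List[int]:
    """Count foreground pixels (1s) in each row of a binary image."""
    return [sum(row) for row in image]


def smooth(profile: List[float], window: int = 3) -> List[float]:
    """Moving average; the window shrinks at the edges of the profile."""
    half = window // 2
    out = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        out.append(sum(profile[lo:hi]) / (hi - lo))
    return out


def find_lines(profile: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (first_row, last_row) for each run of rows whose profile
    value exceeds `threshold`; each run is treated as one text line."""
    lines = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            lines.append((start, i - 1))
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines
```

A global profile like this fails when lines overlap or undulate, which is the case the abstract targets; computing the profile separately for vertical strips of the page is the usual way to relax that limitation.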

    Segmentation of Thai handwritten text for automatic document retrieval

    Thai government organizations hold a huge number of documents. Although automatic document image retrieval systems have been proposed and developed for English, there is no specific system capable of retrieving relevant information from documents in the Thai language. While word matching or optical character recognition (OCR) can be applied, the words and characters must first be segmented. Thai government documents also contain both printed and handwritten characters, which poses an additional challenge. While printed text can be segmented easily using classical approaches, handwritten scripts are hard to separate. The objective of this paper is to present a survey of recently developed methods and of the segmentation techniques used on document images containing Thai printed and handwritten scripts.

    Using local maxima profile and Piece-Wise technique for line segmentation on Thai handwritten historical documents

    This paper presents a new approach for segmenting text lines in Thai handwritten documents. The proposed technique is based on the Adaptive Local Connectivity Map concept, using Piece-Wise Separating Lines. The algorithm is designed to solve problems in handwritten documents such as fluctuating text lines. Moreover, a local maxima projection profile is used to speed up extraction. The proposed algorithm consists of four steps. First, the Otsu algorithm is used to binarize the source image. Second, Piece-Wise Separating Lines are applied to derive the Adaptive Local Connectivity Map, which shows the mask text lines. Third, the local maxima projection profile is used as a guideline for extracting text lines. Finally, a contour algorithm is used to identify the mask text line of interest, which is then mapped onto the text image in order to extract the text lines. Analysis of experimental results on the King Rama 5 archive data indicated that the method achieved a correct rate of 85.7%.
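
The first of the four steps, Otsu binarisation, can be sketched in plain Python as below. This is the standard Otsu criterion (maximising between-class variance), not code from the paper; the function names and the convention that dark pixels are text are assumptions.

```python
from typing import List


def otsu_threshold(pixels: List[int], levels: int = 256) -> int:
    """Return the grey level t that maximises between-class variance,
    where class 0 holds pixels <= t and class 1 holds pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0            # number of pixels in class 0
    sum0 = 0          # sum of grey levels in class 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break     # class 1 empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(pixels: List[int], threshold: int) -> List[int]:
    """Mark dark pixels (<= threshold) as text (1), the rest as background (0)."""
    return [1 if p <= threshold else 0 for p in pixels]
```

The later steps (connectivity map, local maxima profile, contour tracing) would operate on the binary image this step produces.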

    A combined method of segmentation for connected handwritten on palm leaf manuscripts

    Character segmentation of handwriting in ancient manuscripts is a crucial step in Optical Character Recognition (OCR) systems. The segmentation of connected components is one of the factors that affects the performance of the recognition system, so segmenting touching characters is a significant task in improving the efficiency of OCR systems. This paper proposes a combined method for segmenting connected handwriting in Thai-Noi script on palm leaf manuscripts. The combined process estimates the points where characters connect using two methods: (1) foreground detection, in which the foreground skeleton is established first and the path and junction are then detected using the Hough transform and a projection profile histogram; and (2) background detection, in which the background skeleton is extracted and the top and bottom paths are then detected using the Hough transform. The proposed method has been implemented, and experimental results show that the combined method obtained better segmentation of handwritten documents on palm leaves.

    Foundations and design principles of learning-accompanying diagnostics (Grundlagen und Gestaltungsprinzipien lernbegleitender Diagnostik)

    Jürgens E. Grundlagen und Gestaltungsprinzipien lernbegleitender Diagnostik. Fordern und Fördern in der Sek I. Bildungsmanagement. 2020:1-16

    Comparing binarisation techniques for the processing of ancient manuscripts

    Many organizations preserve ancient manuscripts in order to protect these documents and retrieve traditional knowledge. With advances in computer technology, digitized media are now commonly used to record these documents. One objective of such work is to develop an efficient image processing system that can retrieve knowledge and information automatically from these ancient manuscripts. Binarisation is a preprocessing technique used to extract text and characters from the manuscripts. The output is then used for further processes such as character recognition and knowledge extraction. This paper compares different binarisation techniques that could be used for processing ancient manuscripts. The aim is to improve these techniques, with the main objective of developing an automated preprocessing technique for ancient manuscript recognition and knowledge extraction.