28 research outputs found

    Seam Carving for Text Line Extraction on Color and Grayscale Historical Manuscripts

    Full text link

    Handwritten Arabic Documents Segmentation into Text Lines using Seam Carving

    Get PDF
    Inspired from human perception and common text documents characteristics based on readability constraints, an Arabic text line segmentation approach is proposed using seam carving. Taking the gray scale of the image as input data, this technique offers better results at extracting handwritten text lines without the need for the binary representation of the document image. In addition to its fast processing time, its versatility permits to process a multitude of document types, especially documents presenting low text-to-background contrast such as degraded historical manuscripts or complex writing styles like cursive handwriting. Even if our focus in this paper was on Arabic text segmentation, this method is language independent. Tests on a public database of 123 handwritten Arabic documents showed a line detection rate of 97.5% for a matching score of 90%

    Handwriting Recognition of Historical Documents with few labeled data

    Full text link
    Historical documents present many challenges for offline handwriting recognition systems, among them, the segmentation and labeling steps. Carefully annotated textlines are needed to train an HTR system. In some scenarios, transcripts are only available at the paragraph level with no text-line information. In this work, we demonstrate how to train an HTR system with few labeled data. Specifically, we train a deep convolutional recurrent neural network (CRNN) system on only 10% of manually labeled text-line data from a dataset and propose an incremental training procedure that covers the rest of the data. Performance is further increased by augmenting the training set with specially crafted multiscale data. We also propose a model-based normalization scheme which considers the variability in the writing scale at the recognition phase. We apply this approach to the publicly available READ dataset. Our system achieved the second best result during the ICDAR2017 competition

    Segmentation of Arabic Handwritten Documents into Text Lines using Watershed Transform

    Get PDF
    A crucial task in character recognition systems is the segmentation of the document into text lines and especially if it is handwritten. When dealing with non-Latin document such as Arabic, the challenge becomes greater since in addition to the variability of writing, the presence of diacritical points and the high number of ascender and descender characters complicates more the process of the segmentation. To remedy with this complexity and even to make this difficulty an advantage since the focus is on the Arabic language which is semi-cursive in nature, a method based on the Watershed Transform technique is proposed. Tested on «Handwritten Arabic Proximity Datasets» a segmentation rate of 93% for a 95% of matching score is achieved

    Automatic handwriter identification using advanced machine learning

    Get PDF
    Handwriter identification a challenging problem especially for forensic investigation. This topic has received significant attention from the research community and several handwriter identification systems were developed for various applications including forensic science, document analysis and investigation of the historical documents. This work is part of an investigation to develop new tools and methods for Arabic palaeography, which is is the study of handwritten material, particularly ancient manuscripts with missing writers, dates, and/or places. In particular, the main aim of this research project is to investigate and develop new techniques and algorithms for the classification and analysis of ancient handwritten documents to support palaeographic studies. Three contributions were proposed in this research. The first is concerned with the development of a text line extraction algorithm on colour and greyscale historical manuscripts. The idea uses a modified bilateral filtering approach to adaptively smooth the images while still preserving the edges through a nonlinear combination of neighboring image values. The proposed algorithm aims to compute a median and a separating seam and has been validated to deal with both greyscale and colour historical documents using different datasets. The results obtained suggest that our proposed technique yields attractive results when compared against a few similar algorithms. The second contribution proposes to deploy a combination of Oriented Basic Image features and the concept of graphemes codebook in order to improve the recognition performances. The proposed algorithm is capable to effectively extract the most distinguishing handwriter’s patterns. The idea consists of judiciously combining a multiscale feature extraction with the concept of grapheme to allow for the extraction of several discriminating features such as handwriting curvature, direction, wrinkliness and various edge-based features. The technique was validated for identifying handwriters using both Arabic and English writings captured as scanned images using the IAM dataset for English handwriting and ICFHR 2012 dataset for Arabic handwriting. The results obtained clearly demonstrate the effectiveness of the proposed method when compared against some similar techniques. The third contribution is concerned with an offline handwriter identification approach based on the convolutional neural network technology. At the first stage, the Alex-Net architecture was employed to learn image features (handwritten scripts) and the features obtained from the fully connected layers of the model. Then, a Support vector machine classifier is deployed to classify the writing styles of the various handwriters. In this way, the test scripts can be classified by the CNN training model for further classification. The proposed approach was evaluated based on Arabic Historical datasets; Islamic Heritage Project (IHP) and Qatar National Library (QNL). The obtained results demonstrated that the proposed model achieved superior performances when compared to some similar method

    Sparse Gradient Optimization and its Applications in Image Processing

    Get PDF
    Millions of digital images are captured by imaging devices on a daily basis. The way imaging devices operate follows an integral process from which the information of the original scene needs to be estimated. The estimation is done by inverting the integral process of the imaging device with the use of optimization techniques. This linear inverse problem, the inversion of the integral acquisition process, is at the heart of several image processing applications such as denoising, deblurring, inpainting, and super-resolution. We describe in detail the use of linear inverse problems in these applications. We review and compare several state-of-the-art optimization algorithms that invert this integral process. Linear inverse problems are usually very difficult to solve. Therefore, additional prior assumptions need to be introduced to successfully estimate the output signal. Several priors have been suggested in the research literature, with the Total Variation (TV) being one of the most prominent. In this thesis, we review another prior, the l0 pseudo-norm over the gradient domain. This prior allows full control over how many non-zero gradients are retained to approximate prominent structures of the image. We show the superiority of the l0 gradient prior over the TV prior in recovering genuinely piece-wise constant signals. The l0 gradient prior has shown to produce state-of-the-art results in edge-preserving image smoothing. Moreover, this general prior can be applied to several other applications, such as edge extraction, clip-art JPEG artifact removal, non-photorealistic image rendering, detail magnification, and tone mapping. We review and evaluate several state-of-the-art algorithms that solve the optimization problem based on the l0 gradient prior. Subsequently we apply the l0 gradient prior to two applications where we show superior results as compared to the current state-of-the-art. The first application is that of single-image reflection removal. Existing solutions to this problem have shown limited success because of the highly ill-posed nature of the problem. We show that the standard l0 gradient prior with a modified data-fidelity term based on the Laplacian operator is able to sufficiently remove unwanted reflections from images in many realistic scenarios. We conduct extensive experiments and show that our method outperforms the state-of-the-art. In the second application of haze removal from visible-NIR image pairs we propose a novel optimization framework, where the prior term penalizes the number of non-zero gradients of the difference between the output and the NIR image. Due to the longer wavelengths of NIR, an image taken in the NIR spectrum suffers significantly less from haze artifacts. Using this prior term, we are able to transfer details from the haze-free NIR image to the final result. We show that our formulation provides state-of-the-art results compared to haze removal methods that use a single image and also to those that are based on visible-NIR image pairs

    Arabic Handwritten Documents Segmentation into Text-lines and Words using Deep Learning

    Get PDF
    International audienceOne of the most important steps in a handwriting recognition system is text-line and word segmentation. But, this step is made difficult by the differences in handwriting styles, problems of skewness, overlapping and touching of text and the fluctuations of text-lines. It is even more difficult for ancient and calligraphic writings, as in Arabic manuscripts, due to the cursive connection in Arabic text, the erroneous position of diacritic marks, the presence of ascending and descending letters, etc. In this work, we propose an effective segmentation of Arabic handwritten text into text-lines and words, using deep learning. For text-line segmentation, we used an RU-net which allows a pixel-wise classification to separate text-lines pixels from the background ones. For word segmentation, we resorted to the text-line transcription, as we have not got a ground truth at word level. A BLSTM-CTC (Bidirectional Long Short Term Memory followed by a Connectionist Temporal Classification) is then used to perform the mapping between the transcription and text-line image, avoiding the need of the input segmentation. A CNN (Convolutional Neural Network) precedes the BLST-CTC to extract the features and to feed the BLSTM with the essential of the text-line image. Tested on the standard KHATT Arabic database, the experimental results confirm a segmentation success rate of no less than 96.7% for text-lines and 80.1% for words

    Multi-script handwritten character recognition:Using feature descriptors and machine learning

    Get PDF
    corecore