8 research outputs found

    A Novel Approach to remove Ink Bleed through Degraded Document Images

    There are numerous recorded documents that suffer from degradation: paper ageing, background variation caused by noise, uneven illumination or dark spots, and loss of textual information, for example fading of text or chipping of ink caused by light exposure; degradation of the writing medium, i.e. obscured or missing text caused by the presence of mould, parasites, dampness or weakness in the medium; and ink that has seeped through from the back of a page to the front, causing bleed-through interference. Digitising a document can itself introduce noise artefacts that further degrade the printed information. Many degraded yet historically important old manuscripts and records are scattered across libraries and archives around the world. With the passage of time, the ink of the reverse side starts to interfere with the ink of the front side, which hampers the legibility of the documents. Given the importance of such documents, it is essential to restore them. In this paper, several algorithms are used in the pre-processing steps, namely the Bernsen algorithm, the Improved Bernsen algorithm and the Canny edge detection method, to initialise the results. Post-processing steps are then proposed at the end so that the algorithm concludes with improved and more efficient results.
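    The pre-processing steps named above (Bernsen thresholding and Canny edge detection) can be illustrated with a short sketch; the window size, contrast limit and the treatment of low-contrast pixels below are illustrative assumptions rather than the paper's Improved Bernsen settings, and the input file name is hypothetical.

```python
# Sketch of Bernsen local thresholding plus a Canny edge map, the kind of
# pre-processing the abstract describes; parameter values are illustrative.
import numpy as np
import cv2
from scipy.ndimage import maximum_filter, minimum_filter

def bernsen_threshold(gray, window=31, contrast_limit=15):
    """Binarise a grayscale page image with Bernsen's local method."""
    g = gray.astype(np.float32)
    local_max = maximum_filter(g, size=window)
    local_min = minimum_filter(g, size=window)
    local_contrast = local_max - local_min
    midpoint = (local_max + local_min) / 2.0
    # High-contrast windows: compare the pixel with the local midpoint.
    # Low-contrast windows: assume ink-free paper (background).
    ink = np.where(local_contrast >= contrast_limit, g <= midpoint, False)
    return ink.astype(np.uint8) * 255

gray = cv2.imread("degraded_page.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
ink_mask = bernsen_threshold(gray)
edges = cv2.Canny(gray, 50, 150)  # edge map that can guide stroke refinement
```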

    Seam Carving for Text Line Extraction on Color and Grayscale Historical Manuscripts


    A Path Planning for Line Segmentation of Handwritten Documents


    Automatic handwriter identification using advanced machine learning

    Handwriter identification is a challenging problem, especially for forensic investigation. The topic has received significant attention from the research community, and several handwriter identification systems have been developed for various applications including forensic science, document analysis and the investigation of historical documents. This work is part of an investigation to develop new tools and methods for Arabic palaeography, which is the study of handwritten material, particularly ancient manuscripts with missing writers, dates, and/or places. In particular, the main aim of this research project is to investigate and develop new techniques and algorithms for the classification and analysis of ancient handwritten documents to support palaeographic studies. Three contributions are proposed in this research. The first concerns the development of a text line extraction algorithm for colour and greyscale historical manuscripts. The idea uses a modified bilateral filtering approach to adaptively smooth the images while still preserving the edges through a nonlinear combination of neighbouring image values. The proposed algorithm computes a medial and a separating seam and has been validated on both greyscale and colour historical documents using different datasets. The results obtained suggest that the proposed technique yields attractive results when compared against several similar algorithms. The second contribution deploys a combination of Oriented Basic Image Features and a grapheme codebook in order to improve recognition performance. The proposed algorithm is capable of effectively extracting the most distinguishing patterns of a handwriter. The idea consists of judiciously combining multiscale feature extraction with the concept of graphemes to allow the extraction of several discriminating features such as handwriting curvature, direction, wrinkliness and various edge-based features. The technique was validated for identifying handwriters using both Arabic and English writings captured as scanned images, using the IAM dataset for English handwriting and the ICFHR 2012 dataset for Arabic handwriting. The results obtained clearly demonstrate the effectiveness of the proposed method when compared against similar techniques. The third contribution concerns an offline handwriter identification approach based on convolutional neural networks. In the first stage, the AlexNet architecture is employed to learn image features from the handwritten scripts, with the features taken from the fully connected layers of the model. A support vector machine classifier is then deployed to classify the writing styles of the various handwriters, so that test scripts can be passed through the trained CNN and classified. The proposed approach was evaluated on Arabic historical datasets from the Islamic Heritage Project (IHP) and the Qatar National Library (QNL). The results obtained demonstrate that the proposed model achieves superior performance when compared to similar methods.
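    The third contribution (fully-connected-layer features from AlexNet, classified with an SVM) can be sketched as below, assuming an ImageNet-pretrained AlexNet from torchvision and the usual preprocessing; dataset handling and any fine-tuning are left out, and the variable names for scripts and writer labels are hypothetical placeholders, not names from the thesis.

```python
# Sketch of the AlexNet-feature + SVM stage described in the abstract,
# under the stated assumptions; not the thesis' exact configuration.
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.svm import SVC

alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
# Drop the final 1000-way layer so the network emits 4096-d fc features.
alexnet.classifier = torch.nn.Sequential(*list(alexnet.classifier.children())[:-1])
alexnet.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def script_features(pil_images):
    """Map a list of script images (PIL) to fully-connected-layer features."""
    batch = torch.stack([preprocess(img) for img in pil_images])
    return alexnet(batch).numpy()

# Hypothetical usage: train_scripts/test_scripts are lists of PIL images,
# train_writers the corresponding writer labels.
# clf = SVC(kernel="linear").fit(script_features(train_scripts), train_writers)
# predicted_writers = clf.predict(script_features(test_scripts))
```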

    Neural text line extraction in historical documents: a two-stage clustering approach

    Accessibility of the valuable cultural heritage hidden in countless scanned historical documents is the motivation for this dissertation. The developed, fully automatic text line extraction methodology combines state-of-the-art machine learning techniques with modern image processing methods. It demonstrates its quality by outperforming several other approaches on a number of benchmarking datasets. The method is already being used by a wide audience of researchers from different disciplines and thus contributes its (small) part to the aforementioned goal.
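    As a generic illustration of grouping page content into text lines by clustering (not the dissertation's two-stage method), the foreground pixels of a binarised page can be clustered with an off-the-shelf algorithm; the vertical scaling factor below encodes the assumption that lines run roughly horizontally, and all parameter values are illustrative.

```python
# Generic clustering-based line grouping; a rough illustration only.
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_text_lines(binary_page, y_weight=5.0, eps=15.0, min_samples=20):
    """Assign a line label to every foreground pixel of a binarised page."""
    ys, xs = np.nonzero(binary_page)
    # Stretch the vertical axis so pixels from different lines stay far apart.
    points = np.column_stack([xs, ys * y_weight]).astype(np.float32)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    return ys, xs, labels  # label -1 marks pixels left unassigned as noise
```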

    Sparse Gradient Optimization and its Applications in Image Processing

    Millions of digital images are captured by imaging devices on a daily basis. The way imaging devices operate follows an integral process from which the information of the original scene needs to be estimated. The estimation is done by inverting the integral process of the imaging device with the use of optimization techniques. This linear inverse problem, the inversion of the integral acquisition process, is at the heart of several image processing applications such as denoising, deblurring, inpainting, and super-resolution. We describe in detail the use of linear inverse problems in these applications. We review and compare several state-of-the-art optimization algorithms that invert this integral process. Linear inverse problems are usually very difficult to solve. Therefore, additional prior assumptions need to be introduced to successfully estimate the output signal. Several priors have been suggested in the research literature, with Total Variation (TV) being one of the most prominent. In this thesis, we review another prior, the l0 pseudo-norm over the gradient domain. This prior allows full control over how many non-zero gradients are retained to approximate prominent structures of the image. We show the superiority of the l0 gradient prior over the TV prior in recovering genuinely piecewise-constant signals. The l0 gradient prior has been shown to produce state-of-the-art results in edge-preserving image smoothing. Moreover, this general prior can be applied to several other applications, such as edge extraction, clip-art JPEG artifact removal, non-photorealistic image rendering, detail magnification, and tone mapping. We review and evaluate several state-of-the-art algorithms that solve the optimization problem based on the l0 gradient prior. Subsequently, we apply the l0 gradient prior to two applications where we show superior results compared to the current state-of-the-art. The first application is single-image reflection removal. Existing solutions to this problem have shown limited success because of its highly ill-posed nature. We show that the standard l0 gradient prior with a modified data-fidelity term based on the Laplacian operator is able to sufficiently remove unwanted reflections from images in many realistic scenarios. We conduct extensive experiments and show that our method outperforms the state-of-the-art. In the second application, haze removal from visible-NIR image pairs, we propose a novel optimization framework where the prior term penalizes the number of non-zero gradients of the difference between the output and the NIR image. Due to the longer wavelengths of NIR, an image taken in the NIR spectrum suffers significantly less from haze artifacts. Using this prior term, we are able to transfer details from the haze-free NIR image to the final result. We show that our formulation provides state-of-the-art results compared to haze removal methods that use a single image and also to those based on visible-NIR image pairs.
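    The l0 gradient prior discussed above is commonly minimised with an alternating half-quadratic splitting scheme; the 1D sketch below follows that general recipe for min_s ||s - f||^2 + lam * ||grad s||_0, with circular boundaries and illustrative parameter values that are assumptions rather than choices taken from the thesis.

```python
# 1D sketch of l0 gradient minimisation via half-quadratic splitting;
# parameters and boundary handling are illustrative assumptions.
import numpy as np

def l0_gradient_smooth_1d(f, lam=0.02, beta_max=1e5, kappa=2.0):
    """Approximately solve min_s ||s - f||^2 + lam * ||grad s||_0 (1D, circular)."""
    f = np.asarray(f, dtype=np.float64)
    n = f.size
    # Fourier transform of the circular forward-difference kernel.
    d = np.zeros(n)
    d[0], d[-1] = -1.0, 1.0              # (d * s)[i] = s[i + 1] - s[i]
    D = np.fft.fft(d)
    F = np.fft.fft(f)
    s = f.copy()
    beta = 2.0 * lam
    while beta < beta_max:
        grad = np.roll(s, -1) - s
        # h-subproblem: keep a gradient only if it is worth its lam-cost.
        h = np.where(grad ** 2 >= lam / beta, grad, 0.0)
        # s-subproblem: closed-form solution in the Fourier domain.
        S = (F + beta * np.conj(D) * np.fft.fft(h)) / (1.0 + beta * np.abs(D) ** 2)
        s = np.real(np.fft.ifft(S))
        beta *= kappa
    return s

# Hypothetical usage on a noisy piecewise-constant signal:
# f = np.repeat([0.0, 1.0, 0.3], 200) + 0.05 * np.random.randn(600)
# recovered = l0_gradient_smooth_1d(f, lam=0.05)
```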