4 research outputs found

    3D geometric and optical modeling of warped document images from scanners

    No full text
    10.1109/CVPR.2005.11Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition1337-342PIVR

    Geometric correction of historical Arabic documents

    Get PDF
    Geometric deformations in historical documents significantly influence the success of both Optical Character Recognition (OCR) techniques and human readability. They may have been introduced at any time during the life cycle of a document, from when it was first printed to the time it was digitised by an imaging device. This Thesis focuses on the challenging domain of geometric correction of Arabic historical documents, where background research has highlighted that existing approaches for geometric correction of Latin-script historical documents are not sensitive to the characteristics of text in Arabic documents and therefore cannot be applied successfully. Text line segmentation and baseline detection algorithms have been investigated to propose a new more suitable one for warped Arabic historical document images. Advanced ideas for performing dewarping and geometric restoration on historical Arabic documents, as dictated by the specific characteristics of the problem have been implemented.In addition to developing an algorithm to detect accurate baselines of historical printed Arabic documents the research also contributes a new dataset consisting of historical Arabic documents with different degrees of warping severity.Overall, a new dewarping system, the first for Historical Arabic documents, has been developed taking into account both global and local features of the text image and the patterns of the smooth distortion between text lines. By using the results of the proposed line segmentation and baseline detection methods, it can cope with a variety of distortions, such as page curl, arbitrary warping and fold
    corecore