975 research outputs found

    Innovative Techniques for Digitizing and Restoring Deteriorated Historical Documents

    Get PDF
    Recent large-scale document digitization initiatives have created new modes of access to modern library collections with the development of new hardware and software technologies. Most commonly, these digitization projects focus on accurately scanning bound texts, some reaching an efficiency of more than one million volumes per year. While vast digital collections are changing the way users access texts, current scanning paradigms can not handle many non-standard materials. Documentation forms such as manuscripts, scrolls, codices, deteriorated film, epigraphy, and rock art all hold a wealth of human knowledge in physical forms not accessible by standard book scanning technologies. This great omission motivates the development of new technology, presented by this thesis, that is not-only effective with deteriorated bound works, damaged manuscripts, and disintegrating photonegatives but also easily utilized by non-technical staff. First, a novel point light source calibration technique is presented that can be performed by library staff. Then, a photometric correction technique which uses known illumination and surface properties to remove shading distortions in deteriorated document images can be automatically applied. To complete the restoration process, a geometric correction is applied. Also unique to this work is the development of an image-based uncalibrated document scanner that utilizes the transmissivity of document substrates. This scanner extracts intrinsic document color information from one or both sides of a document. Simultaneously, the document shape is estimated to obtain distortion information. Lastly, this thesis provides a restoration framework for damaged photographic negatives that corrects photometric and geometric distortions. Current restoration techniques for the discussed form of negatives require physical manipulation to the photograph. The novel acquisition and restoration system presented here provides the first known solution to digitize and restore deteriorated photographic negatives without damaging the original negative in any way. This thesis work develops new methods of document scanning and restoration suitable for wide-scale deployment. By creating easy to access technologies, library staff can implement their own scanning initiatives and large-scale scanning projects can expand their current document-sets

    Indexing of Historical Document Images: Ad Hoc Dewarping Technique for Handwritten Text

    Get PDF
    This work presents a research project, named XDOCS, aimed at extending to a much wider audience the possibility to access a variety of historical documents published on the web. The paper presents an overview of the indexing process that will be used to achieve the goal, focusing on the adopted dewarping technique. The proposed dewarping approach performs its task with the help of a transformation model which maps the projection of a curved surface to a 2D rectangular area. The novelty introduced with this work regards the possibility of applying dewarping to document images which contain both handwritten and typewritten text

    A unified framework for document image restoration

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Advances in virtual archaeology. Research, preservation, and dissemination

    Get PDF
    The archaeological and palaeontological record (including human skeletal remains) often bears crack, damage and deformations. The recent rapid development of the diagnostic potentials of “virtual archaeology” has provided innovative tools to manage, study and preserve cultural and natural heritage. These tools include, among others, CT-scans, Laser-scanning, photogrammetry, 3D imaging and rapid prototyping. This approach can contribute to any archaeological context from its discovery to research, preservation, and dissemination. 3D imaging techniques, for instance, substitute physical intervention with a virtual protocol aimed at restoring the original shape of an archaeological item or a fossil specimen. In a similar way, the recovery of digital morphological information can be gathered using data preserved even on a deficient finding through the use of 3D comparative samples. Here we present an extended and updated review about the most innovative protocols applied in virtual archaeology and palaeontology

    Geometric correction of historical Arabic documents

    Get PDF
    Geometric deformations in historical documents significantly influence the success of both Optical Character Recognition (OCR) techniques and human readability. They may have been introduced at any time during the life cycle of a document, from when it was first printed to the time it was digitised by an imaging device. This Thesis focuses on the challenging domain of geometric correction of Arabic historical documents, where background research has highlighted that existing approaches for geometric correction of Latin-script historical documents are not sensitive to the characteristics of text in Arabic documents and therefore cannot be applied successfully. Text line segmentation and baseline detection algorithms have been investigated to propose a new more suitable one for warped Arabic historical document images. Advanced ideas for performing dewarping and geometric restoration on historical Arabic documents, as dictated by the specific characteristics of the problem have been implemented.In addition to developing an algorithm to detect accurate baselines of historical printed Arabic documents the research also contributes a new dataset consisting of historical Arabic documents with different degrees of warping severity.Overall, a new dewarping system, the first for Historical Arabic documents, has been developed taking into account both global and local features of the text image and the patterns of the smooth distortion between text lines. By using the results of the proposed line segmentation and baseline detection methods, it can cope with a variety of distortions, such as page curl, arbitrary warping and fold

    XDOCS: An Application to Index Historical Documents

    Get PDF
    Dematerialization and digitalization of historical documents are key elements for their availability, preservation and diffusion. Unfortunately, the conversion from handwritten to digitalized documents presents several technical challenges. The XDOCS project is created with the main goal of making available and extending the usability of historical documents for a great variety of audience, like scholars, institutions and libraries. In this paper the core elements of XDOCS, i.e. page dewarping and word spotting technique, are described and two new applications, i.e. annotation/indexing and search tool, are presented
    corecore