44 research outputs found

    Effective Geometric Restoration of Distorted Historical Document for Large-Scale Digitization

    Due to storage conditions and the non-planar shape of the material, geometric distortion of the 2-D content is widely present in scanned document images. Effective geometric restoration of these distorted document images considerably increases the character recognition rate in large-scale digitisation. For large-scale digitisation of historical books, geometric restoration solutions are expected to be accurate, generic, robust, unsupervised and reversible. However, most methods in the literature concentrate on improving restoration accuracy for a specific distortion effect, not on their applicability to large-scale digitisation. This paper proposes an effective mesh-based geometric restoration system (GRLSD) for large-scale digitisation of distorted historical documents. In this system, a dewarping tool based on automatic mesh generation is proposed to geometrically model and correct arbitrarily warped historical documents. An XML-based mesh recorder is proposed to record the mesh of distortion information for reversible use. A graphical user interface toolkit is designed to visually display and manually manipulate the mesh to improve geometric restoration accuracy. Experimental results show that the proposed automatic dewarping approach efficiently corrects arbitrarily warped historical documents, with improved performance over several state-of-the-art geometric restoration methods. With the XML mesh recorder and GUI toolkit, the GRLSD system helps users flexibly monitor and correct ambiguous mesh points, so that historical document images without distortions are not damaged in large-scale digitisation.

    Geometric correction of historical Arabic documents

    Geometric deformations in historical documents significantly influence the success of both Optical Character Recognition (OCR) techniques and human readability. They may have been introduced at any time during the life cycle of a document, from when it was first printed to the time it was digitised by an imaging device. This thesis focuses on the challenging domain of geometric correction of Arabic historical documents, where background research has highlighted that existing approaches for geometric correction of Latin-script historical documents are not sensitive to the characteristics of text in Arabic documents and therefore cannot be applied successfully. Text line segmentation and baseline detection algorithms have been investigated in order to propose a new, more suitable one for warped Arabic historical document images. Advanced ideas for performing dewarping and geometric restoration on historical Arabic documents, as dictated by the specific characteristics of the problem, have been implemented. In addition to developing an algorithm to detect accurate baselines of historical printed Arabic documents, the research also contributes a new dataset consisting of historical Arabic documents with different degrees of warping severity. Overall, a new dewarping system, the first for historical Arabic documents, has been developed, taking into account both global and local features of the text image and the patterns of the smooth distortion between text lines. By using the results of the proposed line segmentation and baseline detection methods, it can cope with a variety of distortions, such as page curl, arbitrary warping and folds.

    Digital Restoration of Damaged Historical Parchment

    In this thesis we describe the development of a pipeline for digitally restoring damaged historical parchment. The work was carried out in collaboration with London Metropolitan Archives (LMA), who are in possession of an extremely valuable 17th century document called The Great Parchment Book. This book served as the focus of our project and throughout this thesis we demonstrate our methods on its folios. Our aim was to expose the content of the book in a legible form so that it can be properly catalogued and studied. Our approach begins by acquiring an accurate digitisation of the pages. We have developed our own 3D reconstruction pipeline, detailed in Chapter 5, in which each parchment is imaged using a hand-held digital-SLR camera, and the resulting image set is used to generate a high-resolution textured 3D reconstruction of each parchment. Investigation into methods for flattening the parchments demonstrated an analogy with surface parametrization. Flattening the entire parchment globally with various existing parametrization algorithms is problematic, as discussed in Chapters 4, 6, and 7, since this approach is blind to the distortion undergone by the parchment. We propose two complementary approaches to deal with this issue. Firstly, exploiting the fact that a reader will only ever inspect a small area of the folio at a given time, we proposed a method for performing local undistortion of the parchments inside an interactive viewer application. The application, described in Chapter 6, allows a user to browse a parchment folio as the application un-distorts in real-time the area of the parchment currently under inspection. It also allows the user to refer back to the original image set of the parchment to help with resolving ambiguities in the reconstruction and to deal with issues of provenance. Secondly, we proposed a method for estimating the actual deformation undergone by each parchment when it was damaged by using cues in the text.
Since the text was originally written in straight lines and in a roughly uniform script size, we can detect the variation in text orientation and size and use this information to estimate the deformation. In Chapter 7 we then show how this deformation can be inverted by posing the problem as a Poisson mesh deformation, and solving it in a way that guarantees local injectivity, to generate a globally flattened and undistorted image of each folio. We also show how these images can optionally be colour corrected to remove the shading cues baked into the reconstruction texture, and the discolourations in the parchment itself, to further improve legibility and give a more complete impression that the parchment has been restored. The methods we have developed have been very well received by London Metropolitan Archives, as well as the larger archival community. We have used the methods to digitise the entire Great Parchment Book, and have demonstrated our global flattening method on eight folios. As of the time of writing of this thesis, our methods are being used to virtually restore all of the remaining folios of the Great Parchment Book. Staff at LMA are also investigating potential future directions by experimenting with other interesting documents in their collections, and are exploring the possibility of setting up a service which would give access to our methods to other archival institutions with similarly damaged documents.

    Automatic reconstruction from serial sections

    In many experiments in biological and medical research, serial sectioning of biological material is the only way to reveal the three-dimensional (3D) structure and function. For a number of reasons other 3D imaging techniques, such as CT, MRI, and confocal microscopy, are not always adequate because they cannot provide the necessary resolution or contrast, or because the specimen is too large, or because the staining techniques require sectioning. Therefore, for the foreseeable future, reconstruction from serial sections will remain the only method for 3D investigations in many biomedical fields. Reconstruction is a difficult problem due to the loss of 3D alignment as the sections are cut and, more seriously, the systematic and random distortion caused by the sectioning and preparation processes. Many authors have reported how serial sections can be registered by means of fiducial markers or otherwise, but there have been only a few studies of automated correction of the sectioning distortions. In this thesis solutions to the registration problem are reviewed and discussed, and a solution to the warping problem, based on image processing techniques and the finite element method (FEM), is presented. The aim of this project was to develop a fully automatic method of reconstruction in order to provide a 3D atlas of mouse development as part of a gene expression database. For this purpose it is not necessary to warp the object so that it is identical to the original object, but to correct local distortions in the sections in order to produce a smooth representative mouse embryo. Furthermore, the use of fiducial markers was not possible because the reconstructions were from already sectioned material. In this thesis we demonstrate a new method for warping serial sections. The sections are warped by applying forces to each section, where each section is modelled as a thin elastic plate.
The deformation forces are determined from correspondences between sections, which are calculated by combining match strengths and positional information. The equilibrium state, which represents the reconstructed 3D image, is calculated using the finite element method. Results of the application of these methods to paraffin wax and resin embedded sections of the mouse embryo are presented.

    Human-Centric Machine Vision

    Recently, algorithms for processing visual information have greatly evolved, providing efficient and effective solutions to cope with the variability and complexity of real-world environments. These achievements have led to the development of Machine Vision systems that go beyond typical industrial applications, where the environments are controlled and the tasks are very specific, towards innovative solutions that address the everyday needs of people. Human-Centric Machine Vision can help to solve the problems raised by the needs of our society, e.g. security and safety, health care, medical imaging, and human-machine interfaces. In such applications it is necessary to handle changing, unpredictable and complex situations, and to take account of the presence of humans.

    Field-based measurement of hydrodynamics associated with engineered in-channel structures: the example of fish pass assessment

    The construction of fish passes has been a longstanding measure to improve river ecosystem status by ensuring the passability of weirs, dams and other in-channel structures for migratory fish. Many fish passes have a low biological effectiveness because unsuitable hydrodynamic conditions prevent fish from rapidly detecting the pass entrance. There has been a need for techniques to quantify the hydrodynamics surrounding fish pass entrances in order to identify those passes that require enhancement and to improve the design of new passes. This PhD thesis presents the development of a methodology for the rapid, spatially continuous quantification of near-pass hydrodynamics in the field. The methodology involves moving-vessel Acoustic Doppler Current Profiler (ADCP) measurements in order to quantify the 3-dimensional water velocity distribution around fish pass entrances. The approach presented in this thesis is novel because it integrates a set of techniques to make ADCP data robust against errors associated with the environmental conditions near engineered in-channel structures. These techniques provide solutions to (i) ADCP compass errors from magnetic interference, (ii) bias in water velocity data caused by spatial flow heterogeneity, (iii) accurate ADCP positioning in locales with constrained line of sight to navigation satellites, and (iv) accurate and cost-effective sensor deployment following pre-defined sampling strategies. The effectiveness and transferability of the methodology were evaluated at three fish pass sites covering conditions of low, medium and high discharge. The methodology outputs enabled a detailed quantitative characterisation of the fish pass attraction flow and its interaction with other hydrodynamic features. The outputs are suitable for formulating novel indicators of hydrodynamic fish pass attractiveness, and they revealed the need to refine traditional fish pass design guidelines.

    Digital Methods in the Humanities: Challenges, Ideas, Perspectives

    Digital Humanities is a transformational endeavor that changes not only the perception, storage, and interpretation of information but also research processes and questions. It also prompts new ways of interdisciplinary communication between humanities scholars and computer scientists. This volume offers a unique perspective on digital methods for and in the humanities. It comprises case studies from various fields to illustrate the challenge of matching existing textual research practices and digital tools. Problems and solutions with and for training tools as well as the adjustment of research practices are presented and discussed with an interdisciplinary focus.

    Digital Methods in the Humanities

    Digital Humanities is a transformational endeavor that changes not only the perception, storage, and interpretation of information but also research processes and questions. It also prompts new ways of interdisciplinary communication between humanities scholars and computer scientists. This volume offers a unique perspective on digital methods for and in the humanities. It comprises case studies from various fields to illustrate the challenge of matching existing textual research practices and digital tools. Problems and solutions with and for training tools as well as the adjustment of research practices are presented and discussed with an interdisciplinary focus.