1,725 research outputs found

    Deep Unrestricted Document Image Rectification

    Full text link
    In recent years, tremendous efforts have been made on document image rectification, but existing advanced algorithms are limited to processing restricted document images, i.e., the input images must incorporate a complete document. Once the captured image merely involves a local text region, its rectification quality is degraded and unsatisfactory. Our previously proposed DocTr, a transformer-assisted network for document image rectification, also suffers from this limitation. In this work, we present DocTr++, a novel unified framework for document image rectification, without any restrictions on the input distorted images. Our major technical improvements can be concluded in three aspects. Firstly, we upgrade the original architecture by adopting a hierarchical encoder-decoder structure for multi-scale representation extraction and parsing. Secondly, we reformulate the pixel-wise mapping relationship between the unrestricted distorted document images and the distortion-free counterparts. The obtained data is used to train our DocTr++ for unrestricted document image rectification. Thirdly, we contribute a real-world test set and metrics applicable for evaluating the rectification quality. To our best knowledge, this is the first learning-based method for the rectification of unrestricted document images. Extensive experiments are conducted, and the results demonstrate the effectiveness and superiority of our method. We hope our DocTr++ will serve as a strong baseline for generic document image rectification, prompting the further advancement and application of learning-based algorithms. The source code and the proposed dataset are publicly available at https://github.com/fh2019ustc/DocTr-Plus

    Document image restoration - For document images scanned from bound volumes -

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Computer Assisted Relief Generation - a Survey

    Get PDF
    In this paper we present an overview of the achievements accomplished to date in the field of computer aided relief generation. We delineate the problem, classify the different solutions, analyze similarities, investigate the evelopment and review the approaches according to their particular relative strengths and weaknesses. In consequence this survey is likewise addressed to researchers and artists through providing valuable insights into the theory behind the different concepts in this field and augmenting the options available among the methods presented with regard to practical application

    3D MODELLING AND RAPID PROTOTYPING FOR CARDIOVASCULAR SURGICAL PLANNING – TWO CASE STUDIES

    Get PDF
    In the last years, cardiovascular diagnosis, surgical planning and intervention have taken advantages from 3D modelling and rapid prototyping techniques. The starting data for the whole process is represented by medical imagery, in particular, but not exclusively, computed tomography (CT) or multi-slice CT (MCT) and magnetic resonance imaging (MRI). On the medical imagery, regions of interest, i.e. heart chambers, valves, aorta, coronary vessels, etc., are segmented and converted into 3D models, which can be finally converted in physical replicas through 3D printing procedure. In this work, an overview on modern approaches for automatic and semiautomatic segmentation of medical imagery for 3D surface model generation is provided. The issue of accuracy check of surface models is also addressed, together with the critical aspects of converting digital models into physical replicas through 3D printing techniques. A patient-specific 3D modelling and printing procedure (Figure 1), for surgical planning in case of complex heart diseases was developed. The procedure was applied to two case studies, for which MCT scans of the chest are available. In the article, a detailed description on the implemented patient-specific modelling procedure is provided, along with a general discussion on the potentiality and future developments of personalized 3D modelling and printing for surgical planning and surgeons practice

    Effective Geometric Restoration of Distorted Historical Document for Large-Scale Digitization

    Get PDF
    Due to storage conditions and material’s non-planar shape, geometric distortion of the 2-D content is widely present in scanned document images. Effective geometric restoration of these distorted document images considerably increases character recognition rate in large-scale digitisation. For large-scale digitisation of historical books, geometric restoration solutions expect to be accurate, generic, robust, unsupervised and reversible. However, most methods in the literature concentrate on improving restoration accuracy for specific distortion effect, but not their applicability in large-scale digitisation. This paper proposes an effective mesh based geometric restoration system, (GRLSD), for large-scale distorted historical document digitisation. In this system, an automatic mesh generation based dewarping tool is proposed to geometrically model and correct arbitrary warping historical documents. An XML based mesh recorder is proposed to record the mesh of distortion information for reversible use. A graphic user interface toolkit is designed to visually display and manually manipulate the mesh for improving geometric restoration accuracy. Experimental results show that the proposed automatic dewarping approach efficiently corrects arbitrarily warped historical documents, with an improved performance over several state-of-the-art geometric restoration methods. By using XML mesh recorder and GUI toolkit, the GRLSD system greatly aids users to flexibly monitor and correct ambiguous points of mesh for the prevention of damaging historical document images without distortions in large-scale digitalisation

    Geometric correction of historical Arabic documents

    Get PDF
    Geometric deformations in historical documents significantly influence the success of both Optical Character Recognition (OCR) techniques and human readability. They may have been introduced at any time during the life cycle of a document, from when it was first printed to the time it was digitised by an imaging device. This Thesis focuses on the challenging domain of geometric correction of Arabic historical documents, where background research has highlighted that existing approaches for geometric correction of Latin-script historical documents are not sensitive to the characteristics of text in Arabic documents and therefore cannot be applied successfully. Text line segmentation and baseline detection algorithms have been investigated to propose a new more suitable one for warped Arabic historical document images. Advanced ideas for performing dewarping and geometric restoration on historical Arabic documents, as dictated by the specific characteristics of the problem have been implemented.In addition to developing an algorithm to detect accurate baselines of historical printed Arabic documents the research also contributes a new dataset consisting of historical Arabic documents with different degrees of warping severity.Overall, a new dewarping system, the first for Historical Arabic documents, has been developed taking into account both global and local features of the text image and the patterns of the smooth distortion between text lines. By using the results of the proposed line segmentation and baseline detection methods, it can cope with a variety of distortions, such as page curl, arbitrary warping and fold

    Redefining A in RGBA: Towards a Standard for Graphical 3D Printing

    Full text link
    Advances in multimaterial 3D printing have the potential to reproduce various visual appearance attributes of an object in addition to its shape. Since many existing 3D file formats encode color and translucency by RGBA textures mapped to 3D shapes, RGBA information is particularly important for practical applications. In contrast to color (encoded by RGB), which is specified by the object's reflectance, selected viewing conditions and a standard observer, translucency (encoded by A) is neither linked to any measurable physical nor perceptual quantity. Thus, reproducing translucency encoded by A is open for interpretation. In this paper, we propose a rigorous definition for A suitable for use in graphical 3D printing, which is independent of the 3D printing hardware and software, and which links both optical material properties and perceptual uniformity for human observers. By deriving our definition from the absorption and scattering coefficients of virtual homogeneous reference materials with an isotropic phase function, we achieve two important properties. First, a simple adjustment of A is possible, which preserves the translucency appearance if an object is re-scaled for printing. Second, determining the value of A for a real (potentially non-homogeneous) material, can be achieved by minimizing a distance function between light transport measurements of this material and simulated measurements of the reference materials. Such measurements can be conducted by commercial spectrophotometers used in graphic arts. Finally, we conduct visual experiments employing the method of constant stimuli, and derive from them an embedding of A into a nearly perceptually uniform scale of translucency for the reference materials.Comment: 20 pages (incl. appendices), 20 figures. Version with higher quality images: https://cloud-ext.igd.fraunhofer.de/s/pAMH67XjstaNcrF (main article) and https://cloud-ext.igd.fraunhofer.de/s/4rR5bH3FMfNsS5q (appendix). Supplemental material including code: https://cloud-ext.igd.fraunhofer.de/s/9BrZaj5Uh5d0cOU/downloa
    corecore