
    Geometric correction of historical Arabic documents

    Geometric deformations in historical documents significantly influence the success of both Optical Character Recognition (OCR) techniques and human readability. They may have been introduced at any time during the life cycle of a document, from when it was first printed to the time it was digitised by an imaging device. This thesis focuses on the challenging domain of geometric correction of Arabic historical documents, where background research has highlighted that existing approaches for geometric correction of Latin-script historical documents are not sensitive to the characteristics of text in Arabic documents and therefore cannot be applied successfully. Text line segmentation and baseline detection algorithms have been investigated in order to propose a new, more suitable one for warped Arabic historical document images. Advanced ideas for performing dewarping and geometric restoration on historical Arabic documents, as dictated by the specific characteristics of the problem, have been implemented. In addition to developing an algorithm to detect accurate baselines of historical printed Arabic documents, the research also contributes a new dataset consisting of historical Arabic documents with different degrees of warping severity. Overall, a new dewarping system, the first for historical Arabic documents, has been developed, taking into account both global and local features of the text image and the patterns of the smooth distortion between text lines. By using the results of the proposed line segmentation and baseline detection methods, it can cope with a variety of distortions, such as page curl, arbitrary warping and fold

    The rectification and recognition of document images with perspective and geometric distortions

    Ph.D. (Doctor of Philosophy)

    Adaptive Methods for Robust Document Image Understanding

    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding represent a stringent necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we shift our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, putting special focus on generality, computational efficiency and the exploitation of all available sources of information. More specifically, we introduce the following original methods: a fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot-metal typeset prints, a theoretically optimal solution for the document binarization problem from both the computational-complexity and the threshold-selection points of view, a layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.

    Final Closeout Report University Research Program in Robotics for Environmental Restoration and Waste Management


    On The Corporeal Exchange: Thai Boxing's Sacrificial Movement

    This dissertation is an ethnographic study of Thai boxing (muay Thai) understood as sacrificial exchange, exploring the practice of this martial art in the context of contemporary Thai society. Drawing on two years of apprenticeship and participation research in Northeast Thailand and Bangkok, I consider the fighters’ integration in broader patterns of seasonal labor migration as they move between rural, regional tournaments and Bangkok stadiums. Focusing on the training of one particular boxer, I investigate interactions between trainers, managers, family, patrons and ancestral spirits. The boxers’ embodied actions as they unfold in time represent the sovereign relationship between living and dead, nature and culture, performatively establishing the boundaries between growth and decay. As the living move through a world of animate social relations, accruing debt, the boxer’s embodied patterns of repetition and exhaustion in training, and of destructive action in combat, create a possibility for shifting this balance, accruing merit for those otherwise occupied in handling materials which support the powerful, and transforming the established hierarchical order of everyday life. Against the background of the impermanent, closed, linear, cyclical or progressive temporalities of monasteries, factories, the military and the monarchy, the temporality of the ring remains open, giving fighters the elbow-room to performatively engage crucial symbols of life and death, male and female, human and animal, affording otherwise politically disempowered Northeastern Thai families the opportunity to create meaning and possibility in their lives. Acting as both victim and executioner, fighters accrue credit for the assembled audience, reinvesting each tier of the community with a degree of responsibility for life. 
    I argue that these practices occur within a ‘deathworld’, in which the heightened attentiveness to the limited possibilities for action reaffirms the local position of the individual within the collective. With embodied motion that cuts across local categories of stillness and mobility, the living and the dead, with ever-greater stamina, Thai boxers become increasingly valuable and credit-able, paying the debts, material and spiritual, that their assembled supporters have incurred as they live their kinetically excessive lives, allowing men throughout the community to remain accountable to Kings, Buddha, ancestors, factories and patrons.
    Doctor of Philosophy