
    Design and Implementation Recognition System for Handwritten Hindi/Marathi Document

    In the present scenario, much importance is given to the “paperless office”, whereby more and more communication and storage of documents is performed digitally. Documents and files in Hindi and Marathi that were once stored physically on paper are now being converted into electronic form in order to facilitate quicker additions, searches, and modifications, as well as to prolong the life of such records. Because of this, there is great demand for software that automatically extracts, analyzes, recognizes, and stores information from physical documents for later retrieval. Skew detection is used for determining text line positions in digitized documents, for automated page orientation and skew angle detection in binary document images, for skew detection in handwritten scripts, for skew compensation in Internet audio applications, and for the correction of scanned documents.

    Logical segmentation for article extraction in digitized old newspapers

    Newspapers are documents made of news items and informative articles. They are not meant to be read linearly: the reader can pick items in any order he fancies. Ignoring this structural property, most digitized newspaper archives only offer access to their content by issue or, at best, by page. We have built a digitization workflow that automatically extracts newspaper articles from images, which allows indexing and retrieval of information at the article level. Our back-end system extracts the logical structure of the page to produce the informative units: the articles. Each image is labelled at the pixel level through a machine-learning-based method; the page's logical structure is then built up from there by detecting structuring entities such as horizontal and vertical separators, titles, and text lines. This logical structure is stored in a METS wrapper associated with the ALTO file produced by the system, which includes the OCRed text. Our front-end system provides high-definition web visualisation of the images, textual indexing and retrieval facilities, and searching and reading at the article level. Article transcriptions can be collaboratively corrected, which in turn allows for better indexing. We are currently testing our system on the archives of the Journal de Rouen, one of France's oldest local newspapers. Its 250 years of publication amount to 300,000 pages of highly variable image quality and layout complexity. The test year 1808 can be consulted at plair.univ-rouen.fr. (Comment: ACM Document Engineering, France, 2012.)
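
    As a rough illustration of the structuring-entity step described above, the sketch below flags candidate horizontal and vertical separator lines from the projection profiles of a binarized page. The function name and fill threshold are hypothetical; this is a generic heuristic, not the authors' pixel-labelling pipeline.

    import numpy as np

    def find_separators(binary, min_fill=0.8):
        # binary: 2-D array with 1 = ink, 0 = background, for one page image.
        # min_fill: fraction of a row/column that must be ink to count as a rule.
        h, w = binary.shape
        row_fill = binary.sum(axis=1) / float(w)       # horizontal projection profile
        col_fill = binary.sum(axis=0) / float(h)       # vertical projection profile
        sep_rows = np.where(row_fill >= min_fill)[0]   # long horizontal separators
        sep_cols = np.where(col_fill >= min_fill)[0]   # long vertical separators
        return sep_rows, sep_cols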

    Kannada Character Recognition System: A Review

    Intensive research has been done on optical character recognition (OCR), and a large number of articles have been published on this topic during the last few decades. Many commercial OCR systems are now available on the market, but most of these systems work for Roman, Chinese, Japanese, and Arabic characters. There is no sufficient body of work on Indian-language character recognition, especially for the Kannada script, one of the 12 major scripts of India. This paper presents a review of existing work on printed Kannada script and its results. The characteristics of the Kannada script and the Kannada Character Recognition System (KCR) are discussed in detail. Finally, fusion at the classifier level is proposed to increase the recognition accuracy. (Comment: 12 pages, 8 figures.)
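
    The review proposes fusion at the classifier level without fixing a scheme; the sketch below shows one common choice, a simple majority vote over the labels returned by several base classifiers. The labels and the number of classifiers are illustrative assumptions.

    from collections import Counter

    def fuse_by_majority(predicted_labels):
        # predicted_labels: one label per base classifier for the same character
        # image. Ties resolve to whichever most-common label Counter lists first.
        counts = Counter(predicted_labels)
        label, _ = counts.most_common(1)[0]
        return label

    print(fuse_by_majority(['ka', 'ka', 'kha']))   # -> 'ka'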

    The SOLAIRE Project: A Gaze-Contingent System to Facilitate Reading for Patients with Scotomatas

    Reading is a major issue for visually impaired patients suffering from a blind area in the fovea. Current systems to facilitate reading do not really benefit from recent advances in computer science, such as computer vision and augmented reality. In the SOLAIRE project (Système d'Optimisation de la Lecture par Asservissement de l'Image au Regard), we develop an augmented reality system to help patients read more easily, resulting from a strong interaction between ophthalmologists and researchers in visual neuroscience and computer science. The main idea of this project is to control the display of the text being read with the gaze, taking into account the specific characteristics of the scotoma for each individual. This report describes the system.

    Adaptive Methods for Robust Document Image Understanding

    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding represent a stringent necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we shift our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation, and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions, and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, putting special focus on generality, computational efficiency, and the exploitation of all available sources of information. More specifically, we introduce the following original methods: a fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot-metal typeset prints, a theoretically optimal solution for the document binarization problem from both a computational-complexity and a threshold-selection point of view, a layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front-page detection algorithm, and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.
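
    The thesis's own binarization method is not spelled out in this abstract; as a baseline for the threshold-selection stage it mentions, here is a minimal sketch of the classic Otsu criterion, which picks the global threshold that maximizes the between-class variance of the grey-level histogram.

    import numpy as np

    def otsu_threshold(gray):
        # gray: 2-D uint8 array. Standard Otsu baseline, not the thesis's method.
        hist = np.bincount(gray.ravel(), minlength=256).astype(float)
        prob = hist / hist.sum()
        best_t, best_var = 0, -1.0
        for t in range(1, 256):
            w0, w1 = prob[:t].sum(), prob[t:].sum()
            if w0 == 0 or w1 == 0:
                continue
            mu0 = (np.arange(t) * prob[:t]).sum() / w0          # mean below threshold
            mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1     # mean above threshold
            var_between = w0 * w1 * (mu0 - mu1) ** 2
            if var_between > best_var:
                best_t, best_var = t, var_between
        return best_t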

    Algorithms for document image skew estimation

    A new projection-profile-based skew estimation algorithm was developed. This algorithm extracts fiducial points representing character elements by decoding a JBIG-compressed image without reconstructing the original image. These points are projected along parallel lines into an accumulator array to determine the maximum alignment and the corresponding skew angle. Methods for characterizing the performance of skew estimation techniques were also investigated. In addition to the new skew estimator, three projection-based algorithms were implemented and tested using 1,246 single-column text zones extracted from a sample of 460 page images. Linear regression analyses of the experimental results indicate that our new skew estimation algorithm performs competitively with the other three techniques. These analyses also show that estimators using connected components as a fiducial representation perform worse than the others on the entire set of text zones. It is shown that all of the algorithms are sensitive to typographical features. The number of text lines in a zone significantly affects the accuracy of the connected-component-based methods. We also developed two aggregate measures of skew for entire pages. Experiments performed on the 460 unconstrained pages indicate the need to filter non-text features from consideration. Graphic and noise elements from page images contribute a significant amount of the error for the JBIG algorithm.
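
    A minimal sketch of the generic projection-profile idea behind these estimators: rotate the binarized text zone over a range of candidate angles and keep the angle whose horizontal profile has the highest variance (aligned text lines give the sharpest peaks). The search range and step are assumptions, and this does not reproduce the paper's JBIG-domain fiducial-point algorithm.

    import numpy as np
    from scipy.ndimage import rotate

    def estimate_skew(binary, max_angle=5.0, step=0.1):
        # binary: 2-D array with 1 = ink. Brute-force search over candidate angles.
        best_angle, best_score = 0.0, -1.0
        for angle in np.arange(-max_angle, max_angle + step, step):
            rotated = rotate(binary, angle, reshape=False, order=0)
            score = rotated.sum(axis=1).var()   # variance of the horizontal profile
            if score > best_score:
                best_angle, best_score = angle, score
        return best_angle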

    Information Preserving Processing of Noisy Handwritten Document Images

    Many pre-processing techniques that normalize artifacts and clean noise induce anomalies due to discretization of the document image. Important information that could be used at later stages may be lost. A proposed composite-model framework takes into account pre-printed information, user-added data, and digitization characteristics. Its benefits are demonstrated by experiments with statistically significant results. Separating pre-printed ruling lines from user-added handwriting shows how ruling lines influence people's handwriting and how they can be exploited for identifying writers. Ruling line detection based on multi-line linear regression reduces the mean error of counting them from 0.10 to 0.03, 6.70 to 0.06, and 0.13 to 0.02, compared to an HMM-based approach on three standard test datasets, thereby reducing human correction time by 50%, 83%, and 72% on average. On 61 page images from 16 rule-form templates, the precision and recall of form cell recognition are increased by 2.7% and 3.7%, compared to a cross-matrix approach. Compensating for and exploiting ruling lines during feature extraction rather than pre-processing raises the writer identification accuracy from 61.2% to 67.7% on a 61-writer noisy Arabic dataset. Similarly, counteracting page-wise skew by subtracting it or transforming contours in a continuous coordinate system during feature extraction improves the writer identification accuracy. An implementation study of contour-hinge features reveals that utilizing the full probability distribution function matrix improves the writer identification accuracy from 74.9% to 79.5%.
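
    The multi-line regression idea can be sketched as a shared-slope least-squares fit: the pre-printed rules on a page are assumed parallel, so a single slope is estimated jointly while each line keeps its own intercept. This is only an illustrative formulation with hypothetical names, not the thesis's exact model.

    import numpy as np

    def fit_parallel_rulings(lines):
        # lines: list of (xs, ys) pixel-coordinate arrays, one per candidate rule.
        # Solves y = slope * x + b_i with one shared slope and per-line intercepts.
        blocks, targets = [], []
        n = len(lines)
        for i, (xs, ys) in enumerate(lines):
            xs = np.asarray(xs, float)
            block = np.zeros((xs.size, n + 1))
            block[:, 0] = xs        # shared slope column
            block[:, 1 + i] = 1.0   # intercept column for line i
            blocks.append(block)
            targets.append(np.asarray(ys, float))
        A, y = np.vstack(blocks), np.concatenate(targets)
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        return coef[0], coef[1:]    # shared slope, per-line intercepts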

    Dewarping of Text Documents Using Thin-Plate Splines (Entzerrung von Textdokumenten unter Verwendung von Thin-Plate-Spline)

    Many methods first identify text lines and then determine the necessary correction transformation from the properties of the document margins. The method presented in this work does not require any document-margin properties; instead, it determines the document orientation from the alignment of the individual letters. Thin-plate splines are used as the interpolation method.
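
    A minimal sketch of the thin-plate-spline step, assuming corresponding control points in the warped image and on the flattened target page are already available (e.g. from the letter alignments mentioned above). It uses SciPy's RBFInterpolator with the thin-plate-spline kernel and is not the thesis's full dewarping pipeline.

    import numpy as np
    from scipy.interpolate import RBFInterpolator

    def build_tps_mapping(src_pts, dst_pts):
        # src_pts: (n, 2) control points in the warped document image.
        # dst_pts: (n, 2) corresponding points on the flattened target page.
        # Returns a callable mapping (m, 2) source coordinates to dewarped ones.
        tps = RBFInterpolator(np.asarray(src_pts, float),
                              np.asarray(dst_pts, float),
                              kernel='thin_plate_spline')
        return lambda pts: tps(np.asarray(pts, float))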

    Characterizing Challenged Minnesota Ballots

    Photocopies of the ballots challenged in the 2008 Minnesota elections, which constitute a public record, were scanned on a high-speed scanner and made available on a public radio website. The PDF files were downloaded, converted to TIF images, and posted on the PERFECT website. Based on a review of relevant image-processing aspects of paper-based election machinery, and on additional statistics and observations on the posted sample data, robust tools were developed for determining the underlying grid of the targets on these ballots regardless of skew, clipping, and other degradations caused by high-speed copying and digitization. The accuracy and robustness of a method based on both index marks and oval targets are demonstrated on 13,435 challenged ballot page images.
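
    One way such a grid can be recovered is to deskew the detected index-mark and oval-target centres and then assign each centre to the nearest grid cell; the sketch below shows only that final snapping step, with hypothetical grid parameters (origin and pitch), and is not the report's actual procedure.

    import numpy as np

    def snap_to_grid(centers, origin, pitch_x, pitch_y):
        # centers: (n, 2) deskewed (x, y) centres of detected marks/ovals.
        # origin, pitch_x, pitch_y: assumed grid parameters estimated elsewhere.
        centers = np.asarray(centers, float)
        cols = np.rint((centers[:, 0] - origin[0]) / pitch_x).astype(int)
        rows = np.rint((centers[:, 1] - origin[1]) / pitch_y).astype(int)
        return rows, cols   # grid-cell indices, robust to small residual offsets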

    A novel method for extracting and recognizing logos

    Nowadays, the high volume of archival documents has made it essential to store documents in electronic databases. A logo represents the ownership of a text, and different texts can be categorized by it; for this reason, various methods have been presented for extracting and recognizing logos. Previously presented methods suffer from problems such as logo detection and recognition errors and slow speed. The proposed method of this study is composed of three sections: in the first section, the exact position of the logo is identified by a pyramidal tree structure and horizontal and vertical analysis; in the second section, the logo is extracted through the algorithm of boundary extension of feature rectangles. In the third section, after normalizing the size of the logo and eliminating the skew angle, we first divide the region encompassing the logo into blocks and then extract a feature from the center of gravity of the connected components in each block. Finally, we use KNN classification for the recognition of the logo. DOI: http://dx.doi.org/10.11591/ijece.v2i5.129
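
    A rough sketch of the third stage: the normalized, deskewed logo is divided into blocks, a centre-of-gravity feature is taken from each block, and a KNN classifier recognizes the logo. The grid size, the per-block centroid simplification, and the scikit-learn KNN setup are assumptions, not the paper's exact parameters.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def block_centroid_features(logo, grid=(4, 4)):
        # logo: 2-D binary array (1 = ink), already size-normalized and deskewed.
        # Splits the logo into grid[0] x grid[1] blocks and records the (x, y)
        # centroid of the ink in each block, normalized to the block size --
        # a simplification of the per-connected-component centroids in the paper.
        h, w = logo.shape
        gh, gw = grid
        feats = []
        for i in range(gh):
            for j in range(gw):
                block = logo[i * h // gh:(i + 1) * h // gh,
                             j * w // gw:(j + 1) * w // gw]
                ys, xs = np.nonzero(block)
                if xs.size == 0:
                    feats.extend([0.0, 0.0])
                else:
                    feats.extend([xs.mean() / block.shape[1],
                                  ys.mean() / block.shape[0]])
        return np.array(feats)

    # Recognition: fit KNN on features of known logos, then classify a query.
    # knn = KNeighborsClassifier(n_neighbors=3)
    # knn.fit(np.stack([block_centroid_features(l) for l in train_logos]), labels)
    # predicted = knn.predict([block_centroid_features(query_logo)])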