33 research outputs found

    Preprocessing Techniques in Character Recognition

    Get PDF

    Digital imaging technology assessment: Digital document storage project

    Get PDF
    An ongoing technical assessment and requirements definition project is examining the potential role of digital imaging technology at NASA's STI facility. The focus is on the basic components of imaging technology in today's marketplace as well as the components anticipated in the near future. Presented are a requirement specification for a prototype project, an initial examination of current image processing at the STI facility, and an initial summary of image processing projects at other sites. Operational imaging systems incorporate scanners, optical storage, high resolution monitors, processing nodes, magnetic storage, jukeboxes, specialized boards, optical character recognition gear, pixel addressable printers, communications, and complex software processes.

    Computer analysis of composite documents with non-uniform background.

    Get PDF
    The motivation behind most applications of off-line text recognition is to convert data from conventional media into electronic form. Such applications include bank cheques, security documents and form processing. In this dissertation a document analysis system is presented to transfer gray-level composite documents with complex backgrounds and poor illumination into an electronic format suitable for efficient storage, retrieval and interpretation. The preprocessing stage of the document analysis system requires the conversion of a paper-based document to a digital bit-map representation after optical scanning, followed by thresholding, skew detection, page segmentation and Optical Character Recognition (OCR). The system as a whole operates in a pipeline fashion, where each stage passes its output to the next; the success of each stage is required for the system as a whole to operate without failures that would reduce the character recognition rate. In designing this document analysis system, a new local bi-level threshold selection technique was developed for gray-level composite document images with non-uniform background. The algorithm uses statistical and textural feature measures to obtain a feature vector for each pixel from a window of size (2n + 1) x (2n + 1), where n ≥ 1. These features provide a local understanding of pixels from their neighbourhoods, making it easier to classify each pixel into its proper class. A Multi-Layer Perceptron Neural Network is then used to classify each pixel value in the image. The results of thresholding are passed to the block segmentation stage. The block segmentation technique developed is a feature-based method that uses a Neural Network classifier to automatically segment and classify the image contents into text and halftone images. Finally, the text blocks are passed to a Character Recognition (CR) system to transfer the characters into an editable text format, and the recognition results were compared to those obtained from a commercial OCR. The OCR system implemented uses pixel distributions extracted from different zones of the characters as features, and a correlation classifier to recognize the characters. For the application of cheque processing, this system was used to read the special numerals of the optical barcode found on bank cheques. The OCR system uses a fuzzy descriptive feature extraction method with a correlation classifier to recognize these special numerals, which identify the bank institute and provide personal information about the account holder. The new local thresholding scheme was tested on a variety of composite document images with complex backgrounds. The results were very good compared to those from commercial OCR software. The proposed thresholding technique is not limited to a specific application; it can be used on a variety of document images with complex backgrounds and can be implemented in any document analysis system, provided that sufficient training is performed. Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .A445. Source: Dissertation Abstracts International, Volume: 66-02, Section: B, page: 1061. Advisers: Maher Sid-Ahmed; Majid Ahmadi. Thesis (Ph.D.)--University of Windsor (Canada), 2004.
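
    The thresholding stage described above classifies each pixel from features computed over its (2n + 1) x (2n + 1) neighbourhood. The sketch below is a minimal, hypothetical illustration of that idea using only the window mean and standard deviation as features and scikit-learn's MLPClassifier; the dissertation's actual statistical and textural feature set and network configuration are not reproduced here.

        # Minimal sketch: pixel-wise bi-level classification from window
        # features with an MLP. Feature set and training data are
        # illustrative stand-ins, not the dissertation's.
        import numpy as np
        from sklearn.neural_network import MLPClassifier

        def window_features(img, n=2):
            """One feature vector per pixel from its (2n+1) x (2n+1) window."""
            pad = np.pad(img.astype(float), n, mode='reflect')
            h, w = img.shape
            feats = []
            for y in range(h):
                for x in range(w):
                    win = pad[y:y + 2 * n + 1, x:x + 2 * n + 1]
                    feats.append([win.mean(), win.std(), float(img[y, x])])
            return np.array(feats)

        # Hypothetical usage: 'train_img' is a gray-level crop and 'train_mask'
        # a hand-labelled 0/1 map (background vs. text).
        # clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500)
        # clf.fit(window_features(train_img), train_mask.ravel())
        # binary = clf.predict(window_features(test_img)).reshape(test_img.shape)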

    Laser scanner jitter characterization, page content analysis for optimal rendering, and understanding image graininess

    Get PDF
    Chapter 1 concerns the electrophotographic (EP) process, which is widely used in imaging systems such as laser printers and office copiers. In the EP process, laser scanner jitter is a common artifact that mainly appears along the scan direction due to the condition of the polygon facets. Prior studies have not focused on the periodic characteristic of laser scanner jitter in terms of modeling and analysis. This chapter addresses the periodic characteristic of laser scanner jitter in a mathematical model. In the Fourier domain, we derive an analytic expression for laser scanner jitter in general, and extend the expression assuming a sinusoidal displacement. This leads to a simple closed-form expression in terms of Bessel functions of the first kind. We further examine the relationship between the continuous-space halftone image and the periodic laser scanner jitter. The simulation results show that our proposed mathematical model predicts the phenomenon of laser scanner jitter effectively when compared to the characterization using a test pattern consisting of a flat field with 25% dot coverage. However, there are some mismatches between the analytical spectrum and the spectrum of the processed scanned test target. We improve the experimental results by directly estimating the displacement instead of assuming a sinusoidal displacement, which gives a better prediction of the phenomenon of laser scanner jitter. In Chapter 2, we describe a segmentation-based object map correction algorithm, which can be integrated into a new imaging pipeline for laser electrophotographic (EP) printers. This new imaging pipeline incorporates the idea of object-oriented halftoning, which applies different halftone screens to different regions of the page to improve the overall print quality. In particular, smooth areas are halftoned with a low-frequency screen to provide more stable printing, whereas detail areas are halftoned with a high-frequency screen, since this better reproduces the object detail. In this case, the object detail also serves to mask any print defects that arise from the use of a high-frequency screen. These regions are defined by the initial object map, which is translated from the page description language (PDL). However, the object-type information obtained from the PDL may be incorrect. Some smooth areas may be labeled as raster, causing them to be halftoned with a high-frequency screen, rather than being labeled as vector, which would result in them being rendered with a low-frequency screen. To correct the misclassification, we propose an object map correction algorithm that combines information from the incorrect object map with information obtained by segmentation of the continuous-tone RGB rasterized page image. Finally, the rendered image can be halftoned by the object-oriented halftoning approach, based on the corrected object map. Preliminary experimental results indicate the benefits of our algorithm combined with the new imaging pipeline in terms of the correction of misclassification errors. In Chapter 3, we describe a study to understand image graininess. With the emergence of high-end digital printing technologies, it is of interest to analyze the nature and causes of image graininess in order to understand the factors that prevent high-end digital presses from achieving the same print quality as commercial offset presses. We want to understand how image graininess relates to the halftoning technology and the marking technology.
    This chapter provides three different approaches to understanding image graininess. First, we perform a Fourier-based analysis of regular and irregular periodic, clustered-dot halftone textures. With high-end digital printing technology, irregular screens can be considered since they can achieve a better approximation to the screen sets used for commercial offset presses. This is due to the fact that the elements of the periodicity matrix of an irregular screen are rational numbers, rather than integers, which would be the case for a regular screen. From the analytical results, we show that irregular halftone textures generate new frequency components near the spectrum origin, and these frequency components are low enough to be visible to the human viewer; regular halftone textures do not have these frequency components. In addition, we provide a metric to measure the nonuniformity of a given halftone texture. The metric indicates that the nonuniformity of irregular halftone textures is higher than that of regular halftone textures. Furthermore, a method to visualize the nonuniformity of given halftone textures is described. The analysis shows that irregular halftone textures are grainier than regular halftone textures. Second, we analyze the regular and irregular periodic, clustered-dot halftone textures by calculating three spatial statistics. First, the disparity between lattice points generated by the periodicity matrix and the centroids of dot clusters is considered. Next, the area of dot clusters in regular and irregular halftone textures is considered. Third, the compactness of dot clusters in the regular and irregular halftone textures is calculated. The disparity between the centroids of irregular dot clusters and the lattice points generated by the irregular screen is larger than the disparity between the centroids of regular dot clusters and the lattice points generated by the regular screen. Irregular halftone textures have a higher variance in the histogram of dot-cluster area. In addition, the compactness measurement shows that irregular dot clusters are less compact than regular dot clusters, whereas a clustered-dot halftoning algorithm aims to produce dot clusters that are as compact as possible. Lastly, we examine the current marking technology by printing the same halftone pattern on different substrates, glossy and polyester media. The experimental results show that the current marking technology provides better print quality on glossy media than on polyester media. With the above three approaches, we conclude that the current halftoning technology introduces image graininess in the spatial domain because of the non-integer elements in the periodicity matrix of the irregular screen and the finite addressability of the marking engine. In addition, the geometric characteristics of irregular dot clusters are more irregular than those of regular dot clusters. Finally, the marking technology yields inconsistent print quality across substrates.
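
    The abstract above states that assuming a sinusoidal displacement yields a closed-form jitter expression in terms of Bessel functions of the first kind, without reproducing it. The short check below only illustrates the standard Jacobi-Anger identity, exp(i a sin t) = sum_k J_k(a) exp(i k t), which is the usual route from a sinusoidal displacement to such Bessel-function expressions; it is an illustration of the underlying mathematics, not the chapter's derivation.

        # Numerical check of the Jacobi-Anger identity that links a
        # sinusoidal displacement to Bessel functions of the first kind.
        import numpy as np
        from scipy.special import jv

        a = 0.7                                   # displacement amplitude (arbitrary)
        t = np.linspace(0.0, 2.0 * np.pi, 256)
        lhs = np.exp(1j * a * np.sin(t))
        rhs = sum(jv(k, a) * np.exp(1j * k * t) for k in range(-20, 21))
        print(np.max(np.abs(lhs - rhs)))          # ~1e-16: the truncated series matches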

    Adaptive Methods for Robust Document Image Understanding

    Get PDF
    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding represent a stringent necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we shift our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, putting special focus on generality, computational efficiency and the exploitation of all available sources of information. More specifically, we introduce the following original methods: a fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot-metal typeset prints, a theoretically optimal solution for the document binarization problem from both a computational-complexity and a threshold-selection point of view, a layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.
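
    The workflow stages listed above (quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation, logical layout analysis) suggest a straightforward pipeline structure. The skeleton below only illustrates that stage ordering; every stage is a deliberately trivial placeholder, not the thesis' algorithms.

        # Stage ordering only; each stage is a trivial stand-in.
        import numpy as np

        quality_ok     = lambda img: img.std() > 0.01             # quality assurance
        enhance        = lambda img: img / max(img.max(), 1e-9)   # image enhancement
        binarize       = lambda img: (img > img.mean()).astype(np.uint8)  # color reduction + binarization
        deskew         = lambda img: img                          # skew / orientation correction
        segment_page   = lambda img: [img]                        # page segmentation
        logical_layout = lambda zones: {"body": zones}            # logical layout analysis

        def understand_document(image):
            if not quality_ok(image):
                image = enhance(image)
            return logical_layout(segment_page(deskew(binarize(image))))

        print(understand_document(np.random.rand(64, 64))["body"][0].shape)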

    Currency security and forensics: a survey

    Get PDF
    By definition, the word currency refers to an agreed medium of exchange, and a nation's currency is the formal medium enforced by the elected governing entity. Throughout history, issuers have faced one common threat: counterfeiting. Despite technological advancements, eliminating counterfeit production remains a distant prospect. Scientific determination of authenticity requires a deep understanding of the raw materials and manufacturing processes involved. This survey serves as a synthesis of the current literature to understand the technology and the mechanics involved in currency manufacture and security, whilst identifying gaps in the current literature. Ultimately, a robust currency is desired.

    Adaptive Algorithms for Automated Processing of Document Images

    Get PDF
    Large-scale document digitization projects continue to motivate interesting document understanding technologies such as script and language identification, page classification, segmentation and enhancement. Typically, however, solutions are still limited to narrow domains or regular formats such as books, forms, articles or letters, and operate best on clean documents scanned in a controlled environment. More general collections of heterogeneous documents challenge the basic assumptions of state-of-the-art technology regarding quality, script, content and layout. Our work explores the use of adaptive algorithms for the automated analysis of noisy and complex document collections. We first propose, implement and evaluate an adaptive clutter detection and removal technique for complex binary documents. Our distance-transform-based technique aims to remove irregular and independent unwanted foreground content while leaving text content untouched. The novelty of this approach is its determination of the best approximation to the clutter-content boundary in the presence of text-like structures. Second, we describe a page segmentation technique called Voronoi++ for complex layouts, which builds upon the state-of-the-art method proposed by Kise [Kise1999]. Our approach does not assume structured text zones and is designed to handle multi-lingual text in both handwritten and printed form. Voronoi++ is a dynamically adaptive and contextually aware approach that considers components' separation features combined with Docstrum-based [O'Gorman1993] angular and neighborhood features to form provisional zone hypotheses. These provisional zones are then verified based on the context built from local separation and high-level content features. Finally, our research proposes a generic model to segment and recognize characters for any complex syllabic or non-syllabic script, using font models. This concept is based on the fact that font files contain all the information necessary to render text, and thus provide a model for how to decompose it. Instead of script-specific routines, this work is a step towards a generic character segmentation and recognition scheme for both Latin and non-Latin scripts.
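
    The clutter removal step above is driven by the distance transform of the binary page. The sketch below is a simplified, hypothetical illustration of that idea using scipy: connected components whose maximum distance-transform value (roughly the stroke half-width) greatly exceeds the page's typical text stroke are discarded. The thesis' actual approximation of the clutter-content boundary is considerably more careful.

        # Simplified clutter filter on a binary page (1 = ink, 0 = background);
        # the size heuristic here is illustrative, not the thesis' criterion.
        import numpy as np
        from scipy import ndimage

        def remove_clutter(binary, factor=3.0):
            dist = ndimage.distance_transform_edt(binary)
            labels, n = ndimage.label(binary)
            if n == 0:
                return binary
            # thickest stroke inside each connected component
            peak = ndimage.maximum(dist, labels, index=np.arange(1, n + 1))
            typical = np.median(peak)                  # proxy for the text stroke width
            keep = np.zeros(n + 1, dtype=bool)
            keep[1:] = peak <= factor * typical        # text-like components survive
            return np.where(keep[labels], binary, 0)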

    Character Recognition

    Get PDF
    Character recognition is one of the pattern recognition technologies most widely used in practical applications. This book presents recent advances relevant to character recognition, from technical topics such as image processing, feature extraction and classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field.

    High Capacity Analog Channels for Smart Documents

    Get PDF
    Widely used valuable hardcopy documents such as passports, visas, driving licenses, educational certificates, entrance passes for entertainment events, etc. are conventionally protected against counterfeiting and data-tampering attacks by applying analog security technologies (e.g. KINEGRAMS®, holograms, micro-printing, UV/IR inks). However, easy access to high-quality, low-price modern desktop publishing technology has left most of these technologies ineffective, giving rise to high-quality false documents. The higher price and restricted usage are other drawbacks of analog document protection techniques. Digital watermarking and high-capacity storage media such as IC chips, optical data stripes, etc. are the modern technologies being used in new machine-readable identity verification documents to ensure content integrity; however, these technologies are either expensive or do not satisfy the application needs, and there is a demand for more efficient document protection technologies. In this research, three different high-capacity analog channels are investigated for hidden communication, along with their applications in smart documents: a high-density data stripe (HD-DataStripe), data hiding in printed halftone images (watermarking), and a superposed constant background grayscale image (CBGI). On the way to developing high-capacity analog channels, the noise encountered in the printing and scanning (PS) process is investigated, with the objective of recovering the digital information encoded at nearly maximum channel utilization. By exploiting the noise behaviour, countermeasures against the noise are taken accordingly in the data recovery process. The HD-DataStripe is a printed binary image similar to conventional 2-D barcodes (e.g. PDF417), but it offers a much higher data storage capacity and is intended for machine-readable identity verification documents. The capacity offered by the HD-DataStripe is sufficient to store high-quality biometric characteristics rather than extracted templates, in addition to the conventional bearer-related data contained in a smart ID card. It also eliminates the need for a central database system (except for backup records) and other expensive storage media currently being used. While developing a novel data-reading technique for the HD-DataStripe, to account for the unavoidable geometrical distortions, the registration-mark pattern is chosen in such a way that it yields accurate sampling points (a necessary condition for reliable data recovery at a higher data encoding rate). For the more sophisticated distortions caused by physical dot-gain effects (intersymbol interference), countermeasures such as application of the sampling theorem, adaptive binarization and post-data processing are given, each providing only a necessary condition for reliable data recovery. Finally, combining the various filters corresponding to these countermeasures, a novel data-reading technique for the HD-DataStripe is given. The novel data-reading technique results in superior performance compared to existing techniques intended for data recovery from printed media. In another scenario, a small-size HD-DataStripe with maximum entropy is used as a copy detection pattern by exploiting the information loss encountered at nearly maximum channel capacity.
    While considering the application of the HD-DataStripe in hardcopy documents (contracts, official letters, etc.), unlike existing work [Zha04], it allows one-to-one content matching and does not depend on hash functions and OCR technology, constraints mainly imposed by the low data storage capacity offered by existing analog media. For printed halftone images carrying hidden information, the higher capacity is mainly attributed to the data-reading technique for the HD-DataStripe, which allows data recovery at higher printing resolution, a key requirement for a high-quality watermarking technique in the spatial domain. Digital halftoning and data encoding techniques are the other factors that contribute to the data hiding technique given in this research. Concerning security aspects, the new technique allows content integrity and authenticity verification in the present scenario, in which a certain amount of errors is unavoidable, restricting the usage of existing techniques given for digital contents. Finally, a superposed constant background grayscale image, obtained by the repeated application of a specially designed small binary pattern, is used as a channel for hidden communication; it allows up to 33 pages of A4-size foreground text to be encoded in one CBGI. The higher capacity is attributable to the data encoding symbols and the data-reading technique.
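
    The data-reading stage described above recovers bits from a scan by sampling at grid points derived from registration marks and deciding each cell against its local surroundings. The toy example below illustrates only that sampling-plus-local-threshold idea; mark detection, dot-gain compensation, the sampling-theorem and post-processing countermeasures are all omitted, and the corner mark coordinates are assumed to be already known.

        # Toy reader for a printed binary data stripe: bilinear sampling grid
        # from four known corner marks, local-mean decision per cell.
        import numpy as np

        def read_stripe(scan, corners, rows, cols, win=2):
            """scan: gray-level image; corners: (tl, tr, bl, br) as (y, x) pixels."""
            tl, tr, bl, br = (np.asarray(c, dtype=float) for c in corners)
            bits = np.zeros((rows, cols), dtype=np.uint8)
            for r in range(rows):
                v = (r + 0.5) / rows
                left, right = tl + v * (bl - tl), tr + v * (br - tr)
                for c in range(cols):
                    u = (c + 0.5) / cols
                    y, x = np.rint(left + u * (right - left)).astype(int)
                    patch = scan[max(y - win, 0):y + win + 1,
                                 max(x - win, 0):x + win + 1]
                    bits[r, c] = 1 if scan[y, x] < patch.mean() else 0  # dark cell -> 1
            return bits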

    Geometric Layout Analysis of Scanned Documents

    Get PDF
    Layout analysis, the division of page images into text blocks and lines and the determination of their reading order, is a major performance-limiting step in large-scale document digitization projects. This thesis addresses this problem in several ways: it presents new performance measures to identify important classes of layout errors, evaluates the performance of state-of-the-art layout analysis algorithms, presents a number of methods to reduce the error rate and the catastrophic failures occurring during layout analysis, and develops a statistically motivated, trainable layout analysis system that addresses the needs of large-scale document analysis applications. An overview of the key contributions of this thesis is as follows. First, this thesis presents an efficient local adaptive thresholding algorithm that yields the same quality of binarization as state-of-the-art local binarization methods, but runs in time close to that of global thresholding methods, independent of the local window size. Tests on the UW-1 dataset demonstrate a 20-fold speedup compared to traditional local thresholding techniques. Then, this thesis presents a new perspective for document image cleanup. Instead of trying to explicitly detect and remove marginal noise, the approach focuses on locating the page frame, i.e. the actual page contents area. A geometric matching algorithm is presented to extract the page frame of a structured document. It is demonstrated that incorporating a page frame detection step into the document processing chain reduces OCR error rates from 4.3% to 1.7% (n = 4,831,618 characters) on the UW-III dataset and layout-based retrieval error rates from 7.5% to 5.3% (n = 815 documents) on the MARG dataset. The performance of six widely used page segmentation algorithms (x-y cut, smearing, whitespace analysis, constrained text-line finding, Docstrum, and Voronoi) on the UW-III database is evaluated in this work using a state-of-the-art evaluation methodology. It is shown that current evaluation scores are insufficient for diagnosing specific errors in page segmentation and fail to identify some classes of serious segmentation errors altogether. Thus, a vectorial score is introduced that is sensitive to, and identifies, the most important classes of segmentation errors (over-, under-, and mis-segmentation) and the page components (lines, blocks, etc.) that are affected. Unlike previous schemes, this evaluation method has a canonical representation of ground truth data and guarantees pixel-accurate evaluation results for arbitrary region shapes. Based on a detailed analysis of the errors made by different page segmentation algorithms, this thesis presents a novel combination of the line-based approach by Breuel with the area-based approach of Baird, which solves the over-segmentation problem in area-based approaches. This new approach achieves a mean text-line extraction error rate of 4.4% (n = 878 documents) on the UW-III dataset, which is the lowest among the analyzed algorithms. This thesis also describes a simple, fast, and accurate system for document image zone classification that results from a detailed comparative analysis of the performance of widely used features in document analysis and content-based image retrieval. Using a novel combination of known algorithms, an error rate of 1.46% (n = 13,811 zones) is achieved on the UW-III dataset, in comparison to a state-of-the-art system that reports an error rate of 1.55% (n = 24,177 zones) using more complicated techniques.
    In addition to layout analysis of Roman script documents, this work also presents the first high-performance layout analysis method for Urdu script. For that purpose, a geometric text-line model for Urdu script is presented. It is shown that the method can accurately extract Urdu text-lines from documents of different layouts such as prose books, poetry books, magazines, and newspapers. Finally, this thesis presents a novel algorithm for probabilistic layout analysis that specifically addresses the needs of large-scale digitization projects. The presented approach models known page layouts as a structural mixture model. A probabilistic matching algorithm is presented that gives multiple interpretations of the input layout with associated probabilities. An algorithm based on A* search is presented for finding the most likely layout of a page, given its structural layout model. For training layout models, an EM-like algorithm is presented that is capable of learning the geometric variability of layout structures from data, without the need for a page segmentation ground truth. Evaluation of the algorithm on documents from the MARG dataset shows an accuracy of above 95% for geometric layout analysis.
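
    The abstract does not spell out how the local adaptive thresholding achieves a running time independent of the window size. A common way to obtain that property is to precompute integral images so that the local mean and variance, and hence a Sauvola-style threshold, cost a constant number of operations per pixel regardless of window size. The sketch below illustrates that reading under those assumptions; it is not the thesis' implementation.

        # Sauvola-style local thresholding with integral images: per-pixel
        # local mean/std in O(1), independent of the window size w.
        import numpy as np

        def local_binarize(img, w=15, k=0.2, R=128.0):
            """img: gray-level 2-D array; w: odd window size. Returns 1 = background, 0 = ink."""
            img = img.astype(np.float64)
            p = np.pad(img, w // 2, mode='edge')
            S = np.zeros((p.shape[0] + 1, p.shape[1] + 1))
            S2 = np.zeros_like(S)
            S[1:, 1:] = p.cumsum(0).cumsum(1)              # integral image
            S2[1:, 1:] = (p ** 2).cumsum(0).cumsum(1)      # integral of squares
            H, W = img.shape

            def box(A):                                    # windowed sum for every pixel
                return A[w:w + H, w:w + W] - A[:H, w:w + W] - A[w:w + H, :W] + A[:H, :W]

            mean = box(S) / (w * w)
            std = np.sqrt(np.maximum(box(S2) / (w * w) - mean ** 2, 0.0))
            thresh = mean * (1.0 + k * (std / R - 1.0))
            return (img > thresh).astype(np.uint8)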