
    Adaptive Methods for Robust Document Image Understanding

    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding are a pressing necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we shift our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation, and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, putting special focus on generality, computational efficiency and the exploitation of all available sources of information. More specifically, we introduce the following original methods: fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot metal typeset prints, a theoretically optimal solution to the document binarization problem from both the computational complexity and threshold selection points of view, layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm, and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.
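    The abstract does not detail the thesis's proposed binarization method, so the following is only an illustrative baseline for the binarization stage it mentions: Otsu's classical global threshold selection, written in Python with NumPy. The function names and structure are our own assumptions, not the thesis's algorithm.

```python
# Illustrative baseline only: Otsu's global threshold selection, not the
# thesis's proposed binarization method (which the abstract does not detail).
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the threshold maximizing between-class variance for an 8-bit image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                 # probability of class 0 for each threshold
    mu = np.cumsum(prob * np.arange(256))   # cumulative mean
    mu_total = mu[-1]
    # Between-class variance for every candidate threshold.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    return int(np.argmax(np.nan_to_num(sigma_b)))

def binarize(gray: np.ndarray) -> np.ndarray:
    """Binarize a uint8 grayscale image with the Otsu threshold."""
    return (gray > otsu_threshold(gray)).astype(np.uint8) * 255
```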

    Advanced Image Acquisition, Processing Techniques and Applications

    "Advanced Image Acquisition, Processing Techniques and Applications" is the first book of a series that provides image processing principles and practical software implementation on a broad range of applications. The book integrates material from leading researchers on Applied Digital Image Acquisition and Processing. An important feature of the book is its emphasis on software tools and scientific computing in order to enhance results and arrive at problem solution

    Artificial neural network and its applications in quality process control, document recognition and biomedical imaging

    In a computer-vision based system, a digital image obtained by a digital camera is usually a 24-bit color image. The analysis of an image with that many levels may require complicated image processing techniques and higher computational costs. In real-time applications, where a part has to be inspected within a few milliseconds, we either have to reduce the image to a more manageable number of gray levels, usually two levels (a binary image), while retaining all necessary features of the original image, or develop a more complicated technique. A binary image can be obtained by thresholding the original image into two levels. Thresholding a given image into a binary image is therefore a necessary step for most image analysis and recognition techniques. In this thesis, we have studied the effectiveness of using artificial neural networks (ANN) for image thresholding and classification in pharmaceutical, document recognition and biomedical imaging applications. We have developed edge-based, ANN-based and region-growing based image thresholding techniques to extract low-contrast objects of interest and classify them into their respective classes in those applications. Real-time quality inspection of gelatin capsules in pharmaceutical applications is an important issue for industry's productivity and competitiveness. A computer vision-based automatic quality inspection and controller system is one solution to this problem. Machine vision systems provide quality control and real-time feedback for industrial processes, overcoming physical limitations and the subjective judgment of humans. In this thesis, we have developed an image processing system using edge-based image thresholding techniques for quality inspection that satisfies the industrial requirements of pharmaceutical applications for sorting accepted and rejected capsules. In document recognition applications, the success of OCR mostly depends on the quality of the thresholded image; non-uniform illumination, low contrast and complex backgrounds make thresholding challenging. In this thesis, optimal parameters for an ANN-based local thresholding approach for gray-scale composite document images with non-uniform backgrounds are proposed. An exhaustive search was conducted to select the optimal features, finding that pixel value, mean and entropy are the most significant features at a window size of 3x3 in this application. The optimal features may differ for other applications, but the procedure to find them is the same. The average recognition rate of 99.25% shows that the proposed three features at window size 3x3 are optimal in terms of recognition rate and PSNR compared to the ANN-based thresholding techniques with different parameters presented in the literature. In biomedical imaging, breast cancer continues to be a public health problem. In this thesis we present a computer-aided diagnosis (CAD) system for mass detection and classification in digitized mammograms, which performs mass detection on regions of interest (ROI) followed by benign-malignant classification of the detected masses. A three-layer ANN with seven features is proposed for classifying the marked regions into benign and malignant, achieving 90.91% sensitivity and 83.87% specificity, which is very promising compared to the radiologist's sensitivity of 75%.
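    As a rough illustration of the ANN-based local thresholding described above, the sketch below computes the three features named in the abstract (pixel value, 3x3 local mean, 3x3 local entropy) for every pixel and trains a small MLP to label pixels as foreground or background. The network size, the training interface and the helper names are assumptions; only the feature choice and window size follow the abstract.

```python
# Minimal sketch: per-pixel features (value, 3x3 mean, 3x3 entropy) + MLP labels.
import numpy as np
from scipy.ndimage import uniform_filter, generic_filter
from sklearn.neural_network import MLPClassifier

def local_entropy(window: np.ndarray) -> float:
    """Shannon entropy of the gray values inside one 3x3 window."""
    _, counts = np.unique(window, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def pixel_features(gray: np.ndarray) -> np.ndarray:
    """Stack (value, 3x3 mean, 3x3 entropy) into an (H*W, 3) feature matrix."""
    g = gray.astype(np.float64) / 255.0
    mean = uniform_filter(g, size=3)
    ent = generic_filter(gray.astype(np.float64), local_entropy, size=3)
    return np.stack([g.ravel(), mean.ravel(), ent.ravel()], axis=1)

def train_thresholder(images, ground_truths):
    """Train on images with manually binarized ground truth (an assumed interface)."""
    X = np.vstack([pixel_features(img) for img in images])
    y = np.hstack([gt.ravel() > 0 for gt in ground_truths])
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300)
    clf.fit(X, y)
    return clf

def binarize_with_ann(clf, gray):
    labels = clf.predict(pixel_features(gray))
    return labels.reshape(gray.shape).astype(np.uint8) * 255
```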

    Automatic Vehicle Detection and Identification using Visual Features

    In recent decades, the vehicle has become the most popular transportation mechanism in the world. High accuracy and success rates are key factors in automatic vehicle detection and identification. As the most important label on a vehicle, the license plate serves as a means of public identification. However, it can be stolen and affixed to a different vehicle by criminals to conceal their identities. Furthermore, in some cases, the plate numbers can be the same for two vehicles coming from different countries. In this thesis, we propose a new vehicle identification system that provides a high degree of accuracy and high success rates. The proposed system consists of four stages: license plate detection, license plate recognition, license plate province detection and vehicle shape detection. In the proposed system, features are extracted as local binary patterns (LBP) and histograms of oriented gradients (HOG) to form the training dataset. To reach high accuracy in real-time applications, a novel method is used to update the system. Meanwhile, via the proposed system, we can store the vehicles' features and information in a database. Additionally, with the database, the procedure can automatically detect any discrepancy between a license plate and a vehicle.
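    A minimal sketch of the kind of LBP plus HOG feature pipeline mentioned above, using scikit-image and scikit-learn. The descriptor parameters and the linear SVM classifier are assumptions for illustration, not the settings of the proposed system.

```python
# Minimal sketch: concatenated LBP + HOG descriptors per image patch, SVM classifier.
import numpy as np
from skimage.feature import local_binary_pattern, hog
from sklearn.svm import LinearSVC

def lbp_hog_features(gray_patch: np.ndarray) -> np.ndarray:
    """Patches are assumed to be resized to one fixed size so feature lengths match."""
    # Uniform LBP histogram (texture); P=8, R=1 gives codes in [0, 10).
    lbp = local_binary_pattern(gray_patch, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    # HOG descriptor (gradient / shape structure).
    hog_vec = hog(gray_patch, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2), feature_vector=True)
    return np.concatenate([lbp_hist, hog_vec])

def train_patch_classifier(patches, labels):
    """Train a linear SVM on labeled patches (classifier choice is an assumption)."""
    X = np.array([lbp_hog_features(p) for p in patches])
    clf = LinearSVC()
    clf.fit(X, labels)
    return clf
```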

    Template Protection For 3D Face Recognition

    The human face is one of the most important biometric modalities for automatic authentication. Three-dimensional face recognition exploits facial surface information. In comparison to illumination-based 2D face recognition, it has good robustness and high fake resistance, so that it can be used in high-security areas. Nevertheless, as in other common biometric systems, potential risks of identity theft, cross-matching and exposure of privacy information threaten the security of the authentication system as well as the user's privacy. As a crucial supplement to biometrics, template protection techniques can prevent security leakages and protect privacy. In this chapter, we show security leakages in common biometric systems and give a detailed introduction to template protection techniques. Then the latest results of template protection techniques in 3D face recognition systems are presented. The recognition performance as well as the security gains are analyzed.
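    For readers unfamiliar with template protection, the sketch below shows one generic idea from that family: a user-specific random projection ("biohashing") that turns the raw feature vector into a revocable binary template matched by Hamming distance, so the raw biometric is never stored. This is a hypothetical illustration of the concept, not the specific scheme evaluated in the chapter.

```python
# Generic template-protection illustration (random-projection biohashing),
# not the scheme studied in the chapter.
import numpy as np

def make_projection(seed: int, in_dim: int, out_dim: int) -> np.ndarray:
    """User/application-specific projection; reissuing a new seed revokes the template."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((out_dim, in_dim))

def protect(features: np.ndarray, projection: np.ndarray) -> np.ndarray:
    """Binarized projection of the feature vector; only these bits are stored."""
    return (projection @ features > 0).astype(np.uint8)

def match(probe_bits: np.ndarray, stored_bits: np.ndarray, max_hamming: int = 10) -> bool:
    """Accept if the Hamming distance stays within a tolerance (threshold is illustrative)."""
    return int(np.sum(probe_bits != stored_bits)) <= max_hamming
```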

    Character Recognition

    Character recognition is one of the pattern recognition technologies most widely used in practical applications. This book presents recent advances relevant to character recognition, from technical topics such as image processing, feature extraction and classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field.

    A Vietnamese Handwritten Text Recognition Pipeline for Tetanus Medical Records

    Machine learning techniques are successful for optical character recognition tasks, especially in recognizing handwriting. However, recognizing Vietnamese handwriting is challenging due to the presence of six distinctive tonal symbols and additional vowels. The challenge is amplified for the handwriting of health workers in an emergency care setting, where staff are under constant pressure to record the well-being of patients. In this study, we aim to digitize the handwriting of Vietnamese health workers. We develop a complete handwritten text recognition pipeline that receives scanned documents, detects and enhances the handwritten text areas of interest, transcribes the images into computer text, and finally auto-corrects invalid words and terms to achieve high accuracy. From experiments with medical documents written by 30 doctors and nurses from the Tetanus Emergency Care unit at the Hospital for Tropical Diseases, we obtain promising results of 2% Character Error Rate and 12% Word Error Rate.
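    The reported Character Error Rate (CER) and Word Error Rate (WER) are conventionally computed from the Levenshtein edit distance between the recognized text and a reference transcription, normalized by the reference length; a minimal sketch of that computation is shown below (function names are ours).

```python
# Minimal sketch of the standard CER / WER computation via edit distance.
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or words)."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, start=1):
            prev, d[j] = d[j], min(d[j] + 1,          # deletion
                                   d[j - 1] + 1,      # insertion
                                   prev + (r != h))   # substitution
    return d[-1]

def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edits per reference character."""
    return edit_distance(reference, hypothesis) / max(len(reference), 1)

def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: edits per reference word."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    return edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)
```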

    Computer analysis of composite documents with non-uniform background.

    The motivation behind most applications of off-line text recognition is to convert data from conventional media into electronic media. Such applications include bank cheques, security documents and form processing. In this dissertation, a document analysis system is presented to transfer gray-level composite documents with complex backgrounds and poor illumination into an electronic format suitable for efficient storage, retrieval and interpretation. The preprocessing stage of the document analysis system requires the conversion of a paper-based document to a digital bit-map representation after optical scanning, followed by thresholding, skew detection, page segmentation and Optical Character Recognition (OCR). The system as a whole operates in a pipeline fashion, where each stage or process passes its output to the next stage. The success of each stage ensures that the system as a whole operates without failures that may reduce the character recognition rate. In designing this document analysis system, a new local bi-level threshold selection technique was developed for gray-level composite document images with non-uniform backgrounds. The algorithm uses statistical and textural feature measures to obtain a feature vector for each pixel from a window of size (2n + 1) x (2n + 1), where n ≥ 1. These features provide a local understanding of pixels from their neighbourhoods, making it easier to classify each pixel into its proper class. A Multi-Layer Perceptron Neural Network is then used to classify each pixel value in the image. The results of thresholding are then passed to the block segmentation stage. The block segmentation technique developed is a feature-based method that uses a Neural Network classifier to automatically segment and classify the image contents into text and halftone images. Finally, the text blocks are passed to a Character Recognition (CR) system to transfer characters into an editable text format, and the recognition results were compared to those obtained from a commercial OCR. The OCR system implemented uses pixel distributions extracted from different zones of the characters as features. A correlation classifier is used to recognize the characters. For the application of cheque processing, this system was used to read the special numerals of the optical barcode found on bank cheques. The OCR system uses a fuzzy descriptive feature extraction method with a correlation classifier to recognize these special numerals, which identify the bank institute and provide personal information about the account holder. The new local thresholding scheme was tested on a variety of composite document images with complex backgrounds, and the results were very good compared to those from commercial OCR software. The proposed thresholding technique is not limited to a specific application; it can be used on a variety of document images with complex backgrounds and can be implemented in any document analysis system, provided that sufficient training is performed.
    Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .A445. Source: Dissertation Abstracts International, Volume: 66-02, Section: B, page: 1061. Advisers: Maher Sid-Ahmed; Majid Ahmadi. Thesis (Ph.D.)--University of Windsor (Canada), 2004.
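    As a rough sketch of the per-pixel windowed feature extraction described above, the snippet below gathers simple statistical measures (pixel value, mean, standard deviation, range, entropy) from a (2n + 1) x (2n + 1) neighbourhood; the resulting vectors could then be fed to an MLP classifier as in the dissertation. The exact feature set is an assumption, since the abstract does not enumerate it.

```python
# Minimal sketch of (2n+1)x(2n+1) neighbourhood statistics for one pixel.
import numpy as np

def window_features(gray: np.ndarray, row: int, col: int, n: int = 1) -> np.ndarray:
    """Feature vector for the pixel at (row, col) from its (2n+1)x(2n+1) window."""
    h, w = gray.shape
    r0, r1 = max(row - n, 0), min(row + n + 1, h)   # clip the window at image borders
    c0, c1 = max(col - n, 0), min(col + n + 1, w)
    win = gray[r0:r1, c0:c1].astype(np.float64)
    _, counts = np.unique(win, return_counts=True)
    p = counts / counts.sum()
    entropy = float(-(p * np.log2(p)).sum())
    return np.array([gray[row, col], win.mean(), win.std(),
                     win.max() - win.min(), entropy])
```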

    Historical Document Enhancement Using LUT Classification

    The fast evolution of scanning and computing technologies in recent years has led to the creation of large collections of scanned historical documents. It is almost always the case that these scanned documents suffer from some form of degradation. Large degradations make documents hard to read and substantially deteriorate the performance of automated document processing systems. Enhancement of degraded document images is normally performed assuming global degradation models. When the degradation is large, global degradation models do not perform well. In contrast, we propose to learn local degradation models and use them in enhancing degraded document images. Using a semi-automated enhancement system, we have labeled a subset of the Frieder diaries collection (The diaries of Rabbi Dr. Avraham Abba Frieder, http://ir.iit.edu/collections/). This labeled subset was then used to train classifiers based on lookup tables in conjunction with the approximate nearest neighbor algorithm. The resulting algorithm is highly efficient and effective. Experimental evaluation results are provided using the Frieder diaries collection. © Springer-Verlag 2009
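    A minimal sketch of lookup-table classification for enhancement in the spirit described above: each pixel's binarized 3x3 neighborhood indexes one of 512 LUT entries, each trained by majority vote from labeled degraded/clean image pairs. The pattern size, the majority-vote training and the assumption of pre-binarized input are ours; the paper's approximate nearest neighbor component is not reproduced here.

```python
# Minimal LUT-enhancement sketch; inputs are assumed to be 0/1 binary images.
import numpy as np

def patterns(binary: np.ndarray) -> np.ndarray:
    """Encode each pixel's 3x3 binary neighborhood as an integer code in [0, 512)."""
    padded = np.pad(binary.astype(np.uint16), 1)
    code = np.zeros_like(binary, dtype=np.uint16)
    bit = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            code |= padded[1 + dr:1 + dr + binary.shape[0],
                           1 + dc:1 + dc + binary.shape[1]] << bit
            bit += 1
    return code

def train_lut(degraded_imgs, clean_imgs) -> np.ndarray:
    """512-entry LUT: most frequent clean label observed for each degraded pattern."""
    votes = np.zeros((512, 2), dtype=np.int64)
    for deg, clean in zip(degraded_imgs, clean_imgs):
        np.add.at(votes, (patterns(deg).ravel(), clean.ravel().astype(int)), 1)
    return votes.argmax(axis=1).astype(np.uint8)

def enhance(lut: np.ndarray, degraded: np.ndarray) -> np.ndarray:
    """Replace each pixel by the LUT output for its neighborhood pattern."""
    return lut[patterns(degraded)]
```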