763 research outputs found

    OCRspell: An interactive spelling correction system for OCR errors in text

    In this thesis we describe a spelling correction system designed specifically for OCR (Optical Character Recognition) generated text that selects candidate words using information gathered from multiple knowledge sources. This system for text correction is based on static and dynamic device mappings, approximate string matching, and n-gram analysis. Our statistically based, Bayesian system incorporates a learning feature that collects confusion information at the collection and document levels. An evaluation of the new system is also presented.
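
As a rough illustration of the approximate string matching named in the abstract above, the sketch below ranks lexicon words by Levenshtein distance to an OCR token. It is not the thesis's method: the device mappings, n-gram analysis, and Bayesian ranking are omitted, and the names `edit_distance`, `candidates`, and `max_distance` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def candidates(token, lexicon, max_distance=2):
    """Return lexicon words within max_distance edits, closest first."""
    scored = sorted((edit_distance(token, word), word) for word in lexicon)
    return [word for dist, word in scored if dist <= max_distance]


# "rn" misread for "m" is a typical OCR confusion (assumed example input).
print(candidates("rnodern", ["modern", "model", "garden"]))
```

A real system would presumably weight edits by how likely each character confusion is for a given device, rather than charging every edit the same unit cost.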

    Document preprocessing and fuzzy unsupervised character classification

    This dissertation presents document preprocessing and fuzzy unsupervised character classification for automatically reading office documents received daily that have complex layout structures, such as multiple columns and mixed-mode contents of text, graphics and half-tone pictures. First, a block segmentation algorithm based on a simple two-step run-length smoothing decomposes a document into single-mode blocks. Next, block classification based on clustering rules assigns each block to one of several types, such as text, horizontal or vertical lines, graphics, and pictures. The mean white-to-black transition is shown to be an invariant for textual blocks, and is useful for block discrimination. A fuzzy model for unsupervised character classification is designed to improve the robustness, correctness, and speed of the character recognition system. The classification procedure is divided into two stages. The first stage separates the characters into seven typographical categories based on the word structures of a text line. The second stage uses pattern matching to classify the characters in each category into a set of fuzzy prototypes based on a nonlinear weighted similarity function. A fuzzy model of unsupervised character classification, which is more natural in the representation of prototypes for character matching, is defined and the weighted fuzzy similarity measure is explored. The characteristics of the fuzzy model are discussed and used to speed up the classification process. After classification, the character recognition procedure is simply applied to the limited versions of the fuzzy prototypes. To avoid information loss and extra distortion, a topography-based approach is proposed that is applied directly to the fuzzy prototypes to extract the skeletons. First, a convolution with a bell-shaped function is performed to obtain a smooth surface. 
Second, the ridge points are extracted by rule-based topographic analysis of the structure. Third, a membership function is assigned to ridge points with values indicating the degrees of membership with respect to the skeleton of an object. Finally, the significant ridge points are linked to form strokes of the skeleton, and the clues of eigenvalue variation are used to deal with degradation and preserve connectivity. Experimental results show that our algorithm can reduce the deformation of junction points and correctly extract the whole skeleton even when a character is broken into pieces. For characters that are merged together, the breaking candidates can be easily located by searching for the saddle points. A pruning algorithm is then applied at each breaking position. Lastly, a multiple context confirmation can be applied to increase the reliability of breaking hypotheses.
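
The run-length smoothing mentioned at the start of the abstract above can be sketched in one dimension as follows. This is only a generic illustration under stated assumptions: pixels are 0 (white) or 1 (black), only white runs bounded by black on both sides are filled, and the dissertation's specific two-step variant is not reproduced. The names `smooth_row`, `smooth_rows`, and `threshold` are hypothetical.

```python
def smooth_row(row, threshold):
    """Fill white runs (0s) of length <= threshold lying between black pixels (1s)."""
    out = list(row)
    last_black = None  # index of the most recent black pixel seen
    for i, pixel in enumerate(row):
        if pixel == 1:
            gap = i - last_black - 1 if last_black is not None else 0
            if 0 < gap <= threshold:
                for k in range(last_black + 1, i):
                    out[k] = 1  # short white gap becomes black
            last_black = i
    return out


def smooth_rows(image, threshold):
    """Apply the horizontal pass to every row of a binary image (list of lists)."""
    return [smooth_row(row, threshold) for row in image]


print(smooth_row([1, 0, 0, 1, 0, 0, 0, 0, 1, 0], threshold=2))
```

Applying the same function to the columns of the image gives a vertical pass. How the two passes are combined is a design choice that the abstract does not specify.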

    A study of holistic strategies for the recognition of characters in natural scene images

    Recognition and understanding of text in scene images is an important and challenging task. The importance can be seen in the context of tasks such as assisted navigation for the blind and providing directions to driverless cars, e.g. the Google car. Other applications include automated document archival services, mining text from images, and so on. The challenge comes from a variety of factors, like variable typefaces, uncontrolled imaging conditions, and various sources of noise corrupting the captured images. In this work, we study and address the fundamental problem of recognition of characters extracted from natural scene images, and contribute three holistic strategies to deal with this challenging task. Scene text recognition (STR) has been a known problem in the computer vision and pattern recognition community for over two decades, and is still an active area of research because recognition performance still has considerable room for improvement. Recognition of characters lies at the heart of STR and is a crucial component of a reliable STR system. Most of the current methods rely heavily on the discriminative power of local features, such as histograms of oriented gradient (HoG), scale invariant feature transform (SIFT), shape contexts (SC), geometric blur (GB), etc. One of the problems with such methods is that the local features are rasterized in an ad hoc manner to get a single vector for subsequent use in recognition. This rearrangement of features clearly perturbs the spatial correlations that may carry crucial information vis-à-vis recognition. Moreover, such approaches, in general, do not take into account the rotational invariance property, which often leads to failed recognition in cases where characters in scene images do not occur in an upright position. 
To eliminate this local feature dependency and the associated problems, we propose the following three holistic solutions. The first one is based on modelling character images of a class as a 3-mode tensor and then factoring it into a set of rank-1 matrices and the associated mixing coefficients. Each set of rank-1 matrices spans the solution subspace of a specific image class and enables us to capture the required holistic signature for each character class along with the mixing coefficients associated with each character image. During recognition, we project each test image onto the candidate subspaces to derive its mixing coefficients, which are eventually used for final classification. The second approach we study in this work lets us form a novel holistic feature for character recognition based on the active contour model, also known as snakes. Our feature vector is based on two variables, direction and distance, cumulatively traversed by each point as the initial circular contour evolves under the force field induced by the character image. The initial contour design, in conjunction with a cross-correlation based similarity metric, enables us to account for rotational variance in the character image. Our third approach is based on modelling a 3-mode tensor via rotation of a single image. This differs from our tensor based approach described above in that we form the tensor using a single image instead of collecting a specific number of samples of a particular class. In this case, to generate a 3D image cube, we rotate an image through a predefined range of angles. This enables us to explicitly capture rotational variance and leads to better performance than various local approaches. Finally, as an application, we use our holistic model to recognize word images extracted from natural scenes. Here we first use our novel word segmentation method based on image seam analysis to split a scene word into individual character images. 
We then apply our holistic model to recognize individual letters and use a spell-checker module to get the final word prediction. Throughout our work, we employ popular scene text datasets, like Chars74K-Font, Chars74K-Image, SVT, and ICDAR03, which include synthetic and natural image sets, to test the performance of our strategies. We compare the results of our recognition models with several baseline methods and show comparable or better performance than several local feature-based methods, thus justifying the importance of holistic strategies.

    PEOPLE COUNTING AND RUNNER IDENTIFICATION IN ATHLETIC RACES

    The objective of this project is to create software capable of analyzing a video sequence of running competitions. The analysis consists of detecting the runners, tracking them with the intention of knowing their position when they cross the finish line and counting them. Another functionality of the system will be recognizing the bib numbers, thus making it possible for every runner to get their time. The software was developed studying different techniques of object detection, tracking and character recognition to try to choose the best for this specific application. A set of experiments has been performed to validate the proposed system

    PUBLIC OCR SIGNAGE RECOGNITION WITH SKEW & SLANT CORRECTION FOR VISUALLY IMPAIRED PEOPLE

    This paper presents an OCR hybrid recognition model for Visually Impaired People (VIP). VIP often encounter problems navigating independently because they are blind or have poor vision. They are always being discriminated against because of their limitations, which can lead to depression. Thus, they require efficient technological assistance to help them in their daily activities. The objective of this paper is to propose a hybrid model for Optical Character Recognition (OCR) that detects and corrects skewed and slanted characters on public signage. The proposed hybrid model should be able to integrate with a speech synthesizer for VIP signage recognition. The proposed hybrid model will capture an image of public signage and convert it into machine-readable text in a text file. The text will then be read by a speech synthesizer and translated to voice as the output. In the paper, a hybrid model consisting of the Canny method, Hough transformation and shearing transformation is used to detect and correct skewed and slanted images. An experiment was conducted to test the hybrid model's performance on 5 blindfolded subjects. The OCR hybrid recognition model achieved a Recognition Rate (RR) of 82.7%. The proposed hybrid model, which integrates OCR and a speech synthesizer, thereby demonstrates this concept of public signage recognition.
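
To make the shearing step named in the abstract above concrete, here is a minimal sketch of slant correction on point coordinates. It assumes the slant angle has already been estimated (the paper uses the Canny method and Hough transformation for detection, which are not shown), that `y` is measured upward from the baseline, and that the slant is measured from the vertical. The function name `deslant` is hypothetical.

```python
import math


def deslant(points, slant_degrees):
    """Shear (x, y) points horizontally so strokes slanted by slant_degrees become vertical."""
    k = math.tan(math.radians(slant_degrees))  # horizontal shift per unit of height
    return [(x - k * y, y) for x, y in points]


# A stroke leaning 45 degrees to the right: x grows as fast as y.
stroke = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
for x, y in deslant(stroke, 45):
    print(round(x, 6), y)
```

On a raster image the same shear would be applied per row, shifting each row by an amount proportional to its height above the baseline.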

    Advances in Character Recognition

    This book presents advances in character recognition. It consists of 12 chapters that cover a wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field, and for all interested in the subject.

    A complete document analysis and recognition system for GNU/Linux

    Regular OCR engines simply "read" an image without considering its structure or layout. A document's layout is very important for understanding the document. Hence, OCR engines alone are not enough to faithfully convert an image of a document to an electronic format. Document Analysis and Recognition (DAR) encompasses the task of recognizing a document's structure, which, combined with an OCR engine, can result in a faithful conversion of a document to an editable format. Such systems exist as commercial applications with no real equivalent in Free Software at present, and they are not available for the GNU/Linux operating system. The work described in this report attempts to answer this problem by offering a solution that combines only Free Software components and is comparable, even in its early stage, to available commercial solutions.