6 research outputs found

    Accuracy improvement in Odia zip code recognition technique

    Odia is a very popular language in India, used by more than 45 million people worldwide, especially in the eastern region of the country. Recognition schemes proposed for scripts such as Roman, Japanese, Chinese and Arabic cannot be applied directly to Odia because of the different structure of the Odia script. Hence, this report deals with the recognition of Odia numerals while taking account of varying handwriting styles. The main purpose is to apply the recognition scheme to zip code extraction and number plate recognition. Two methods, the gradient-and-curvature method and the box-method approach, are used to compute features from the preprocessed scanned image document. Features from both methods are used to train an artificial neural network on a large number of samples of each numeral. A sufficient number of testing samples is used, and the results from both feature sets are compared. Principal component analysis is applied to reduce the dimension of the feature vector so as to ease further processing. The box-method features of an unknown numeral are correlated with those of the standard numerals. Using neural networks, the average recognition accuracies with gradient-and-curvature features and with box-method features are found to be 93.2% and 88.1%, respectively.
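The dimensionality-reduction step the abstract mentions can be sketched in a few lines. This is a generic PCA-via-SVD illustration, not the authors' code: the feature dimensions, sample counts, and the number of retained components are all assumptions.

```python
import numpy as np

# Hypothetical sketch: PCA reduction of handwritten-numeral feature
# vectors (e.g., box-method features), as described in the abstract.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 64))   # 200 samples, 64-dim features (assumed sizes)

# Center the data and obtain principal directions via SVD.
centered = features - features.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)

k = 16                                  # keep the top-16 components (assumed)
reduced = centered @ vt[:k].T           # (200, 16) reduced feature vectors
print(reduced.shape)
```

The reduced vectors would then feed the neural network or the template-correlation step in place of the raw features.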

    INSERT from Reality: A Schema-driven Approach to Image Capture of Structured Information

    3rd place in the undergraduate 3-Minute Thesis Competition.
    There is a tremendous amount of structured information locked away on document images, e.g., receipts, invoices, medical testing documents, and banking statements. However, the document images that retain this structured information are often ad hoc and vary between businesses, organizations, or time periods. Although optical character recognition allows us to digitize document images into sequences of words, there still does not exist a means to identify schema attributes in the words of these ad hoc images and extract them into a database. In this thesis, we push beyond optical character recognition: while current information extraction techniques use only optical character recognition from structured images, we infer the visual structure and combine it with the textual information on the document image to create a highly structured INSERT statement, ready to be executed against a database. We call this approach IFR. We use OCR to obtain the textual contents of the image. Our natural-language processing annotates these contents with relevant information such as data type. We also prune irrelevant words to improve performance in subsequent steps. In parallel to textual analysis, we visually segment the input document image, with no a priori information, to create a visual context window around each textual token. We merge the two analyses to augment the textual information with context from the visual context windows. Using analyst-defined heuristic functions, we score each of these context-enabled entities to probabilistically construct the final INSERT statement. We evaluated IFR on three real-world datasets and achieved F1 scores of over 83% in INSERT generation, spending approximately 2 seconds per image on average.
    Comparing IFR to natural language processing approaches, such as regular expressions and conditional random fields, we found IFR to perform better at detecting the correct schema attributes. To compare IFR to a human baseline, we conducted a user study to establish the human baseline of INSERT quality on our datasets and found that IFR produced INSERT statements that were comparable to or exceeded that baseline.
    National Science Foundation. No embargo. Academic Major: Computer Science and Engineering
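The heuristic-scoring step the abstract describes can be sketched as follows. This is a hypothetical illustration, not IFR's actual code: the token fields, the scoring function, the `receipts` table, and the schema are all assumptions made for the example.

```python
# Hypothetical sketch: analyst-defined heuristics score each
# context-enabled token per schema attribute; the best-scoring token
# fills that attribute's slot in the generated INSERT statement.

def looks_like_money(tok):
    # Crude money check: digits with an optional "$" and decimal point.
    return tok["text"].replace(".", "", 1).replace("$", "", 1).isdigit()

def score(attr, tok):
    s = 0.0
    if attr == "total" and looks_like_money(tok):
        s += 1.0
    if attr in tok["context"]:   # visual-context window mentions the attribute
        s += 0.5
    return s

tokens = [
    {"text": "$12.50", "context": ["total"]},
    {"text": "Main St", "context": ["address"]},
]
schema = ["total"]

values = {a: max(tokens, key=lambda t: score(a, t))["text"] for a in schema}
insert = "INSERT INTO receipts ({}) VALUES ({});".format(
    ", ".join(values), ", ".join(repr(v) for v in values.values())
)
print(insert)
```

In the thesis's pipeline the context windows come from visual segmentation and the heuristics are analyst-defined; here both are stand-ins to show the shape of the final scoring-and-construction step.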

    Development of image processing and vision systems with industrial applications

    Ph.D. (Doctor of Philosophy)

    Neural Network Based Systems for Handprint OCR Applications

    No full text
    In this paper, we present a detailed exposition of our NN-based approach to a practical OCR problem (March 20, 1998 draft for IEEE Transactions on Image Processing). In particular, our focus is on a specific handprint recognition problem which meets three constraints. First, the data is on forms, which implies that the data of interest is found in predetermined locations. Second, the image quality is sufficient to provide legible images, which implies that the forms have adequate space to enter the required information and that the scanning resolution is sufficient to resolve the text. Third, the material to be read from the form has specific, known content, which restricts the lexicon of expected answers. The success or failure of form-based OCR is strongly influenced by the degree to which the application adheres to these constraints. Good form design is essential to economical OCR-based data input and is application-specific. High-quality scanning and good image quality are necessary to lower processing cost and allow the use of electronic images without extensive cleanup processing. Prior knowledge of field content is important both for recognition correction and for the detection of human errors in placing information on forms. We describe the design of a NN classifier for handprint classification problems. This NN is an improved MLP, with the enhancements coming from well-conditioned Jacobians, successive regularization, and Boltzmann pruning techniques. The enhanced MLP improves the error-reject performance on handprint classification problems by factors of 2 to 4 while at the same time reducing the complexity (number of non-zero weights) of the network by about 40% to 60%. The effectiveness of the NN classifier is further illustrated by integrating...
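The complexity reduction the abstract quantifies (fewer non-zero weights) can be illustrated with a minimal sketch. Note the stand-in: this uses simple magnitude-threshold pruning, not the paper's actual Boltzmann pruning procedure, and the layer sizes and pruning fraction are assumptions.

```python
import numpy as np

# Illustrative sketch: magnitude-based pruning of one MLP weight matrix,
# as a stand-in for the Boltzmann pruning the paper names.
rng = np.random.default_rng(1)
w = rng.normal(scale=0.5, size=(32, 10))   # hidden->output weights (assumed shape)

# Zero out the smallest 50% of weights by absolute value.
threshold = np.quantile(np.abs(w), 0.5)
pruned = np.where(np.abs(w) < threshold, 0.0, w)

kept = np.count_nonzero(pruned) / w.size
print(f"fraction of weights kept: {kept:.2f}")
```

In the paper's setting, pruning is interleaved with training (together with regularization) so that accuracy is preserved while roughly half the weights are removed; this sketch only shows the weight-removal arithmetic.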