6,045 research outputs found
Probabilistic mathematical formula recognition using a 2D context-free graph grammar
We present a probabilistic framework for the mathematical expression recognition problem. The developed system is flexible in that its grammar can be extended easily thanks to its graph grammar which eliminates the need for specifying rule precedence. It is also optimal in the sense that all possible interpretations of the expressions are expanded without making early commitments or hard decisions. In this paper, we give an overview of the whole system and describe in detail the graph grammar and the parsing process used in the system, along with some preliminary results on character, structure and expression recognition performances
Recognition techniques for online Arabic handwriting recognition systems
Online recognition of Arabic handwritten text has been an on-going research problem for many years. Generally,
online text recognition field has been gaining more interest
lately due to the increasing popularity of hand-held computers, digital notebooks and advanced cellular phones. However, different techniques have been used to build several online handwritten recognition systems for Arabic text, such as Neural Networks, Hidden Markov Model, Template Matching and others. Most of the researches on online text recognition have divided the recognition system into these three main phases which are preprocessing phase, feature extraction phase and recognition phase which considers as the most important phase and the heart of the whole system. This paper presents and compares techniques that have been used to recognize the Arabic handwriting scripts in online recognition systems. Those techniques attempt to recognize Arabic handwritten words, characters, digits or strokes. The structure and strategy of those reviewed techniques are explained in this article. The strengths and weaknesses of using these techniques will also be discussed
Learning Model Structure from Data : an Application to On-Line Handwriting
We present a learning strategy for Hidden Markov Models that may be used to cluster handwriting sequences or to learn a character model by identifying its main writing styles. Our approach aims at learning both the structure and parameters of a Hidden Markov Model (HMM) from the data. A byproduct of this learning strategy is the ability to cluster signals and identify allograph. We provide experimental results on artificial data that demonstrate the possibility to learn from data HMM parameters and topology. For a given topology, our approach outperforms in some cases that we identify standard Maximum Likelihood learning scheme. We also apply our unsupervised learning scheme on on-line handwritten signals for allograph clustering as well as for learning HMM models for handwritten digit recognition
Image processing for the extraction of nutritional information from food labels
Current techniques for tracking nutritional data require undesirable amounts of either time or man-power. People must choose between tediously recording and updating dietary information or depending on unreliable crowd-sourced or costly maintained databases. Our project looks to overcome these pitfalls by providing a programming interface for image analysis that will read and report the information present on a nutrition label directly. Our solution involves a C++ library that combines image pre-processing, optical character recognition, and post-processing techniques to pull the relevant information from an image of a nutrition label. We apply an understanding of a nutrition label\u27s content and data organization to approach the accuracy of traditional data-entry methods. Our system currently provides around 80% accuracy for most label images, and we will continue to work to improve our accuracy
Text search engine for digitized historical book
Abstract. There’s need to digitalize numerous historical books and texts and make it possible to read them electronically. Also it is often wanted to preserve their original appearance, not just the text itself. For these operations there is a need for systems, which understand the books and text as they are and are able to distinguish the text information from other context. Traditional optical character recognition systems perform well when processing modern printed text, but they might face problems with old handwritten texts. These types of texts need to be analyzed with systems, which can analyse and segment the text areas well from other irrelevant information. That is why it is important, that the document image segmentation works well. This thesis focuses on manual rectification, automatic segmentation and text line search on document images in Orationes project. When the document images are segmented and text lines found, information from XML transcript is used to find characters and words from the segmented document images. Search engine was developed with with Python programmin language. Python was chosen to ensure high platform independency.Tekstinhakujärjestelmä digitoidulle historialliselle kirjalle. Tiivistelmä. Lukuisia historiallisia kirjoja halutaan digitalisoida ja siirtää sähköisesti luettaviksi. Usein ne halutaan myös säilyttää alkuperäisessä ulkoasussaan. Tällaista operaatiota varten tarvitaan järjestelmiä, jotka osaavat ymmärtää kirjat ja tekstit sellaisinaan ja osaavat erottaa tekstin kirjan muusta kontekstista. Perinteiset optiset kirjaimentunnistusmenetelmät suorituvat hyvin painettujen tekstien analysoinnista, mutta ongelmia aiheuttavat käsinkirjoitetut vanhat tekstit. Tällaisten tekstien kohdalla dokumenttikuvat pitää pystyä ensin analysoimaan hyvin ja erottelemaan tekstialueet muusta tekstin kannalta irrelevantista informaatiosta. Siksi onkin tärkeää, että dokumenttikuvan segmentaatio onnistuu hyvin. Tässä työssä keskitytään Orationes projektin dokumenttikuvien manuaaliseen suoristamiseen, segmentaatioon ja tekstirivien löytämiseen. Lisäksi segmentaation jälkeen segmentoidusta dokumenttikuvasta yritetään löytää haluttuja kirjaimia ja sanoja, dokumenttikuvan XML transkriptista saadun informaation avulla. Hakumoottori toteutettiin Python ohjelmointikielellä, jotta saavutettiin alustariippumattomuus hakumoottorille
Content Recognition and Context Modeling for Document Analysis and Retrieval
The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval. Meanwhile, the processing of documents with unconstrained layouts and complex formatting often requires effective leveraging of broad contextual knowledge.
In this dissertation, we first present a novel approach for document image content categorization, using a lexicon of shape features. Each lexical word corresponds to a scale and rotation invariant local shape feature that is generic enough to be detected repeatably and is segmentation free. A concise, structurally indexed shape lexicon is learned by clustering and partitioning feature types through graph cuts. Our idea finds successful application in several challenging tasks, including content recognition of diverse web images and language identification on documents composed of mixed machine printed text and handwriting.
Second, we address two fundamental problems in signature-based document image retrieval. Facing continually increasing volumes of documents, detecting and recognizing unique, evidentiary visual entities (\eg, signatures and logos) provides a practical and reliable supplement to the OCR recognition of printed text. We propose a novel multi-scale framework to detect and segment signatures jointly from document images, based on the structural saliency under a signature production model. We formulate the problem of signature retrieval in the unconstrained setting of geometry-invariant deformable shape matching and demonstrate state-of-the-art performance in signature matching and verification.
Third, we present a model-based approach for extracting relevant named entities from unstructured documents. In a wide range of applications that require structured information from diverse, unstructured document images, processing OCR text does not give satisfactory results due to the absence of linguistic context. Our approach enables learning of inference rules collectively based on contextual information from both page layout and text features.
Finally, we demonstrate the importance of mining general web user behavior data for improving document ranking and other web search experience. The context of web user activities reveals their preferences and intents, and we emphasize the analysis of individual user sessions for creating aggregate models. We introduce a novel algorithm for estimating web page and web site importance, and discuss its theoretical foundation based on an intentional surfer model. We demonstrate that our approach significantly improves large-scale document retrieval performance
- …