    On virtual partitioning of large dictionaries for contextual post-processing to improve character recognition

    This paper presents a new approach to partitioning large dictionaries by means of virtual views. The basic idea is to employ additional knowledge sources from text recognition and text analysis for fast dictionary look-up, pruning the search space through static or dynamic views. The heart of the system is a redundant hashing technique that uses a set of hash functions to handle noisy input efficiently. Currently, the system has two main components: the dictionary generator and the dictionary controller. The dictionary generator initially builds the system from profiles and source dictionaries, while the controller allows the flexible integration of different search heuristics. Results show that the system achieves a respectable speed-up of dictionary access time.
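To make the idea of redundant hashing concrete, here is a minimal sketch in Python. It is not the paper's implementation: the four key functions, and the names `keys`, `build_index`, and `lookup`, are assumptions chosen for illustration. The point is only that each word is indexed under several partial keys, so a single misrecognized character spoils some keys but not all of them.

```python
from collections import defaultdict

def keys(word):
    """Return several partial keys for a word (hypothetical key functions)."""
    mid = len(word) // 2
    return [
        ("len+prefix", len(word), word[:mid]),   # first half plus length
        ("len+suffix", len(word), word[mid:]),   # second half plus length
        ("even", word[0::2]),                    # characters at even positions
        ("odd", word[1::2]),                     # characters at odd positions
    ]

def build_index(words):
    """Index every word under all of its keys."""
    index = defaultdict(set)
    for w in words:
        for k in keys(w):
            index[k].add(w)
    return index

def lookup(index, noisy):
    """Collect every dictionary word that shares at least one key with the input."""
    candidates = set()
    for k in keys(noisy):
        candidates |= index.get(k, set())
    return candidates

index = build_index(["recognition", "dictionary", "character"])
print(lookup(index, "dictlonary"))  # 'i' misread as 'l'; prints {'dictionary'}
```

A sketch like this only tolerates substitutions that leave at least one key intact; insertions and deletions would need different key functions.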

    Text Extraction From Natural Scene: Methodology And Application

    With the popularity of the Internet and smart mobile devices, there is an increasing demand for techniques and applications of image- and video-based analytics and information retrieval. Most of these applications can benefit from text information extraction in natural scenes. However, scene text extraction is a challenging problem because of the cluttered backgrounds of natural scenes and the varied patterns of scene text itself. To address these problems, this dissertation proposes a framework for scene text extraction, divided into two components: detection and recognition. Scene text detection finds the regions containing text in camera-captured images and videos. Text layout analysis based on gradient and color analysis is performed to extract candidate text strings from the cluttered background. Text structural analysis is then performed to design effective structural features for distinguishing text from non-text outliers among the candidates. Scene text recognition transforms image-based text in the detected regions into readable text codes. The most basic and significant step in text recognition is scene text character (STC) prediction, which is a multi-class classification among a set of text character categories. We design robust and discriminative feature representations for STC structure by integrating multiple feature descriptors, coding/pooling schemes, and learning models. Experimental results on benchmark datasets demonstrate the effectiveness and robustness of the proposed framework, which obtains better performance than previously published methods.
    The proposed scene text extraction framework is applied to four scenarios: 1) reading printed labels on grocery packages for hand-held object recognition; 2) combining with car detection to localize license plates in camera-captured natural scene images; 3) reading indicative signage for assistive navigation in indoor environments; and 4) combining with object tracking to perform scene text extraction in video of natural scenes. The proposed prototype systems and associated evaluation results show that the framework is able to address the challenges of real applications.

    Advances in Character Recognition

    This book presents advances in character recognition. It consists of 12 chapters that cover a wide range of topics on different aspects of character recognition. It is intended to serve as a reference source for academic research, for professionals working in the character recognition field, and for anyone interested in the subject.

    An investigation into the use of linguistic context in cursive script recognition by computer

    The automatic recognition of handwritten text has been a goal for over thirty-five years. The highly ambiguous nature of cursive writing (with high variability not only between different writers, but even between different samples from the same writer) means that systems based only on visual information are prone to errors. It is suggested that applying linguistic knowledge to the recognition task may improve recognition accuracy. If a low-level (pattern-recognition-based) recogniser produces a candidate lattice (i.e. a directed graph giving a number of alternatives at each word position in a sentence), then linguistic knowledge can be used to find the 'best' path through the lattice. Many forms of linguistic knowledge may be used to this end. This thesis looks specifically at the use of collocation as a source of linguistic knowledge. Collocation describes the statistical tendency of certain words to co-occur in a language within a defined range. It is suggested that this tendency may be exploited to aid automatic text recognition. The construction and use of a post-processing system incorporating collocational knowledge is described, as are a number of experiments designed to test the effectiveness of collocation as an aid to text recognition. The results of these experiments suggest that collocational statistics may be a useful form of knowledge for this application and that further research may produce a system of real practical use.
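The lattice idea in the abstract above can be sketched in a few lines of Python. This is an illustrative sketch, not the thesis's system: the collocation table, its scores, the `window` parameter, and the function names are all made up for the example, and recogniser confidences are ignored. The lattice is a list of alternatives per word position; every path is scored by summing collocation strengths for word pairs that fall within the window.

```python
from itertools import product

def path_score(path, colloc, window=2):
    """Sum collocation scores for all word pairs at most `window` positions apart."""
    score = 0.0
    for i, w in enumerate(path):
        for j in range(i + 1, min(i + 1 + window, len(path))):
            # Pairs are unordered here, so they are stored as frozensets.
            score += colloc.get(frozenset((w, path[j])), 0.0)
    return score

def best_path(lattice, colloc, window=2):
    """Exhaustively score every path through the lattice and return the best one."""
    return max(product(*lattice), key=lambda p: path_score(p, colloc, window))

# Hypothetical recogniser output: one list of candidates per word position.
lattice = [["the"], ["strong", "string"], ["tea", "sea"]]
# Hypothetical collocation strengths.
colloc = {
    frozenset(("strong", "tea")): 2.0,
    frozenset(("string", "sea")): 0.1,
}
print(best_path(lattice, colloc))  # ('the', 'strong', 'tea')
```

Exhaustive enumeration grows exponentially with sentence length, so a practical system would need a dynamic-programming or beam search over the lattice instead.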

    Handwritten character recognition using a gradient based feature extraction

    Handwriting Recognition is the task of transforming a language represented in its spatial form of graphical marks into its symbolic representation. Offline Handwriting Recognition has three steps: preprocessing of the image, segmentation of words into characters, and recognition of the characters. In this thesis I implemented two methods for character recognition, which is the most important step in Offline Handwriting Recognition. The heart of character recognition is the set of features extracted from the character image, and the accuracy of classification depends on the quality of those features. The two methods presented in this thesis use two different types of features: one uses the connectivity features among various segments in a character image, and the other uses the gradient feature at each pixel to construct the feature vectors. Both methods are discussed in detail in the following chapters.
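As a rough illustration of building a feature vector from per-pixel gradients, the sketch below computes central-difference gradients on a grayscale image and accumulates gradient magnitude into orientation bins. The thesis does not specify these details here, so the choice of central differences, the 8 bins, the normalisation, and the function name `gradient_features` are assumptions for the example.

```python
import math

def gradient_features(img, bins=8):
    """Magnitude-weighted histogram of gradient directions over interior pixels.

    `img` is a 2D list of grayscale values; border pixels are skipped.
    """
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]   # horizontal central difference
            gy = img[y + 1][x] - img[y - 1][x]   # vertical central difference
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue                         # flat region contributes nothing
            angle = math.atan2(gy, gx) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            hist[b] += mag
    total = sum(hist)
    # Normalise so the vector does not depend on overall stroke contrast.
    return [v / total for v in hist] if total else hist

# A 5x5 image with a vertical edge: dark on the left, bright on the right.
edge = [[0, 0, 1, 1, 1] for _ in range(5)]
print(gradient_features(edge))  # all weight falls in bin 0 (gradient points right)
```

In practice such histograms are usually computed per cell of a grid laid over the character image and concatenated, which preserves some spatial layout; the single global histogram here is the simplest variant.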

    DH Benelux Journal 4. The Humanities in a Digital World

    The fourth volume of the DH Benelux Journal. This volume includes seven full-length, peer-reviewed articles that are based on accepted contributions to the 2021 virtual DH Benelux conference. Contents: 1. Editors' Preface (Wout Dillen, Margherita Fantoli, Marijn Koolen, Marieke van Erp); 2. Introduction: The Humanities in a Digital World (Lorella Viola, Jelena Prokic, Antske Fokkens, Tommaso Caselli); 3. A Game of Persistence, Self-doubt, and Curiosity: Surveying Code Literacy in Digital Humanities (Elli Bleeker, Marijn Koolen, Kaspar Beelen, Liliana Melgar, Joris van Zundert, Sally Chambers); 4. Introducing the DHARPA Project: An Interdisciplinary Lab to Enable Critical DH Practice (Angela R. Cunningham, Helena Jaskov, Sean Takats, Lorella Viola); 5. Examining a Multi Layered Approach for Classification of OCR Quality without Ground Truth (Mirjam Cuper); 6. Modeling Ontologies for Individual Artists: A Case Study of a Dutch Ceramic Glass Sculptor (Victor de Boer, Daan Raven, Erik Esmeijer, Johan Oome); 7. Judging a Book by its Criticism: A Digital Analysis of the Professional and Community Driven Literary Criticism of the Ingeborg-Bachmann-Preis (Lore De Greve, Gunther Martens); 8. When No News is Bad News. News-Based Change Detection during COVID-19 (Kristoffer L. Nielbo, Frida Hæstrup, Kenneth C. Enevoldsen, Peter B. Vahlstrup, Rebekah B. Baglini, Andreas Roepstorff); 9. Combining Tools with Linked Data: a Social History Example (Ivo Zandhuis)