1,160 research outputs found

    Content Recognition and Context Modeling for Document Analysis and Retrieval

    The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval. Meanwhile, the processing of documents with unconstrained layouts and complex formatting often requires effective leveraging of broad contextual knowledge. In this dissertation, we first present a novel approach for document image content categorization, using a lexicon of shape features. Each lexical word corresponds to a scale- and rotation-invariant local shape feature that is generic enough to be detected repeatably and is segmentation free. A concise, structurally indexed shape lexicon is learned by clustering and partitioning feature types through graph cuts. Our idea finds successful application in several challenging tasks, including content recognition of diverse web images and language identification on documents composed of mixed machine-printed text and handwriting. Second, we address two fundamental problems in signature-based document image retrieval. As document volumes continue to grow, detecting and recognizing unique, evidentiary visual entities (e.g., signatures and logos) provides a practical and reliable supplement to OCR of printed text. We propose a novel multi-scale framework to detect and segment signatures jointly from document images, based on the structural saliency under a signature production model. We formulate the problem of signature retrieval in the unconstrained setting of geometry-invariant deformable shape matching and demonstrate state-of-the-art performance in signature matching and verification. Third, we present a model-based approach for extracting relevant named entities from unstructured documents. In a wide range of applications that require structured information from diverse, unstructured document images, processing OCR text does not give satisfactory results due to the absence of linguistic context. Our approach enables learning of inference rules collectively based on contextual information from both page layout and text features. Finally, we demonstrate the importance of mining general web user behavior data for improving document ranking and other aspects of the web search experience. The context of web user activities reveals their preferences and intents, and we emphasize the analysis of individual user sessions for creating aggregate models. We introduce a novel algorithm for estimating web page and web site importance, and discuss its theoretical foundation based on an intentional surfer model. We demonstrate that our approach significantly improves large-scale document retrieval performance.

    A unified method for augmented incremental recognition of online handwritten Japanese and English text

    We present a unified method for augmented incremental recognition of online handwritten Japanese and English text, which is used for busy or on-the-fly recognition while writing, and lazy or delayed recognition after writing, without incurring long waiting times. It extends the local context for segmentation and recognition to a range of recent strokes called the "segmentation scope" and the "recognition scope", respectively. The recognition scope lies inside the segmentation scope. The augmented incremental recognition triggers recognition at every several recent strokes, updates the segmentation and recognition candidate lattice, and searches over the lattice for the best result incrementally. It also incorporates three techniques. The first is to reuse the segmentation and recognition candidate lattice in the previous recognition scope for the current recognition scope. The second is to fix undecided segmentation points if they are stable between character/word patterns. The third is to skip recognition of partial candidate character/word patterns. The augmented incremental method includes the case of triggering recognition at every new stroke with the above-mentioned techniques. Experiments conducted on the TUAT-Kondate and IAM online databases show its superiority to batch recognition (recognizing text at one time) and pure incremental recognition (recognizing text at every input stroke) in processing time, waiting time, and recognition accuracy.
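The relationship between the trigger interval and the two scopes described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `scope_windows` and its parameters are hypothetical, strokes are reduced to indices, and the lattice update, lattice reuse, and stable-point fixing steps are left out. It only shows which recent strokes would fall in each scope each time recognition is triggered.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # half-open stroke index range (start, end)


def scope_windows(
    n_strokes: int, trigger_every: int, seg_scope: int, rec_scope: int
) -> List[Tuple[int, Window, Window]]:
    """List (strokes_seen, segmentation_window, recognition_window) per trigger.

    All names and parameters are illustrative assumptions, not taken from
    the paper. Recognition fires after every `trigger_every` strokes, plus
    once at the end if some strokes are left over.
    """
    if trigger_every < 1:
        raise ValueError("trigger_every must be at least 1")
    if rec_scope > seg_scope:
        # The abstract states that the recognition scope lies inside
        # the segmentation scope.
        raise ValueError("recognition scope must not exceed segmentation scope")

    # Stroke counts at which recognition is triggered.
    ends = list(range(trigger_every, n_strokes + 1, trigger_every))
    if n_strokes > 0 and n_strokes % trigger_every != 0:
        ends.append(n_strokes)  # final trigger for leftover strokes

    windows = []
    for end in ends:
        seg = (max(0, end - seg_scope), end)  # most recent seg_scope strokes
        rec = (max(0, end - rec_scope), end)  # most recent rec_scope strokes
        windows.append((end, seg, rec))
    return windows


if __name__ == "__main__":
    for seen, seg, rec in scope_windows(7, 3, 6, 4):
        print(f"after {seen} strokes: segment {seg}, recognize {rec}")
```

Setting `trigger_every` to 1 corresponds to the case the abstract mentions of triggering recognition at every new stroke.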

    Advances in Character Recognition

    This book presents advances in character recognition. It consists of 12 chapters that cover a wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field, and for anyone interested in the subject.

    Character Recognition

    Character recognition is one of the pattern recognition technologies that are most widely used in practical applications. This book presents recent advances that are relevant to character recognition, from technical topics such as image processing, feature extraction or classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field.

    Novel Heuristic Recurrent Neural Network Framework to Handle Automatic Telugu Text Categorization from Handwritten Text Image

    In the near future, the digitization and processing of existing paper documents will play an important role in creating a paperless environment. Deep learning techniques for handwriting recognition have been extensively studied by various researchers. Deep neural networks can be trained quickly thanks to large amounts of data and algorithmic advances. Various methods for extracting text from handwritten manuscripts have been developed in the literature. Neural network approaches such as convolutional neural networks (CNN), recurrent neural networks (RNN), and long short-term memory (LSTM) have been used to extract features from images of handwritten Telugu text. Deep learning approaches are widely used to identify handwritten Telugu text, and various techniques for identifying Telugu text in documents appear in the literature. To identify handwritten Telugu script automatically and efficiently, eliminating noise and other semantic features present in Telugu text, this paper proposes the Novel Heuristic Advanced Neural Network based Telugu Text Categorization Model (NHANNTCM), built on a sequence-to-sequence feature extraction procedure. The proposed approach extracts features using an RNN and represents Telugu text in a sequence-to-sequence format for identification; the advanced neural network performs both encoding and decoding to identify and explore visual features in the sequence of Telugu text in the input data. The classification accuracy rates for Telugu words, Telugu numerals, Telugu characters, Telugu sentences, and the corresponding Telugu sentences were 99.66%, 93.63%, 91.36%, 99.05%, and 97.73%, respectively. Experimental evaluation describe extracted with revealed which are textured i.e. TENG shown considerable operations in applications such as private information protection, security defense, and personal handwriting signature identification.

    Pen-based Methods For Recognition and Animation of Handwritten Physics Solutions

    There has been considerable interest in constructing pen-based intelligent tutoring systems due to the natural interaction metaphor and low cognitive load afforded by pen-based interaction. We believe that pen-based intelligent tutoring systems can be further enhanced by integrating animation techniques. In this work, we explore methods for recognizing and animating sketched physics diagrams. Our methodologies enable an Intelligent Tutoring System (ITS) to understand the scenario and requirements posed by a given problem statement and to couple this knowledge with a computational model of the student's handwritten solution. These pieces of information are used to construct meaningful animations and feedback mechanisms that can highlight errors in student solutions. We have constructed a prototype ITS that can recognize mathematics and diagrams in a handwritten solution and infer implicit relationships among diagram elements, mathematics and annotations such as arrows and dotted lines. We use natural language processing to identify the domain of a given problem, and use this information to select one or more of four domain-specific physics simulators to animate the user's sketched diagram. We enable students to use their answers to guide animation behavior and also describe a novel algorithm for checking recognized student solutions. We provide examples of scenarios that can be modeled using our prototype system and discuss the strengths and weaknesses of our current prototype. Additionally, we present the findings of a user study that aimed to identify animation requirements for physics tutoring systems. We describe a taxonomy for categorizing different types of animations for physics problems and highlight how the taxonomy can be used to define requirements for 50 physics problems chosen from a university textbook. We also present a discussion of 56 handwritten solutions acquired from physics students and describe how suitable animations could be constructed for each of them.

    Freeform User Interfaces for Graphical Computing

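    Report number: Kō 15222; Date of degree conferral: 2000-03-29; Degree category: course doctorate (katei hakase); Degree type: Doctor of Engineering; Diploma number: Hakukō No. 4717; Graduate school and major: Graduate School of Engineering, Information Engineering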

    Text Extraction From Natural Scene: Methodology And Application

    With the popularity of the Internet and smart mobile devices, there is an increasing demand for techniques and applications of image- and video-based analytics and information retrieval. Most of these applications can benefit from text information extraction in natural scenes. However, scene text extraction is a challenging problem, due to the cluttered backgrounds of natural scenes and the multiple patterns of scene text itself. To solve these problems, this dissertation proposes a framework of scene text extraction. Scene text extraction in our framework is divided into two components, detection and recognition. Scene text detection finds the regions containing text in camera-captured images and videos. Text layout analysis based on gradient and color analysis is performed to extract candidate text strings from the cluttered background of a natural scene. Text structural analysis is then performed to design effective text structural features for distinguishing text from non-text outliers among the candidate text strings. Scene text recognition transforms image-based text in the detected regions into readable text codes. The most basic and significant step in text recognition is scene text character (STC) prediction, which is multi-class classification among a set of text character categories. We design robust and discriminative feature representations for STC structure by integrating multiple feature descriptors, coding/pooling schemes, and learning models. Experimental results on benchmark datasets demonstrate the effectiveness and robustness of our proposed framework, which obtains better performance than previously published methods. Our proposed scene text extraction framework is applied to four scenarios: 1) reading printed labels on grocery packages for hand-held object recognition; 2) combining with car detection to localize license plates in camera-captured natural scene images; 3) reading indicative signage for assistive navigation in indoor environments; and 4) combining with object tracking to perform scene text extraction in video-based natural scenes. The proposed prototype systems and associated evaluation results show that our framework is able to solve the challenges in real applications.

    Sketch recognition of digital ink diagrams : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Palmerston North, New Zealand

    Figures are either re-used with permission, or abstracted with permission from the source article. Sketch recognition of digital ink diagrams is the process of automatically identifying hand-drawn elements in a diagram. This research focuses on the simultaneous grouping and recognition of shapes in digital ink diagrams. In order to recognise a shape, we need to group the strokes belonging to that shape; however, strokes cannot be grouped until the shape is identified. Therefore, we treat grouping and recognition as a simultaneous task. Our grouping technique uses spatial proximity to hypothesise shape candidates. Many of the hypothesised shape candidates are invalid, so we need a way to reject them. We present a novel rejection technique based on novelty detection. The rejection method uses proximity measures to validate a shape candidate. In addition, we investigate improving the accuracy of the current shape recogniser by adding extra features. We also present a novel connector recognition system that localises connector heads around recognised shapes. We perform a full comparative study on two datasets. The results show that our approach is significantly more accurate in finding shapes and faster on process diagrams than that of Stahovich et al. (2014), demonstrating its superiority in both computation time and accuracy. Furthermore, we evaluate our system on two public datasets and compare our results with other approaches reported in the literature that have used these datasets. The results show that our approach is more accurate in finding and recognising the shapes in the FC dataset (by finding and recognising 91.7% of the shapes) compared to the reported results in the literature.
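The idea of hypothesising shape candidates from spatial proximity, as described in this abstract, can be sketched as follows. This is an illustrative assumption, not the thesis's algorithm: strokes are reduced to bounding boxes, proximity is the gap between boxes, and the names `box_gap`, `shape_candidates`, `max_gap`, and `max_size` are hypothetical. The rejection step based on novelty detection is not implemented here.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def box_gap(a: Box, b: Box) -> float:
    """Euclidean gap between two axis-aligned boxes (0.0 if they overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return (dx * dx + dy * dy) ** 0.5


def shape_candidates(
    boxes: List[Box], max_gap: float, max_size: int
) -> Set[FrozenSet[int]]:
    """Enumerate proximity-connected stroke groups of up to max_size strokes.

    Each candidate is a frozenset of stroke indices. A stroke joins a group
    only if it lies within max_gap of some stroke already in the group.
    """
    n = len(boxes)
    near: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if box_gap(boxes[i], boxes[j]) <= max_gap:
            near[i].add(j)
            near[j].add(i)

    # Start from single strokes and grow groups one neighbour at a time.
    candidates: Set[FrozenSet[int]] = {frozenset([i]) for i in range(n)}
    frontier = set(candidates)
    for _ in range(max_size - 1):
        grown: Set[FrozenSet[int]] = set()
        for group in frontier:
            for i in group:
                for j in near[i] - group:
                    grown.add(group | {j})
        grown -= candidates
        candidates |= grown
        frontier = grown
    return candidates


if __name__ == "__main__":
    strokes = [(0, 0, 1, 1), (1.5, 0, 2.5, 1), (10, 10, 11, 11)]
    for cand in sorted(shape_candidates(strokes, 1.0, 2), key=sorted):
        print(sorted(cand))
```

Enumerating groups this way produces many candidates that do not correspond to real shapes, which is why a separate validation step such as the rejection technique mentioned in the abstract would be needed afterwards.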