19,229 research outputs found

    Advances in Character Recognition

    Get PDF
    This book presents advances in character recognition, and it consists of 12 chapters that cover wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field and for all interested in the subject

    Character Recognition

    Get PDF
    Character recognition is one of the pattern recognition technologies that are most widely used in practical applications. This book presents recent advances that are relevant to character recognition, from technical topics such as image processing, feature extraction or classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field

    Document Image Analysis Techniques for Handwritten Text Segmentation, Document Image Rectification and Digital Collation

    Get PDF
    Document image analysis comprises all the algorithms and techniques that are utilized to convert an image of a document to a computer readable description. In this work we focus on three such techniques, namely (1) Handwritten text segmentation (2) Document image rectification and (3) Digital Collation. Offline handwritten text recognition is a very challenging problem. Aside from the large variation of different handwriting styles, neighboring characters within a word are usually connected, and we may need to segment a word into individual characters for accurate character recognition. Many existing methods achieve text segmentation by evaluating the local stroke geometry and imposing constraints on the size of each resulting character, such as the character width, height and aspect ratio. These constraints are well suited for printed texts, but may not hold for handwritten texts. Other methods apply holistic approach by using a set of lexicons to guide and correct the segmentation and recognition. This approach may fail when the domain lexicon is insufficient. In the first part of this work, we present a new global non-holistic method for handwritten text segmentation, which does not make any limiting assumptions on the character size and the number of characters in a word. We conduct experiments on real images of handwritten texts taken from the IAM handwriting database and compare the performance of the presented method against an existing text segmentation algorithm that uses dynamic programming and achieve significant performance improvement. Digitization of document images using OCR based systems is adversely affected if the image of the document contains distortion (warping). Often, costly and precisely calibrated special hardware such as stereo cameras, laser scanners, etc. are used to infer the 3D model of the distorted image which is used to remove the distortion. Recent methods focus on creating a 3D shape model based on 2D distortion informa- tion obtained from the document image. The performance of these methods is highly dependent on estimating an accurate 2D distortion grid. These methods often affix the 2D distortion grid lines to the text line, and as such, may suffer in the presence of unreliable textual cues due to preprocessing steps such as binarization. In the domain of printed document images, the white space between the text lines carries as much information about the 2D distortion as the text lines themselves. Based on this intuitive idea, in the second part of our work we build a 2D distortion grid from white space lines, which can be used to rectify a printed document image by a dewarping algorithm. We compare our presented method against a state-of-the-art 2D distortion grid construction method and obtain better results. We also present qualitative and quantitative evaluations for the presented method. Collation of texts and images is an indispensable but labor-intensive step in the study of print materials. It is an often used methodology by textual scholars when the manuscript of the text does not exist. Although various methods and machines have been designed to assist in this labor, it still remains an expensive and time- consuming process, often requiring travel to distant repositories for the painstaking visual examination of multiple original copies. Efforts to digitize collation have so far depended on first transcribing the texts to be compared, thus introducing into the process more labor and expense, and also more potential error. Digital collation will instead automate the first stages of collation directly from the document images of the original texts, thereby speeding the process of comparison. We describe such a novel framework for digital collation in the third part of this work and provide qualitative results

    Research on Calligraphy Evaluation Technology Based on Deep Learning

    Get PDF
    Today, when computer-assisted instruction (CAI) is booming, related research in the field of calligraphy education still hasn’t much progress. This main research for the calligraphy beginners to evaluate their works anytime and anywhere. Author uses the literature research and interview to understand the common writing problems of beginners. Then conducts discussion on these problems, design of solutions, research on algorithms, and experimental verification. Based on the ResNet-50 model, through WeChat applet implements for beginners. The main research contents are as follows: (1) In order to achieve good results in calligraphy judgment, this article uses the ResNet-50 model to judge calligraphy. First, adjust the area of the handwritten calligraphy image as the input of the network to a small block suitable for the network. While training the network, adjust the learning rate, the number of image layers and the number of training samples to achieve the optimal. The research results show that ResNet has certain practicality and reference value in the field of calligraphy judgment. Regarding the possible over-fitting problem, this article proposes to improve the accuracy of the judgment by collecting more data and optimizing the data washing process. (2) Combining the rise of WeChat applets, in view of the current WeChat applet learning platform development process and the problem of fewer functional modules, this paper uses cloud development functions to develop a calligraphy learning platform based on WeChat applets. While simplifying the development process, it ensures that the functional modules of the platform meet the needs of teachers and beginners, it has certain practicality and commercial value. After the development of the calligraphy learning applet is completed, it will be submitted for official

    Sub-sampling Approach for Unconstrained Arabic Scene Text Analysis by Implicit Segmentation based Deep Learning Classifier

    Get PDF
    The text extraction from the natural scene image is still a cumbersome task to perform. This paper presents a novel contribution and suggests the solution for cursive scene text analysis notably recognition of Arabic scene text appeared in the unconstrained environment. The hierarchical sub-sampling technique is adapted to investigate the potential through sub-sampling the window size of the given scene text sample. The deep learning architecture is presented by considering the complexity of the Arabic script. The conducted experiments present 96.81% accuracy at the character level. The comparison of the Arabic scene text with handwritten and printed data is outlined as well

    ImageNet Large Scale Visual Recognition Challenge

    Get PDF
    The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the five years of the challenge, and propose future directions and improvements.Comment: 43 pages, 16 figures. v3 includes additional comparisons with PASCAL VOC (per-category comparisons in Table 3, distribution of localization difficulty in Fig 16), a list of queries used for obtaining object detection images (Appendix C), and some additional reference

    Towards a Universal Wordnet by Learning from Combined Evidenc

    Get PDF
    Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification

    Multimedia information technology and the annotation of video

    Get PDF
    The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, overload of data will cause lack of annotation capacity, and on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning
    • …
    corecore