2,856 research outputs found

    Feature Extraction Methods for Character Recognition

    Get PDF
    Not Include

    Print-Scan Resilient Text Image Watermarking Based on Stroke Direction Modulation for Chinese Document Authentication

    Get PDF
    Print-scan resilient watermarking has emerged as an attractive way for document security. This paper proposes an stroke direction modulation technique for watermarking in Chinese text images. The watermark produced by the idea offers robustness to print-photocopy-scan, yet provides relatively high embedding capacity without losing the transparency. During the embedding phase, the angle of rotatable strokes are quantized to embed the bits. This requires several stages of preprocessing, including stroke generation, junction searching, rotatable stroke decision and character partition. Moreover, shuffling is applied to equalize the uneven embedding capacity. For the data detection, denoising and deskewing mechanisms are used to compensate for the distortions induced by hardcopy. Experimental results show that our technique attains high detection accuracy against distortions resulting from print-scan operations, good quality photocopies and benign attacks in accord with the future goal of soft authentication

    Open Set Chinese Character Recognition using Multi-typed Attributes

    Get PDF
    Recognition of Off-line Chinese characters is still a challenging problem, especially in historical documents, not only in the number of classes extremely large in comparison to contemporary image retrieval methods, but also new unseen classes can be expected under open learning conditions (even for CNN). Chinese character recognition with zero or a few training samples is a difficult problem and has not been studied yet. In this paper, we propose a new Chinese character recognition method by multi-type attributes, which are based on pronunciation, structure and radicals of Chinese characters, applied to character recognition in historical books. This intermediate attribute code has a strong advantage over the common `one-hot' class representation because it allows for understanding complex and unseen patterns symbolically using attributes. First, each character is represented by four groups of attribute types to cover a wide range of character possibilities: Pinyin label, layout structure, number of strokes, three different input methods such as Cangjie, Zhengma and Wubi, as well as a four-corner encoding method. A convolutional neural network (CNN) is trained to learn these attributes. Subsequently, characters can be easily recognized by these attributes using a distance metric and a complete lexicon that is encoded in attribute space. We evaluate the proposed method on two open data sets: printed Chinese character recognition for zero-shot learning, historical characters for few-shot learning and a closed set: handwritten Chinese characters. Experimental results show a good general classification of seen classes but also a very promising generalization ability to unseen characters.Comment: 29 pages, submitted to Pattern Recognitio

    Chinese information processing

    Full text link
    A survey of the field of Chinese information processing is provided. It covers the following areas: the Chinese writing system, several popular Chinese encoding schemes and code conversions, Chinese keyboard entry methods, Chinese fonts, Chinese operating systems, basic Chinese computing techniques and applications

    Character Recognition

    Get PDF
    Character recognition is one of the pattern recognition technologies that are most widely used in practical applications. This book presents recent advances that are relevant to character recognition, from technical topics such as image processing, feature extraction or classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field

    Advances in Character Recognition

    Get PDF
    This book presents advances in character recognition, and it consists of 12 chapters that cover wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field and for all interested in the subject
    corecore