550 research outputs found
Chinese Character Recognition with Radical-Structured Stroke Trees
The rise of deep learning has driven rapid progress in Chinese character
recognition. However, it remains a great challenge that
the characters for testing may have different distributions from those of the
training dataset. Existing methods based on a single-level representation
(character-level, radical-level, or stroke-level) may be either too sensitive
to distribution changes (e.g., induced by blurring, occlusion, and zero-shot
problems) or too tolerant to one-to-many ambiguities. In this paper, we
represent each Chinese character as a stroke tree, which is organized according
to its radical structures, to fully exploit the merits of both the radical and
stroke levels. We propose a two-stage decomposition framework,
where a Feature-to-Radical Decoder perceives radical structures and radical
regions, and a Radical-to-Stroke Decoder further predicts the stroke sequences
according to the features of radical regions. The generated radical structures
and stroke sequences are encoded as a Radical-Structured Stroke Tree (RSST),
which is fed to a Tree-to-Character Translator based on the proposed Weighted
Edit Distance to match the closest candidate character in the RSST lexicon. Our
extensive experimental results demonstrate that the proposed method outperforms
the state-of-the-art single-level methods by increasing margins as the
distribution difference becomes more severe in the blurring, occlusion, and
zero-shot scenarios, which validates the robustness of the proposed
method.
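The final matching step described above, finding the lexicon entry closest to a predicted tree under a weighted edit distance, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the flattened token sequences, the toy lexicon, the token names (`LR`, `TB`, `s1`, ...), and the weighting rule (structure tokens cost more than stroke tokens) are all assumptions made for the example.

```python
def weighted_edit_distance(a, b, weight):
    """Levenshtein-style distance where editing token t costs weight(t)."""
    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + weight(a[i - 1])      # delete everything from a
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + weight(b[j - 1])      # insert everything from b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            x, y = a[i - 1], b[j - 1]
            # Assumed rule: a substitution costs the heavier of the two tokens.
            sub = 0.0 if x == y else max(weight(x), weight(y))
            dp[i][j] = min(
                dp[i - 1][j] + weight(x),    # delete x
                dp[i][j - 1] + weight(y),    # insert y
                dp[i - 1][j - 1] + sub,      # match or substitute
            )
    return dp[n][m]


def closest_character(predicted, lexicon, weight):
    """Return the lexicon key whose token sequence is nearest to `predicted`."""
    return min(lexicon, key=lambda ch: weighted_edit_distance(predicted, lexicon[ch], weight))


# Hypothetical weighting: structure tokens (LR = left-right, TB = top-bottom)
# are assumed to matter twice as much as individual stroke tokens.
STRUCTURE_TOKENS = {"LR", "TB"}

def token_weight(tok):
    return 2.0 if tok in STRUCTURE_TOKENS else 1.0


# Toy lexicon with made-up character labels and stroke ids.
lexicon = {
    "A": ["LR", "s1", "s2", "s3"],
    "B": ["TB", "s1", "s2", "s3"],
    "C": ["LR", "s1", "s4"],
}
predicted = ["LR", "s1", "s2", "s5"]   # last stroke misread
best = closest_character(predicted, lexicon, token_weight)
```

With these assumed weights, a single misread stroke still leaves the prediction closest to entry `A`, whereas an error in the structure token is penalized more heavily.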
A Study of Techniques and Challenges in Text Recognition Systems
Text recognition is a core system for Natural Language Processing (NLP) and digitization. These systems are critical in bridging the gaps in digitization produced by non-editable documents, as well as contributing to finance, health care, machine translation, digital libraries, and a variety of other fields. In addition, as a result of the pandemic, the amount of digital information in the education sector has increased, necessitating the deployment of text recognition systems to deal with it. Text recognition systems work on three different categories of text: (a) Machine Printed, (b) Offline Handwritten, and (c) Online Handwritten Texts. The major goal of this research is to examine the process of typewritten text recognition systems. The availability of historical documents and other traditional materials in many types of texts is another major challenge for convergence. Although this research examines a variety of languages, the Gurmukhi language receives the most focus. This paper presents an analysis of all prior text recognition algorithms for the Gurmukhi language. In addition, work on degraded texts in various languages is evaluated based on accuracy and F-measure.
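The abstract above names accuracy and F-measure as its evaluation criteria without defining them. A minimal sketch using the standard definitions is given below; the function names and the counts in the usage lines are illustrative assumptions, not values from the paper.

```python
def accuracy(correct, total):
    """Fraction of items (e.g. characters) recognized correctly."""
    if total == 0:
        raise ValueError("total must be positive")
    return correct / total


def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives and false negatives.

    precision = tp / (tp + fp), recall = tp / (tp + fn),
    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R).
    Returns 0.0 when precision and recall are both zero or undefined.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


# Made-up counts for illustration only.
acc = accuracy(correct=90, total=100)
f1 = f_measure(tp=8, fp=2, fn=2)
```

With `beta=1.0` this is the usual F1 score, the harmonic mean of precision and recall.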
Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning
Scene text recognition has been studied for decades due to its broad
applications. However, despite Chinese characters possessing different
characteristics from Latin characters, such as complex inner structures and
large categories, few methods have been proposed for Chinese Text Recognition
(CTR). Particularly, the characteristic of large categories poses challenges in
dealing with zero-shot and few-shot Chinese characters. In this paper, inspired
by the way humans recognize Chinese texts, we propose a two-stage framework for
CTR. Firstly, we pre-train a CLIP-like model through aligning printed character
images and Ideographic Description Sequences (IDS). This pre-training stage
simulates humans recognizing Chinese characters and obtains the canonical
representation of each character. Subsequently, the learned representations are
employed to supervise the CTR model, such that traditional single-character
recognition can be improved to text-line recognition through image-IDS
matching. To evaluate the effectiveness of the proposed method, we conduct
extensive experiments on both Chinese character recognition (CCR) and CTR. The
experimental results demonstrate that the proposed method performs best in CCR
and outperforms previous methods in most scenarios of the CTR benchmark. It is
worth noting that the proposed method can recognize zero-shot Chinese
characters in text images without fine-tuning, whereas previous methods require
fine-tuning when new classes appear. The code is available at
https://github.com/FudanVI/FudanOCR/tree/main/image-ids-CTR.
Comment: ICCV 202
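The image-IDS matching idea in the abstract above can be illustrated with a small nearest-neighbour sketch: a character image embedding is compared against one embedding per candidate IDS, and the most similar candidate is returned. This is only an assumed simplification. The three-dimensional vectors are made up, and in the described method they would come from the pre-trained image and IDS encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def match_character(image_embedding, ids_embeddings):
    """Return the character whose IDS embedding is most similar to the image."""
    return max(ids_embeddings, key=lambda ch: cosine(image_embedding, ids_embeddings[ch]))


# Hypothetical IDS embeddings (values invented for illustration).
ids_embeddings = {
    "好": [0.9, 0.1, 0.0],   # IDS: ⿰女子
    "妈": [0.1, 0.9, 0.0],   # IDS: ⿰女马
}
seen = match_character([0.8, 0.2, 0.1], ids_embeddings)

# Zero-shot case: a new character is supported by adding its IDS embedding
# to the candidate set; nothing is retrained in this sketch.
ids_embeddings["明"] = [0.0, 0.1, 0.9]   # IDS: ⿰日月
unseen = match_character([0.1, 0.0, 0.95], ids_embeddings)
```

The second call shows why matching against IDS representations suits zero-shot recognition: the candidate set can grow without changing the matcher.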
Advances in Character Recognition
This book presents advances in character recognition. It consists of 12 chapters that cover a wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field, and for all interested in the subject.
Stroke Extraction of Chinese Character Based on Deep Structure Deformable Image Registration
Stroke extraction of Chinese characters plays an important role in the field
of character recognition and generation. Most existing character stroke
extraction methods focus on image morphological features. These methods usually
lead to errors in cross-stroke extraction and stroke matching because they rarely
use stroke semantics and prior information. In this paper, we propose a deep
learning-based character stroke extraction method that takes semantic features
and prior information of strokes into consideration. This method consists of
three parts: image registration-based stroke registration that establishes the
rough registration of the reference strokes and the target as prior
information; image semantic segmentation-based stroke segmentation that
preliminarily separates target strokes into seven categories; and
high-precision extraction of single strokes. In the stroke registration, we
propose a structure deformable image registration network to achieve
structure-deformable transformation while maintaining the stable morphology of
single strokes for character images with complex structures. In order to verify
the effectiveness of the method, we construct two datasets, one for
calligraphy characters and one for regular handwritten characters. The experimental
results show that our method strongly outperforms the baselines. Code is
available at https://github.com/MengLi-l1/StrokeExtraction.
Comment: 10 pages, 8 figures, published to AAAI-23 (oral)
Handwritten Amharic Character Recognition Using a Convolutional Neural Network
Amharic is the official language of the Federal Democratic Republic of
Ethiopia. There are many historic Amharic and Ethiopic handwritten documents
addressing various relevant issues, including governance, science, religion,
social rules, culture, and artworks, which constitute rich indigenous knowledge.
The Amharic language has its own alphabet derived from Ge'ez, which is currently
the liturgical language in Ethiopia. Handwritten character recognition for
non-Latin scripts like Amharic has not been well addressed, especially using
state-of-the-art techniques. This research work designs, for the first time,
a model for Amharic handwritten character recognition using a convolutional
neural network. The dataset was organized from collected sample handwritten
documents and data augmentation was applied for machine learning. The model was
further enhanced using multi-task learning from the relationships of the
characters. Promising results are observed from the latter model, which can
further be applied to word prediction.
Comment: ECDA2019 Conference Oral Presentation