261 research outputs found
Unsupervised Adaptation for Synthetic-to-Real Handwritten Word Recognition
Handwritten Text Recognition (HTR) is still a challenging problem because it
must deal with two important difficulties: the variability among writing
styles, and the scarcity of labelled data. To alleviate such problems,
synthetic data generation and data augmentation are typically used to train HTR
systems. However, training with such data produces encouraging but still
inaccurate transcriptions in real words. In this paper, we propose an
unsupervised writer adaptation approach that is able to automatically adjust a
generic handwritten word recognizer, fully trained with synthetic fonts,
towards a new incoming writer. We have experimentally validated our proposal
using five different datasets, covering several challenges (i) the document
source: modern and historic samples, which may involve paper degradation
problems; (ii) different handwriting styles: single and multiple writer
collections; and (iii) language, which involves different character
combinations. Across these challenging collections, we show that our system is
able to maintain its performance, thus, it provides a practical and generic
approach to deal with new document collections without requiring any expensive
and tedious manual annotation step.Comment: Accepted to WACV 202
Recognition of Arabic handwritten words
Recognizing Arabic handwritten words is a difficult problem due to the deformations of different writing styles. Moreover, the cursive nature of the Arabic writing makes correct segmentation of characters an almost impossible task. While there are many sub systems in an Arabic words recognition system, in this work we develop a sub system to recognize Part of Arabic Words (PAW). We try to solve this problem using three different approaches, implicit segmentation and two variants of holistic approach. While Rothacker found similar conclusions while this work is being prepared, we report the difficulty in locating characters in PAW using Scale Invariant Feature Transforms under the first approach. In the second and third approaches, we use holistic approach to recognize PAW using Support Vector Machine (SVM) and Active Shape Models (ASM). While there are few works that use SVM to recognize PAW, they use a small dataset; we use a large dataset and a different set of features. We also explain the errors SVM and ASM make and propose some remedies to these errors as future work
Word Searching in Scene Image and Video Frame in Multi-Script Scenario using Dynamic Shape Coding
Retrieval of text information from natural scene images and video frames is a
challenging task due to its inherent problems like complex character shapes,
low resolution, background noise, etc. Available OCR systems often fail to
retrieve such information in scene/video frames. Keyword spotting, an
alternative way to retrieve information, performs efficient text searching in
such scenarios. However, current word spotting techniques in scene/video images
are script-specific and they are mainly developed for Latin script. This paper
presents a novel word spotting framework using dynamic shape coding for text
retrieval in natural scene image and video frames. The framework is designed to
search query keyword from multiple scripts with the help of on-the-fly
script-wise keyword generation for the corresponding script. We have used a
two-stage word spotting approach using Hidden Markov Model (HMM) to detect the
translated keyword in a given text line by identifying the script of the line.
A novel unsupervised dynamic shape coding based scheme has been used to group
similar shape characters to avoid confusion and to improve text alignment.
Next, the hypotheses locations are verified to improve retrieval performance.
To evaluate the proposed system for searching keyword from natural scene image
and video frames, we have considered two popular Indic scripts such as Bangla
(Bengali) and Devanagari along with English. Inspired by the zone-wise
recognition approach in Indic scripts[1], zone-wise text information has been
used to improve the traditional word spotting performance in Indic scripts. For
our experiment, a dataset consisting of images of different scenes and video
frames of English, Bangla and Devanagari scripts were considered. The results
obtained showed the effectiveness of our proposed word spotting approach.Comment: Multimedia Tools and Applications, Springe
Open Set Chinese Character Recognition using Multi-typed Attributes
Recognition of Off-line Chinese characters is still a challenging problem,
especially in historical documents, not only in the number of classes extremely
large in comparison to contemporary image retrieval methods, but also new
unseen classes can be expected under open learning conditions (even for CNN).
Chinese character recognition with zero or a few training samples is a
difficult problem and has not been studied yet. In this paper, we propose a new
Chinese character recognition method by multi-type attributes, which are based
on pronunciation, structure and radicals of Chinese characters, applied to
character recognition in historical books. This intermediate attribute code has
a strong advantage over the common `one-hot' class representation because it
allows for understanding complex and unseen patterns symbolically using
attributes. First, each character is represented by four groups of attribute
types to cover a wide range of character possibilities: Pinyin label, layout
structure, number of strokes, three different input methods such as Cangjie,
Zhengma and Wubi, as well as a four-corner encoding method. A convolutional
neural network (CNN) is trained to learn these attributes. Subsequently,
characters can be easily recognized by these attributes using a distance metric
and a complete lexicon that is encoded in attribute space. We evaluate the
proposed method on two open data sets: printed Chinese character recognition
for zero-shot learning, historical characters for few-shot learning and a
closed set: handwritten Chinese characters. Experimental results show a good
general classification of seen classes but also a very promising generalization
ability to unseen characters.Comment: 29 pages, submitted to Pattern Recognitio
- …