267 research outputs found
Deep Adaptive Learning for Writer Identification based on Single Handwritten Word Images
There are two types of information in each handwritten word image: explicit
information which can be easily read or derived directly, such as lexical
content or word length, and implicit attributes such as the author's identity.
Whether features learned by a neural network for one task can be used for
another task remains an open question. In this paper, we present a deep
adaptive learning method for writer identification based on single-word images
using multi-task learning. An auxiliary task is added to the training process
to enforce the emergence of reusable features. Our proposed method transfers
the benefits of the learned features of a convolutional neural network from an
auxiliary task such as explicit content recognition to the main task of writer
identification in a single procedure. Specifically, we propose a new adaptive
convolutional layer to exploit the learned deep features. A multi-task neural
network with one or several adaptive convolutional layers is trained
end-to-end, to exploit robust generic features for a specific main task, i.e.,
writer identification. Three auxiliary tasks, corresponding to three explicit
attributes of handwritten word images (lexical content, word length and
character attributes), are evaluated. Experimental results on two benchmark
datasets show that the proposed deep adaptive learning method can improve the
performance of writer identification based on single-word images, compared to
non-adaptive and simple linear-adaptive approaches.Comment: Under view of Pattern Recognitio
Learning Representations from Persian Handwriting for Offline Signature Verification, a Deep Transfer Learning Approach
Offline Signature Verification (OSV) is a challenging pattern recognition
task, especially when it is expected to generalize well on the skilled
forgeries that are not available during the training. Its challenges also
include small training sample and large intra-class variations. Considering the
limitations, we suggest a novel transfer learning approach from Persian
handwriting domain to multi-language OSV domain. We train two Residual CNNs on
the source domain separately based on two different tasks of word
classification and writer identification. Since identifying a person signature
resembles identifying ones handwriting, it seems perfectly convenient to use
handwriting for the feature learning phase. The learned representation on the
more varied and plentiful handwriting dataset can compensate for the lack of
training data in the original task, i.e. OSV, without sacrificing the
generalizability. Our proposed OSV system includes two steps: learning
representation and verification of the input signature. For the first step, the
signature images are fed into the trained Residual CNNs. The output
representations are then used to train SVMs for the verification. We test our
OSV system on three different signature datasets, including MCYT (a Spanish
signature dataset), UTSig (a Persian one) and GPDS-Synthetic (an artificial
dataset). On UT-SIG, we achieved 9.80% Equal Error Rate (EER) which showed
substantial improvement over the best EER in the literature, 17.45%. Our
proposed method surpassed state-of-the-arts by 6% on GPDS-Synthetic, achieving
6.81%. On MCYT, EER of 3.98% was obtained which is comparable to the best
previously reported results
Writer adaptation for offline text recognition: An exploration of neural network-based methods
Handwriting recognition has seen significant success with the use of deep
learning. However, a persistent shortcoming of neural networks is that they are
not well-equipped to deal with shifting data distributions. In the field of
handwritten text recognition (HTR), this shows itself in poor recognition
accuracy for writers that are not similar to those seen during training. An
ideal HTR model should be adaptive to new writing styles in order to handle the
vast amount of possible writing styles. In this paper, we explore how HTR
models can be made writer adaptive by using only a handful of examples from a
new writer (e.g., 16 examples) for adaptation. Two HTR architectures are used
as base models, using a ResNet backbone along with either an LSTM or
Transformer sequence decoder. Using these base models, two methods are
considered to make them writer adaptive: 1) model-agnostic meta-learning
(MAML), an algorithm commonly used for tasks such as few-shot classification,
and 2) writer codes, an idea originating from automatic speech recognition.
Results show that an HTR-specific version of MAML known as MetaHTR improves
performance compared to the baseline with a 1.4 to 2.0 improvement in word
error rate (WER). The improvement due to writer adaptation is between 0.2 and
0.7 WER, where a deeper model seems to lend itself better to adaptation using
MetaHTR than a shallower model. However, applying MetaHTR to larger HTR models
or sentence-level HTR may become prohibitive due to its high computational and
memory requirements. Lastly, writer codes based on learned features or Hinge
statistical features did not lead to improved recognition performance.Comment: 21 pages including appendices, 6 figures, 10 table
Disentangling Writer and Character Styles for Handwriting Generation
Training machines to synthesize diverse handwritings is an intriguing task.
Recently, RNN-based methods have been proposed to generate stylized online
Chinese characters. However, these methods mainly focus on capturing a person's
overall writing style, neglecting subtle style inconsistencies between
characters written by the same person. For example, while a person's
handwriting typically exhibits general uniformity (e.g., glyph slant and aspect
ratios), there are still small style variations in finer details (e.g., stroke
length and curvature) of characters. In light of this, we propose to
disentangle the style representations at both writer and character levels from
individual handwritings to synthesize realistic stylized online handwritten
characters. Specifically, we present the style-disentangled Transformer (SDT),
which employs two complementary contrastive objectives to extract the style
commonalities of reference samples and capture the detailed style patterns of
each sample, respectively. Extensive experiments on various language scripts
demonstrate the effectiveness of SDT. Notably, our empirical findings reveal
that the two learned style representations provide information at different
frequency magnitudes, underscoring the importance of separate style extraction.
Our source code is public at: https://github.com/dailenson/SDT.Comment: accepted by CVPR 2023. Source code: https://github.com/dailenson/SD
Advances in Character Recognition
This book presents advances in character recognition, and it consists of 12 chapters that cover wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field and for all interested in the subject
- …