Deep Adaptive Learning for Writer Identification based on Single Handwritten Word Images
There are two types of information in each handwritten word image: explicit
information which can be easily read or derived directly, such as lexical
content or word length, and implicit attributes such as the author's identity.
Whether features learned by a neural network for one task can be used for
another task remains an open question. In this paper, we present a deep
adaptive learning method for writer identification based on single-word images
using multi-task learning. An auxiliary task is added to the training process
to enforce the emergence of reusable features. Our proposed method transfers
the benefits of the learned features of a convolutional neural network from an
auxiliary task such as explicit content recognition to the main task of writer
identification in a single procedure. Specifically, we propose a new adaptive
convolutional layer to exploit the learned deep features. A multi-task neural
network with one or several adaptive convolutional layers is trained
end-to-end, to exploit robust generic features for a specific main task, i.e.,
writer identification. Three auxiliary tasks, corresponding to three explicit
attributes of handwritten word images (lexical content, word length and
character attributes), are evaluated. Experimental results on two benchmark
datasets show that the proposed deep adaptive learning method can improve the
performance of writer identification based on single-word images, compared to
non-adaptive and simple linear-adaptive approaches.
Comment: Under review at Pattern Recognition
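The multi-task idea in this abstract, training writer identification together with an auxiliary task such as word length, can be illustrated by a combined loss. The following is only a minimal sketch: it does not reproduce the paper's adaptive convolutional layer, and the function names and the `aux_weight` parameter are assumptions made for illustration, not values taken from the paper.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, computed with a stable log-sum-exp."""
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return log_sum - logits[target]


def multi_task_loss(writer_logits, writer_id, aux_logits, aux_label, aux_weight=0.5):
    """Main-task loss plus a weighted auxiliary-task loss.

    aux_weight is a hypothetical hyperparameter; the abstract does not say
    how the two tasks are balanced.
    """
    main = cross_entropy(writer_logits, writer_id)
    aux = cross_entropy(aux_logits, aux_label)
    return main + aux_weight * aux


# Example: 3 candidate writers, 4 word-length classes (both made up).
loss = multi_task_loss([2.0, 0.1, -1.0], 0, [0.0, 1.5, 0.2, -0.3], 1)
```

Minimizing a sum of this kind pushes the shared layers toward features that serve both tasks, which is the general mechanism the abstract relies on.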
Writer adaptation for offline text recognition: An exploration of neural network-based methods
Handwriting recognition has seen significant success with the use of deep
learning. However, a persistent shortcoming of neural networks is that they are
not well-equipped to deal with shifting data distributions. In the field of
handwritten text recognition (HTR), this shows itself in poor recognition
accuracy for writers that are not similar to those seen during training. An
ideal HTR model should be adaptive to new writing styles in order to handle the
vast amount of possible writing styles. In this paper, we explore how HTR
models can be made writer adaptive by using only a handful of examples from a
new writer (e.g., 16 examples) for adaptation. Two HTR architectures are used
as base models, using a ResNet backbone along with either an LSTM or
Transformer sequence decoder. Using these base models, two methods are
considered to make them writer adaptive: 1) model-agnostic meta-learning
(MAML), an algorithm commonly used for tasks such as few-shot classification,
and 2) writer codes, an idea originating from automatic speech recognition.
Results show that an HTR-specific version of MAML known as MetaHTR improves
performance compared to the baseline with a 1.4 to 2.0 improvement in word
error rate (WER). The improvement due to writer adaptation is between 0.2 and
0.7 WER, where a deeper model seems to lend itself better to adaptation using
MetaHTR than a shallower model. However, applying MetaHTR to larger HTR models
or sentence-level HTR may become prohibitive due to its high computational and
memory requirements. Lastly, writer codes based on learned features or Hinge
statistical features did not lead to improved recognition performance.
Comment: 21 pages including appendices, 6 figures, 10 tables
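The results above are reported as word error rate (WER). As a reminder of what that metric measures, here is a minimal sketch that computes WER as the word-level edit distance divided by the number of reference words. The function name and the whitespace tokenization are assumptions for illustration; the paper's own evaluation code may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("quik") and one deletion ("fox") over 4 reference words.
wer = word_error_rate("the quick brown fox", "the quik brown")
```

Multiplying the returned fraction by 100 gives WER as a percentage.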
CSSL-RHA: Contrastive Self-Supervised Learning for Robust Handwriting Authentication
Handwriting authentication is a valuable tool used in various fields, such as
fraud prevention and cultural heritage protection. However, it remains a
challenging task due to the complex features, severe damage, and lack of
supervision. In this paper, we propose a novel Contrastive Self-Supervised
Learning framework for Robust Handwriting Authentication (CSSL-RHA) to address
these issues. It can dynamically learn complex yet important features and
accurately predict writer identities. Specifically, to remove the negative
effects of imperfections and redundancy, we design an information-theoretic
filter for pre-processing and propose a novel adaptive matching scheme to
represent images as patches of local regions dominated by more important
features. Through online optimization at inference time, the most informative
patch embeddings are identified as the "most important" elements. Furthermore,
we employ contrastive self-supervised training with a momentum-based paradigm
to learn more general statistical structures of handwritten data without
supervision. We conduct extensive experiments on five benchmark datasets and
our manually annotated dataset EN-HA, which demonstrate the superiority of our
CSSL-RHA compared to baselines. Additionally, we show that our proposed model
can still effectively achieve authentication even under abnormal circumstances,
such as data falsification and corruption.
Comment: 10 pages, 4 figures, 3 tables, submitted to ACM MM 202
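The abstract mentions contrastive self-supervised training with a momentum-based paradigm but does not give the exact formulation. The sketch below assumes a MoCo-style setup: an exponential moving average update of a key encoder and an InfoNCE loss over one positive and several negatives. The function names, the momentum `m`, and the `temperature` are assumptions for illustration, not details taken from CSSL-RHA.

```python
import math


def momentum_update(query_params, key_params, m=0.999):
    """Move key-encoder parameters toward the query encoder by an exponential moving average."""
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    # Assumes v is not the zero vector.
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one query: low when the query is closest to its positive."""
    q = _normalize(query)
    logits = [_dot(q, _normalize(positive)) / temperature]
    logits += [_dot(q, _normalize(n)) / temperature for n in negatives]
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return log_sum - logits[0]


# Toy 2-D embeddings: the query matches its positive and is orthogonal to the negative.
loss = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
```

In this kind of scheme the key encoder changes slowly, which keeps the negatives consistent across training steps.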
GR-RNN: Global-Context Residual Recurrent Neural Networks for Writer Identification
This paper presents an end-to-end neural network system to identify writers
through handwritten word images, which jointly integrates global-context
information and a sequence of local fragment-based features. The global-context
information is extracted from the tail of the neural network by a global
average pooling step. The sequence of local and fragment-based features is
extracted from a low-level deep feature map which contains subtle information
about the handwriting style. The spatial relationship between the sequence of
fragments is modeled by the recurrent neural network (RNN) to strengthen the
discriminative ability of the local fragment features. We leverage the
complementary information between the global-context and local fragments,
resulting in the proposed global-context residual recurrent neural network
(GR-RNN) method. The proposed method is evaluated on four public data sets and
experimental results demonstrate that it can provide state-of-the-art
performance. In addition, the neural networks trained on gray-scale images
provide better results than neural networks trained on binarized and contour
images, indicating that texture information plays an important role for writer
identification.
The source code will be available at https://github.com/shengfly/writer-identification.
Comment: To appear in Pattern Recognition
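The global-context branch described in this abstract relies on a global average pooling step. As a minimal sketch of that single operation (not of the GR-RNN architecture itself), the function below reduces a feature map stored as nested lists in channel, height, width order to one mean value per channel. The function name and the data layout are assumptions for illustration.

```python
def global_average_pool(feature_map):
    """Reduce feature_map[channel][row][col] to a list with one mean per channel."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Two channels of size 2x2.
context = global_average_pool([
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 0.0], [0.0, 4.0]],
])
```

The resulting vector has a fixed length regardless of the spatial size of the input, which is why it is a convenient summary of a whole word image.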