Disentangling Writer and Character Styles for Handwriting Generation
Training machines to synthesize diverse handwritings is an intriguing task.
Recently, RNN-based methods have been proposed to generate stylized online
Chinese characters. However, these methods mainly focus on capturing a person's
overall writing style, neglecting subtle style inconsistencies between
characters written by the same person. For example, while a person's
handwriting typically exhibits general uniformity (e.g., glyph slant and aspect
ratios), there are still small style variations in finer details (e.g., stroke
length and curvature) of characters. In light of this, we propose to
disentangle the style representations at both writer and character levels from
individual handwritings to synthesize realistic stylized online handwritten
characters. Specifically, we present the style-disentangled Transformer (SDT),
which employs two complementary contrastive objectives to extract the style
commonalities of reference samples and capture the detailed style patterns of
each sample, respectively. Extensive experiments on various language scripts
demonstrate the effectiveness of SDT. Notably, our empirical findings reveal
that the two learned style representations provide information at different
frequency magnitudes, underscoring the importance of separate style extraction.
Our source code is public at: https://github.com/dailenson/SDT.
Comment: accepted by CVPR 2023.
HWD: A Novel Evaluation Score for Styled Handwritten Text Generation
Styled Handwritten Text Generation (Styled HTG) is an important task in
document analysis, aiming to generate text images with the handwriting of given
reference images. In recent years, there has been significant progress in the
development of deep learning models for tackling this task. Being able to
measure the performance of HTG models via a meaningful and representative
criterion is key for fostering the development of this research topic. However,
despite the current adoption of scores for natural image generation evaluation,
assessing the quality of generated handwriting remains challenging. In light of
this, we devise the Handwriting Distance (HWD), tailored for HTG evaluation. In
particular, it works in the feature space of a network specifically trained to
extract handwriting style features from the variable-length input images and
exploits a perceptual distance to compare the subtle geometric features of
handwriting. Through extensive experimental evaluation on different word-level
and line-level datasets of handwritten text images, we demonstrate the
suitability of the proposed HWD as a score for Styled HTG. The pretrained model
used as backbone will be released to ease the adoption of the score, aiming to
provide a valuable tool for evaluating HTG models and thus contributing to
advancing this important research area.
Comment: Accepted at BMVC202
Machine Learning for handwriting text recognition in historical documents
Olmos
ABSTRACT
In this thesis, we focus on the handwriting text recognition task over historical
documents that are difficult to read for anyone who is not an expert in ancient
languages and writing styles.
We aim to take advantage of and improve the neural network architectures and
techniques that other authors are proposing for handwriting text recognition in
modern handwritten documents. These models perform this task very precisely
when a large amount of data is available. However, the low availability of labeled
data is a widespread problem in historical documents. The type of writing is
singular, and it is pretty expensive to hire an expert to transcribe a large number
of pages.
After investigating and analyzing the state-of-the-art, we propose the efficient
application of methods such as transfer learning and data augmentation. We also
contribute an algorithm for purging mislabeled samples that affect the learning of
models. Finally, we develop a variational autoencoder method for generating
synthetic samples of handwritten text images for data augmentation.
Experiments are performed on various historical handwritten text databases to
validate the performance of the proposed algorithms. The included
analyses focus on how the character and word error rates (CER and
WER) evolve as we increase the size of the training dataset.
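For readers unfamiliar with these metrics, the following is a minimal sketch of how CER and WER are commonly computed: the Levenshtein edit distance between the reference and the predicted transcription, divided by the reference length (in characters for CER, in whitespace-separated words for WER). The function names are illustrative and the normalization is an assumption; the thesis may tokenize or normalize text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items gives an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character-level edits per reference character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word-level edits per reference word (whitespace split)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Both rates can exceed 1.0 when the prediction contains many more items than the reference, so they are error ratios rather than bounded percentages.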
One of the most important results is our participation in a contest on the transcription
of historical handwritten text. The organizers provided us with a dataset of
documents to train the model; then just a few labeled pages of 5 new documents
were handed over to adjust the solution further. Finally, the transcription of unlabeled
images was requested to evaluate the algorithm. Our method ranked
second in this contest.
- …