Key-value information extraction from full handwritten pages
We propose a Transformer-based approach for information extraction from
digitized handwritten documents. Our approach combines, in a single model, the
different steps that were so far performed by separate models: feature
extraction, handwriting recognition and named entity recognition. We compare
this integrated approach with traditional two-stage methods that perform
handwriting recognition before named entity recognition, and present results at
different levels: line, paragraph, and page. Our experiments show that
attention-based models are especially interesting when applied on full pages,
as they do not require any prior segmentation step. Finally, we show that they
are able to learn from key-value annotations: a list of important words with
their corresponding named entities. We compare our models to state-of-the-art
methods on three public databases (IAM, ESPOSALLES, and POPP) and outperform
previously reported results on all three datasets.
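To illustrate what a key-value annotation might look like, here is a minimal sketch that reduces a fully tagged transcription to a list of important words with their entities. The function name `to_key_value`, the tag names, and the sample record are hypothetical and are not taken from the paper or from any of the three datasets.

```python
from typing import List, Optional, Tuple

# A tagged transcription: each word paired with an entity label, or None
# when the word carries no entity. (Illustrative format, assumed here.)
TaggedWord = Tuple[str, Optional[str]]


def to_key_value(tagged: List[TaggedWord]) -> List[Tuple[str, str]]:
    """Keep only the words that carry an entity, as (entity, word) pairs."""
    return [(label, word) for word, label in tagged if label is not None]


# Hypothetical record, invented for illustration only.
record = [
    ("Joan", "name"),
    ("Pons", "surname"),
    ("son", None),
    ("of", None),
    ("Pere", "father_name"),
]

pairs = to_key_value(record)
for entity, word in pairs:
    print(f"{entity}: {word}")
```

Under this assumed format, the untagged words are dropped and word order is kept, so a target of this kind needs no full transcription of the page.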
HMM word graph based keyword spotting in handwritten document images
Line-level keyword spotting (KWS) is presented on the basis of frame-level word posterior
probabilities. These posteriors are obtained using word graphs derived from the recognition
process of a full-fledged handwritten text recognizer based on hidden Markov models
and N-gram language models. This approach has several advantages. First, since it uses
a holistic, segmentation-free technology, it does not require any kind of word or character
segmentation. Second, the use of language models allows the context of each spotted
word to be taken into account, thereby considerably increasing KWS accuracy. And third,
the proposed KWS scores are based on true posterior probabilities, taking into account
all (or most) possible word segmentations of the input image. These scores are properly
bounded and normalized. This mathematically clean formulation lends itself to smooth,
threshold-based keyword queries which, in turn, permit comfortable trade-offs between
search precision and recall. Experiments are carried out on several historic collections of
handwritten text images, as well as a well-known data set of modern English handwritten
text. According to the empirical results, the proposed approach achieves KWS results
comparable to those obtained with the recently-introduced "BLSTM neural networks KWS"
approach and clearly outperforms the popular, state-of-the-art "Filler HMM" KWS method.
Overall, the results clearly support all the above-claimed advantages of the proposed approach.
This work has been partially supported by the Generalitat Valenciana under the Prometeo/2009/014 project grant ALMA-MATER, and through the EU projects: HIMANIS (JPICH programme, Spanish grant Ref. PCIN-2015-068) and READ (Horizon 2020 programme, grant Ref. 674943).
Toselli, AH.; Vidal, E.; Romero, V.; Frinken, V. (2016). HMM word graph based keyword spotting in handwritten document images. Information Sciences. 370:497-518. https://doi.org/10.1016/j.ins.2016.07.063
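The threshold-based querying described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: taking the line-level score as the maximum frame-level posterior is an assumption made here for simplicity, and the function names, line identifiers, and posterior values are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def line_score(frame_posteriors: List[float]) -> float:
    """Line-level score for one keyword.

    Assumption: aggregate frame-level posteriors with max, so the score
    stays bounded in [0, 1] like the posteriors themselves.
    """
    return max(frame_posteriors) if frame_posteriors else 0.0


def precision_recall(
    scores: Dict[str, float], relevant: Set[str], threshold: float
) -> Tuple[float, float]:
    """Retrieve lines whose score reaches the threshold, then evaluate."""
    retrieved = {line for line, s in scores.items() if s >= threshold}
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 1.0
    recall = hits / len(relevant) if relevant else 1.0
    return precision, recall


# Hypothetical frame-level posteriors of one keyword on three text lines.
frames = {
    "line1": [0.1, 0.9, 0.3],
    "line2": [0.2, 0.4],
    "line3": [0.05, 0.6, 0.1],
}
scores = {line: line_score(p) for line, p in frames.items()}
relevant = {"line1", "line3"}  # lines that truly contain the keyword

for threshold in (0.3, 0.5, 0.8):
    p, r = precision_recall(scores, relevant, threshold)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold retrieves fewer lines, which tends to trade recall for precision; because the scores are bounded and normalized, a single threshold has the same meaning across queries.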
GR-RNN: Global-Context Residual Recurrent Neural Networks for Writer Identification
This paper presents an end-to-end neural network system to identify writers
through handwritten word images, which jointly integrates global-context
information and a sequence of local fragment-based features. The global-context
information is extracted from the tail of the neural network by a global
average pooling step. The sequence of local and fragment-based features is
extracted from a low-level deep feature map which contains subtle information
about the handwriting style. The spatial relationship between the sequence of
fragments is modeled by the recurrent neural network (RNN) to strengthen the
discriminative ability of the local fragment features. We leverage the
complementary information between the global-context and local fragments,
resulting in the proposed global-context residual recurrent neural network
(GR-RNN) method. The proposed method is evaluated on four public data sets and
experimental results demonstrate that it can provide state-of-the-art
performance. In addition, the neural networks trained on gray-scale images
provide better results than neural networks trained on binarized and contour
images, indicating that texture information plays an important role for writer
identification.
The source code will be available at
https://github.com/shengfly/writer-identification.
Comment: To appear: Pattern Recognitio
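The two kinds of features named in this abstract can be illustrated with a small sketch: a global vector from global average pooling, and a sequence of local vectors from fragments of a feature map. This is a simplification under stated assumptions, not the GR-RNN architecture. It pools a single toy feature map for both purposes, whereas the abstract takes the global context from the tail of the network and the fragments from a low-level map, and it omits the RNN and the fusion step. All names and values are hypothetical.

```python
from typing import List

# Toy feature map laid out as channels x height x width (assumed layout).
FeatureMap = List[List[List[float]]]


def global_average_pool(fmap: FeatureMap) -> List[float]:
    """Average each channel over all spatial positions: one value per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in fmap
    ]


def split_fragments(fmap: FeatureMap, n: int) -> List[FeatureMap]:
    """Cut the map into n equal-width fragments along the width axis."""
    width = len(fmap[0][0])
    step = width // n
    return [
        [[row[i * step:(i + 1) * step] for row in channel] for channel in fmap]
        for i in range(n)
    ]


# Hypothetical 2-channel, 2x4 feature map.
fmap = [
    [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],
    [[0.0, 0.0, 2.0, 2.0], [0.0, 0.0, 2.0, 2.0]],
]

global_vec = global_average_pool(fmap)
fragment_vecs = [global_average_pool(f) for f in split_fragments(fmap, 2)]
print(global_vec)
print(fragment_vecs)
```

In a full model the fragment vectors would be fed, in left-to-right order, to a recurrent network, and the result combined with the global vector.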