270 research outputs found
HMM word graph based keyword spotting in handwritten document images
Line-level keyword spotting (KWS) is presented on the basis of frame-level word posterior probabilities. These posteriors are obtained using word graphs derived from the recognition process of a full-fledged handwritten text recognizer based on hidden Markov models and N-gram language models. This approach has several advantages. First, since it uses a holistic, segmentation-free technology, it does not require any kind of word or character segmentation. Second, the use of language models allows the context of each spotted word to be taken into account, thereby considerably increasing KWS accuracy. Third, the proposed KWS scores are based on true posterior probabilities, taking into account all (or most) possible word segmentations of the input image. These scores are properly bounded and normalized. This mathematically clean formulation lends itself to smooth, threshold-based keyword queries which, in turn, permit comfortable trade-offs between search precision and recall. Experiments are carried out on several historic collections of handwritten text images, as well as a well-known data set of modern English handwritten text. According to the empirical results, the proposed approach achieves KWS results comparable to those obtained with the recently introduced "BLSTM neural networks KWS" approach and clearly outperforms the popular, state-of-the-art "Filler HMM" KWS method. Overall, the results clearly support all the above-claimed advantages of the proposed approach.
This work has been partially supported by the Generalitat Valenciana under the Prometeo/2009/014 project grant ALMA-MATER, and through the EU projects HIMANIS (JPICH programme, Spanish grant Ref. PCIN-2015-068) and READ (Horizon 2020 programme, grant Ref. 674943).
Toselli, AH.; Vidal, E.; Romero, V.; Frinken, V. (2016). HMM word graph based keyword spotting in handwritten document images. Information Sciences. 370:497-518. https://doi.org/10.1016/j.ins.2016.07.063
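The threshold-based querying mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the frame-level posteriors are made-up numbers, the line identifiers and helper names (`line_score`, `precision_recall`) are hypothetical, and collapsing frame-level posteriors into a line-level score by taking the maximum is an assumption made for illustration. The only point is that scores bounded in [0, 1] let a single threshold trade precision against recall.

```python
def line_score(frame_posteriors):
    """Collapse frame-level posteriors for one keyword into one line-level score.

    Assumption for illustration: the line score is the maximum frame posterior.
    """
    return max(frame_posteriors, default=0.0)


def precision_recall(scores, relevant, threshold):
    """Retrieve every line whose score reaches the threshold and evaluate it."""
    retrieved = {line for line, s in scores.items() if s >= threshold}
    hits = len(retrieved & relevant)
    # Conventions for empty sets: nothing retrieved counts as precision 1.0.
    precision = hits / len(retrieved) if retrieved else 1.0
    recall = hits / len(relevant) if relevant else 1.0
    return precision, recall


# Made-up frame-level posteriors of one keyword on four text lines.
frame_posteriors = {
    "line1": [0.05, 0.90, 0.85, 0.10],
    "line2": [0.02, 0.40, 0.35],
    "line3": [0.01, 0.03],
    "line4": [0.60, 0.55, 0.20],
}
# Hypothetical ground truth: lines that really contain the keyword.
relevant = {"line1", "line2"}

scores = {line: line_score(p) for line, p in frame_posteriors.items()}

for threshold in (0.3, 0.5, 0.8):
    p, r = precision_recall(scores, relevant, threshold)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```

In this toy data, raising the threshold from 0.3 to 0.8 drops the false hit on `line4` but also loses the true hit on `line2`, which is the kind of trade-off the abstract refers to.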
Keyword spotting in historical handwritten documents based on graph matching
In recent decades, historical handwritten documents have become increasingly available in digital form. Yet access to these documents for browsing and searching has remained limited, as fully automatic transcription is often not possible or not sufficiently accurate. This paper proposes a novel, reliable approach for template-based keyword spotting in historical handwritten documents. In particular, our framework makes use of different graph representations for segmented word images and a sophisticated matching procedure. Moreover, we extend our method to a spotting ensemble. In an exhaustive experimental evaluation on four widely used benchmark datasets, we show that the proposed approach is able to keep up with or even outperform several state-of-the-art methods for template- and learning-based keyword spotting.
The Hasler Foundation Switzerland
http://www.elsevier.com/locate/patcog
2019-09-01
- …