HMM word graph based keyword spotting in handwritten document images
Line-level keyword spotting (KWS) is presented on the basis of frame-level word posterior
probabilities. These posteriors are obtained using word graphs derived from the recognition
process of a full-fledged handwritten text recognizer based on hidden Markov models
and N-gram language models. This approach has several advantages. First, since it uses
a holistic, segmentation-free technology, it does not require any kind of word or character
segmentation. Second, the use of language models allows the context of each spotted
word to be taken into account, thereby considerably increasing KWS accuracy. And third,
the proposed KWS scores are based on true posterior probabilities, taking into account
all (or most) possible word segmentations of the input image. These scores are properly
bounded and normalized. This mathematically clean formulation lends itself to smooth,
threshold-based keyword queries which, in turn, permit comfortable trade-offs between
search precision and recall. Experiments are carried out on several historic collections of
handwritten text images, as well as a well-known data set of modern English handwritten
text. According to the empirical results, the proposed approach achieves KWS results
comparable to those obtained with the recently introduced "BLSTM neural networks KWS"
approach and clearly outperforms the popular, state-of-the-art "Filler HMM" KWS method.
Overall, the results clearly support all the above-claimed advantages of the proposed approach.
This work has been partially supported by the Generalitat Valenciana under the Prometeo/2009/014 project grant ALMA-MATER, and through the EU projects: HIMANIS (JPICH programme, Spanish grant Ref. PCIN-2015-068) and READ (Horizon 2020 programme, grant Ref. 674943).
Toselli, AH.; Vidal, E.; Romero, V.; Frinken, V. (2016). HMM word graph based keyword spotting in handwritten document images. Information Sciences. 370:497-518. https://doi.org/10.1016/j.ins.2016.07.063
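The threshold-based querying described in the abstract can be illustrated with a small sketch. It assumes each text line already carries a KWS score in [0, 1] for the queried keyword, and it shows how moving the threshold trades precision against recall. The scores, relevance labels, and the function name `precision_recall` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def precision_recall(scored: List[Tuple[float, bool]],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when every line scoring >= threshold is retrieved.

    `scored` holds (kws_score, is_relevant) pairs, one per text line.
    """
    retrieved = [relevant for score, relevant in scored if score >= threshold]
    true_positives = sum(retrieved)
    total_relevant = sum(relevant for _, relevant in scored)
    # Convention (an assumption): precision is 1.0 when nothing is retrieved.
    precision = true_positives / len(retrieved) if retrieved else 1.0
    recall = true_positives / total_relevant if total_relevant else 0.0
    return precision, recall


# Hypothetical scores for one keyword over six text lines.
lines = [(0.95, True), (0.80, True), (0.60, False),
         (0.40, True), (0.20, False), (0.05, False)]

for t in (0.1, 0.5, 0.9):
    p, r = precision_recall(lines, t)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, a low threshold retrieves every relevant line at the cost of false alarms, while a high threshold returns only the most confident hit. Because the scores are bounded and normalized, a single threshold has the same meaning across keywords and lines.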
Out-of-vocabulary word modelling and rejection for keyword spotting
This paper presents a combination of out-of-vocabulary
(OOV) word modeling and rejection techniques in an attempt to accept
utterances embedding a keyword and reject utterances with nonkeywords.
The goal of this research is to develop a robust, task-independent
Spanish keyword spotter and to develop a method for optimizing
confidence thresholds for a particular context. To model OOV words,
we employed both word and sub-word units as fillers, combined with
n-gram language models. We also introduce a methodology for optimizing
confidence thresholds to control the tradeoffs between acceptance,
confirmation, and rejection of utterances. Our experiments are based on
a Mexican Spanish auto-attendant system using the SpeechWorks recognizer
release 6.5 Second Edition, in which we achieved a reduction
in error of 8.9% as compared to the baseline system. Most of the error
reduction is attributed to better keyword detection in utterances that
contain both keywords and OOV words.
Peer Reviewed
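The three-way outcome mentioned in the abstract (acceptance, confirmation, rejection) can be pictured as two confidence thresholds applied to a recognizer score. The sketch below only illustrates that idea. The function name `decide`, the threshold values, and the assumption that confidence lies in [0, 1] are hypothetical, and the paper's method for optimizing the thresholds is not reproduced here.

```python
def decide(confidence: float,
           reject_below: float = 0.3,
           accept_above: float = 0.7) -> str:
    """Map a keyword confidence score to accept, confirm, or reject.

    Both thresholds are illustrative defaults, not values from the paper.
    """
    if not 0.0 <= reject_below <= accept_above <= 1.0:
        raise ValueError("expected 0 <= reject_below <= accept_above <= 1")
    if confidence >= accept_above:
        return "accept"    # confident enough to act on the keyword directly
    if confidence >= reject_below:
        return "confirm"   # uncertain: ask the caller to confirm
    return "reject"        # treat the utterance as containing no keyword


for score in (0.9, 0.5, 0.1):
    print(score, decide(score))
```

Raising `accept_above` moves more utterances into the confirmation band, and raising `reject_below` rejects more of them outright. This is the kind of trade-off a threshold optimization for a particular context would have to balance.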