Term-Dependent Confidence for Out-of-Vocabulary Term Detection
Within a spoken term detection (STD) system, the decision maker plays an important role in retrieving reliable detections. Most state-of-the-art STD systems make decisions based on a term-independent confidence measure, which poses a serious problem for out-of-vocabulary (OOV) term detection. In this paper, we study a term-dependent confidence measure based on confidence normalisation and discriminative modelling, focusing in particular on its effectiveness for detecting OOV terms. Experimental results indicate that the term-dependent confidence provides a much larger improvement for OOV terms than for in-vocabulary terms.
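The abstract does not say which normalisation is used. As a minimal sketch of what making a confidence term-dependent can mean, the example below assumes a simple per-term z-score normalisation: each raw confidence is rescaled using the mean and standard deviation of the detections for the same term. The function name `normalise_per_term` and the input layout are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalise_per_term(detections):
    """Z-normalise raw confidences within each term (illustrative assumption).

    `detections` is a list of (term, raw_confidence) pairs.
    Returns a list of (term, normalised_confidence) pairs in the same order.
    """
    # Group the raw confidences by the term they belong to.
    by_term = defaultdict(list)
    for term, conf in detections:
        by_term[term].append(conf)

    # Per-term mean and standard deviation; fall back to 1.0 when the
    # deviation is zero (e.g. a term with a single detection).
    stats = {}
    for term, confs in by_term.items():
        sd = pstdev(confs)
        stats[term] = (mean(confs), sd if sd > 0 else 1.0)

    return [
        (term, (conf - stats[term][0]) / stats[term][1])
        for term, conf in detections
    ]
```

After this rescaling, a single global threshold acts as a different raw-score threshold for each term, which is the general idea behind a term-dependent decision.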
Evolutionary discriminative confidence estimation for spoken term detection
The final publication is available at Springer via http://dx.doi.org/10.1007/s11042-011-0913-z
Spoken term detection (STD) is the task of searching for occurrences
of spoken terms in audio archives. It relies on robust confidence estimation
to make a hit/false alarm (FA) decision. In order to optimize the decision
in terms of the STD evaluation metric, the confidence has to be discriminative.
Multi-layer perceptrons (MLPs) and support vector machines (SVMs) exhibit
good performance in producing discriminative confidence; however, they are
severely limited by their continuous objective functions, and are therefore less
capable of dealing with complex decision tasks. This leads to a substantial
performance reduction when measuring detection of out-of-vocabulary (OOV)
terms, where the high diversity in term properties usually leads to a complicated
decision boundary.
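The paragraph above does not define the STD evaluation metric. Assuming it is the term-weighted value (TWV) used in the NIST 2006 STD evaluation, the sketch below shows how misses and false alarms are combined per term and then averaged over terms. The function name `term_weighted_value` and the input layout are hypothetical; the default `beta=999.9` is the value used in that evaluation.

```python
def term_weighted_value(counts, speech_seconds, beta=999.9):
    """Compute TWV from per-term detection counts (illustrative sketch).

    `counts` maps each term to (n_true, n_correct, n_false_alarm).
    `speech_seconds` is the duration of the searched audio in seconds.
    """
    total_cost = 0.0
    n_terms = 0
    for n_true, n_correct, n_false_alarm in counts.values():
        if n_true == 0:
            # Terms with no reference occurrence have no defined miss rate
            # and are left out of the average.
            continue
        p_miss = 1.0 - n_correct / n_true
        # Non-target trials: one trial per second of speech, minus the
        # true occurrences of the term.
        p_fa = n_false_alarm / (speech_seconds - n_true)
        total_cost += p_miss + beta * p_fa
        n_terms += 1
    if n_terms == 0:
        raise ValueError("no term has a reference occurrence")
    return 1.0 - total_cost / n_terms
```

Because every term contributes equally to the average regardless of how often it occurs, a decision threshold that suits frequent terms can be costly for rare ones, which is one reason the confidence needs to be discriminative with respect to this metric.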
In this paper we present a new discriminative confidence estimation approach
based on evolutionary discriminant analysis (EDA). Unlike MLPs and
SVMs, EDA uses the classification error as its objective function, resulting
in a model optimized towards the evaluation metric. In addition, EDA combines
heterogeneous projection functions and classification strategies in decision
making, leading to a highly flexible classifier that is capable of dealing
with complex decision tasks. Finally, the evolutionary strategy of EDA reduces the risk of becoming trapped in local minima. We tested the EDA-based confidence with a
state-of-the-art phoneme-based STD system on an English meeting domain
corpus, which employs a phoneme speech recognition system to produce lattices
within which the phoneme sequences corresponding to the enquiry terms
are searched. The test corpora comprise 11 hours of speech data recorded with
individual head-mounted microphones from 30 meetings carried out at several
institutes including ICSI; NIST; ISL; LDC; the Virginia Polytechnic Institute
and State University; and the University of Edinburgh. The experimental results
demonstrate that EDA considerably outperforms MLPs and SVMs on
both classification and confidence measurement in STD, and the advantage
is found to be more significant on OOV terms than on in-vocabulary (INV)
terms. In terms of classification performance, EDA achieved an equal error
rate (EER) of 11% on OOV terms, compared to 34% and 31% with MLPs and
SVMs respectively; for INV terms, an EER of 15% was obtained with EDA
compared to 17% obtained with MLPs and SVMs. In terms of STD performance
for OOV terms, EDA presented a significant relative improvement of
1.4% and 2.5% in terms of average term-weighted value (ATWV) over MLPs
and SVMs respectively.
This work was partially supported by the French Ministry of Industry
(Innovative Web call) under contract 09.2.93.0966, “Collaborative Annotation for Video
Accessibility” (ACAV) and by “The Adaptable Ambient Living Assistant” (ALIAS) project
funded through the joint national Ambient Assisted Living (AAL) programme
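The classification results in the abstract above are reported as equal error rates (EER). As a minimal sketch of how such a figure can be obtained from scored detections, the example below sweeps the decision threshold over the observed confidences and reports the point where the miss rate and the false alarm rate are closest. The function name `equal_error_rate` and the input layout are hypothetical, and this simple sweep is not necessarily the procedure used in the paper.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by a threshold sweep (illustrative sketch).

    `scores` are detection confidences; `labels` are True for a hit and
    False for a false alarm. A detection is accepted when score >= threshold.
    """
    n_pos = sum(1 for label in labels if label)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one hit and one false alarm")

    best_gap, best_eer = None, None
    for threshold in sorted(set(scores)):
        # Miss rate: hits rejected at this threshold.
        miss = sum(1 for s, l in zip(scores, labels) if l and s < threshold) / n_pos
        # False alarm rate: false alarms accepted at this threshold.
        fa = sum(1 for s, l in zip(scores, labels) if not l and s >= threshold) / n_neg
        gap = abs(miss - fa)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (miss + fa) / 2.0
    return best_eer
```

With a finite set of detections the two error rates rarely coincide exactly, so the sketch returns their average at the closest crossing point.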