Search CORE

4 research outputs found

Automatic Out-of-Language Detection Based on Confidence Measures Derived fromLVCSR Word and Phone Lattices

Author: Motlicek Petr
Publication venue
Publication date: 11/02/2010
Field of study

Confidence Measures (CMs) estimated from Large Vocabulary Continuous Speech Recognition (LVCSR) outputs are commonly used metrics to detect incorrectly recognized words. In this paper, we propose to exploit CMs derived from frame-based word and phone posteriors to detect speech segments containing pronunciations from non-target (alien) languages. The LVCSR system used is built for English, which is the target language, with medium-size recognition vocabulary (5k words). The efficiency of detection is tested on a set comprising speech from three different languages (English, German, Czech). Results achieved indicate that employment of specific temporal context (integrated in the word or phone level) significantly increases the detection accuracies. Furthermore, we show that combination of several CMs can also improve the efficiency of detection

Infoscience - École polytechnique fédérale de Lausanne

Improving ASR error detection with non-decoder based features

Author: Isabel Trancoso
Thomas Pellegrini
Publication venue
Publication date: 01/01/2010
Field of study

Abstract This study reports error detection experiments in large vocabulary automatic speech recognition (ASR) systems, by using statistical classifiers. We explored new features gathered from other knowledge sources than the decoder itself: a binary feature that compares outputs from two different ASR systems (word by word), a feature based on the number of hits of the hypothesized bigrams, obtained by queries entered into a very popular Web search engine, and finally a feature related to automatically infered topics at sentence and word levels. Experiments were conducted on a European Portuguese broadcast news corpus. The combination of baseline decoder-based features and two of these additional features led to significant improvements, from 13.87% to 12.16% classification error rate (CER) with a maximum entropy model, and from 14.01% to 12.39% CER with linear-chain conditional random fields, comparing to a baseline using only decoder-based features

CiteSeerX

Maximum Entropy Confidence Estimation for Speech Recognition

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref