A Broadcast News Corpus for Evaluation and Tuning of German LVCSR Systems
Transcription of broadcast news is an interesting and challenging application
for large-vocabulary continuous speech recognition (LVCSR). We present in
detail the structure of a manually segmented and annotated corpus including
over 160 hours of German broadcast news, and propose it as an evaluation
framework for LVCSR systems. We show our own experimental results on the corpus,
achieved with a state-of-the-art LVCSR decoder, measuring the effect of
different feature sets and decoding parameters, and thereby demonstrate that
real-time decoding of our test set is feasible on a desktop PC at 9.2% word
error rate.
Comment: submitted to INTERSPEECH 2010 on May 3, 201
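The two figures of merit quoted in the abstract, word error rate and real-time decoding, can be made concrete with a small sketch. It is only an illustration of the standard definitions (word-level edit distance divided by reference length, and decoding time divided by audio duration), not the scoring procedure the authors used; the function names `word_error_rate` and `real_time_factor` are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """Decoding time per second of audio; values <= 1.0 mean real-time decoding."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return decode_seconds / audio_seconds
```

Under these definitions, a 9.2% word error rate corresponds to `word_error_rate(...)` returning 0.092 over the whole test set, and "real-time" corresponds to a real-time factor of at most 1.0.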
DUcoder - The Duisburg University LVCSR Stackdecoder
With this paper, we present the DUcoder, the LVCSR decoder developed at Duisburg University. The decoder performs the Viterbi search for the most probable word sequence in recognition systems that make use of HMMs and backoff N-gram language models. In principle, the decoding strategy is similar to the one of the so-called stackdecoders. During the development of the decoder, emphasis has been laid upon innovations for rapidly speeding up decoding by carefully performing approximations. Besides a brief presentation of the decoder's overall design, this paper points out the crucial issues with respect to speed and recognition performance. Evaluations are carried out on a German LVCSR system with a vocabulary of 100,000 words, word-internal triphones and a trigram language model. Close-to-real-time performance is achieved with 12% additional error while a decoder configuration which runs in around 40 times real-time causes no search error on the evaluation set.
1. INTRODUCTION The realiz..
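As a rough illustration of what a backoff N-gram language model lookup involves, the sketch below scores a word given its history and falls back to shorter histories when the full N-gram is unseen. It assumes ARPA-style tables of log10 probabilities and backoff weights; the table contents, the names `LOGPROB` and `BACKOFF`, and the function `backoff_logprob` are hypothetical and are not taken from the DUcoder.

```python
# Hypothetical toy model: log10 probabilities of known N-grams and
# log10 backoff weights of known histories (values are made up).
LOGPROB = {
    ("guten",): -1.0,
    ("tag",): -1.25,
    ("abend",): -1.5,
    ("guten", "tag"): -0.25,
}
BACKOFF = {
    ("guten",): -0.5,
}


def backoff_logprob(word: str, history: tuple = ()) -> float:
    """Return log10 P(word | history), backing off to shorter histories."""
    ngram = tuple(history) + (word,)
    if ngram in LOGPROB:
        # The full N-gram was observed: use its stored probability.
        return LOGPROB[ngram]
    if not history:
        # Even the unigram is unknown; this toy model has no <unk> entry.
        raise KeyError(f"word not in vocabulary: {word!r}")
    # Unseen N-gram: add the backoff weight of the history (0.0 if the
    # history itself is unseen) and retry with the oldest word dropped.
    weight = BACKOFF.get(tuple(history), 0.0)
    return weight + backoff_logprob(word, tuple(history)[1:])
```

With these toy tables, `backoff_logprob("tag", ("guten",))` uses the stored bigram, while `backoff_logprob("abend", ("guten",))` combines the backoff weight of `("guten",)` with the unigram score of `"abend"`.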