2 research outputs found

    A Broadcast News Corpus for Evaluation and Tuning of German LVCSR Systems

    Transcription of broadcast news is an interesting and challenging application for large-vocabulary continuous speech recognition (LVCSR). We present in detail the structure of a manually segmented and annotated corpus including over 160 hours of German broadcast news, and propose it as an evaluation framework of LVCSR systems. We show our own experimental results on the corpus, achieved with a state-of-the-art LVCSR decoder, measuring the effect of different feature sets and decoding parameters, and thereby demonstrate that real-time decoding of our test set is feasible on a desktop PC at 9.2% word error rate. Comment: submitted to INTERSPEECH 2010 on May 3, 201
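    The two figures quoted in this abstract, word error rate and real-time decoding, can be illustrated with a minimal sketch. It assumes the conventional definitions (word-level edit distance divided by reference length, and decoding time divided by audio duration); the function names are hypothetical and do not come from the paper, and the authors' actual scoring setup may differ, for example in text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """Decoding time per second of audio; values <= 1.0 mean real-time or faster."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return decode_seconds / audio_seconds
```

    Under these definitions, a hypothesis with one substituted and one deleted word against a four-word reference scores a word error rate of 0.5, and a decoder that needs 30 seconds for 60 seconds of audio has a real-time factor of 0.5.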

    DUcoder - The Duisburg University LVCSR Stackdecoder

    With this paper, we present the DUcoder, the LVCSR decoder developed at Duisburg University. The decoder performs the Viterbi search for the most probable word sequence in recognition systems that make use of HMMs and backoff N-gram language models. In principle, the decoding strategy is similar to the one of the so-called stackdecoders. During the development of the decoder, emphasis has been laid upon innovations for rapidly speeding up decoding by carefully performing approximations. Besides a brief presentation of the decoder's overall design, this paper points out the crucial issues with respect to speed and recognition performance. Evaluations are carried out on a German LVCSR system with a vocabulary of 100,000 words, word-internal triphones and a trigram language model. Close-to-real-time performance is achieved with 12% additional error while a decoder configuration which runs in around 40 times real-time causes no search error on the evaluation set. 1. INTRODUCTION The realiz..