In this paper we present a new approach to combining stochastic language models and traditional linguistic models to enhance the performance of our spontaneous speech recognizer. We compile arbitrarily large linguistic context dependencies into a category-based bigram model, which allows us to use a standard beam-search-driven forward Viterbi algorithm for real-time decoding. Since this recognizer is used in a dialog system, information about the last system utterance is used to build dialog-step-dependent language models. This setup is verified and tested on our corpus of spontaneous speech utterances collected with our dialog system. Experimental results show a significant reduction in word error rate.

1. INTRODUCTION

In recent years it has been shown that taking language constraints into account is vital for effective and efficient speech recognition. Typically, these language constraints are modeled in a so-called language model, which restricts the allowed sequences of words.
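To make the idea of a category-based bigram concrete, the following is a minimal sketch of the generic class-based formulation, in which P(w_i | w_{i-1}) is approximated by P(c_i | c_{i-1}) * P(w_i | c_i). It is not the model described in this paper: the word-to-category map, the toy sentences, the function names, and the unsmoothed relative-frequency estimates are all assumptions made for illustration, and the compilation of linguistic context dependencies into categories is not shown.

```python
import math
from collections import Counter

def train_category_bigram(sentences, word2cat):
    """Estimate P(c_i | c_{i-1}) and P(w | c) by relative frequency (no smoothing)."""
    cat_bigrams = Counter()      # counts of (previous category, category)
    cat_context = Counter()      # counts of the previous category as context
    cat_counts = Counter()       # counts of each category
    word_cat_counts = Counter()  # counts of (word, category)
    for words in sentences:
        prev = "<s>"
        for w in words:
            c = word2cat[w]
            cat_bigrams[(prev, c)] += 1
            cat_context[prev] += 1
            cat_counts[c] += 1
            word_cat_counts[(w, c)] += 1
            prev = c
        # Close every sentence with an end-of-sentence category.
        cat_bigrams[(prev, "</s>")] += 1
        cat_context[prev] += 1
    trans = {k: v / cat_context[k[0]] for k, v in cat_bigrams.items()}
    emit = {k: v / cat_counts[k[1]] for k, v in word_cat_counts.items()}
    return trans, emit

def log_prob(words, word2cat, trans, emit):
    """Log-probability of a word sequence; -inf if the model disallows it."""
    prev = "<s>"
    total = 0.0
    for w in words:
        c = word2cat.get(w)
        p = trans.get((prev, c), 0.0) * emit.get((w, c), 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
        prev = c
    p_end = trans.get((prev, "</s>"), 0.0)
    return total + math.log(p_end) if p_end > 0.0 else float("-inf")

# Hypothetical toy data: categories and sentences are invented for this sketch.
word2cat = {"i": "PRON", "want": "VERB", "need": "VERB",
            "a": "DET", "ticket": "NOUN", "train": "NOUN"}
sentences = [["i", "want", "a", "ticket"], ["i", "need", "a", "train"]]
trans, emit = train_category_bigram(sentences, word2cat)

# Unseen word sequence, but its category sequence was observed in training.
seen_pattern = log_prob(["i", "want", "a", "train"], word2cat, trans, emit)
# Category sequence never observed, so the model rejects it.
unseen_pattern = log_prob(["ticket", "a"], word2cat, trans, emit)
```

In this toy setting the sequence "i want a train" never occurs in the training data, yet it receives a finite score because its category sequence was observed, while "ticket a" is rejected outright. This shows in miniature how a language model of this kind restricts the allowed word sequences while still generalizing across words that share a category.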