
    Selection Criteria for Hypothesis Driven Lexical Adaptation

    Adapting the vocabulary of a speech recognizer to the utterance being recognized has proven successful in reducing both high out-of-vocabulary rates and word error rates. This applies especially to languages whose vocabulary grows rapidly because of a large number of inflections and compounds. This paper presents various adaptation methods within the Hypothesis Driven Lexical Adaptation (HDLA) framework, which allow speech recognition on a virtually unlimited vocabulary. Selection criteria for the adaptation process are based either on morphological knowledge or on distance measures at the phoneme or grapheme level. Different methods are introduced for determining distances between phoneme pairs and for creating the large fallback lexicon from which the adapted vocabulary is chosen. HDLA reduces the out-of-vocabulary rate by 55 % for Serbo-Croatian, 35 % for German and 27 % for Turkish. The reduced out-of-vocabulary rate also decreases the word error rate by 4.1 % absolute, to 25.4 %, on Serbo-Croatian broadcast news data.

    2. THE SPEECH RECOGNITION ENGINE

    The speech recognition system used for all experiments on transcribing Serbo-Croatian broadcast news shows is trained on 12 hours of read newspaper articles and 18 hours of recorded broadcast news. It is based on 35 phones that are modeled by left-to-right HMMs. Preprocessing extracts an MFCC-based feature vector every 10 ms. The final feature vector is computed by a truncated LDA transformation of the concatenation of the MFCCs and their first- and second-order derivatives. Vocal tract length normalization and cepstral mean subtraction are used to attenuate speaker and channel differences. The language models are trained on the hand-transcribed acoustic training data and an additional 11.8 million words of text data collected from the internet.
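
    Two of the front-end steps described above, cepstral mean subtraction and the concatenation of MFCCs with their first- and second-order derivatives, can be illustrated with a minimal sketch. The text does not say how the derivatives are computed, so the simple symmetric difference used here is an assumption, and all function names are hypothetical. The MFCC extraction itself, vocal tract length normalization and the truncated LDA transformation are omitted because they depend on signal processing and trained parameters that are not given.

```python
from typing import List

Frame = List[float]


def cepstral_mean_subtraction(frames: List[Frame]) -> List[Frame]:
    """Subtract the per-utterance mean of every coefficient."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]


def deltas(frames: List[Frame]) -> List[Frame]:
    """Assumed derivative estimate: (x[t+1] - x[t-1]) / 2.

    The first and last frames are repeated at the utterance edges.
    """
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def stack_with_derivatives(frames: List[Frame]) -> List[Frame]:
    """Concatenate each frame with its first- and second-order derivatives."""
    d1 = deltas(frames)
    d2 = deltas(d1)
    return [f + a + b for f, a, b in zip(frames, d1, d2)]


# Toy two-dimensional "MFCC" frames, one per 10 ms step (made-up values).
toy_frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
normalized = cepstral_mean_subtraction(toy_frames)
stacked = stack_with_derivatives(normalized)
```

    In this sketch each stacked vector is three times as long as the input frame; in the system described above, a truncated LDA transformation would then map such a vector to the final feature vector.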
Performance of the baseline system, which has an out-of-vocabulary rate of 8.7 %, and the results achieved with HDLA are shown in Table 1 below.
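
To make the out-of-vocabulary rate concrete, the following sketch computes it on a toy example and shows how a grapheme-level distance might be used to pull additional words from a fallback lexicon. It is not the selection procedure of the paper: the unit-cost Levenshtein distance, the threshold `max_distance`, the function names and the example words are all assumptions chosen for illustration.

```python
from typing import Iterable, List, Set


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance between two grapheme strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def select_from_fallback(hypothesis: Iterable[str],
                         fallback_lexicon: Iterable[str],
                         max_distance: int = 2) -> Set[str]:
    """Pick fallback words that lie close to any first-pass hypothesis word."""
    candidates = list(fallback_lexicon)
    selected = set()
    for hyp in hypothesis:
        for cand in candidates:
            if levenshtein(hyp, cand) <= max_distance:
                selected.add(cand)
    return selected


def oov_rate(reference: List[str], vocabulary: Iterable[str]) -> float:
    """Percentage of reference word tokens not covered by the vocabulary."""
    if not reference:
        return 0.0
    vocab = set(vocabulary)
    missing = sum(1 for word in reference if word not in vocab)
    return 100.0 * missing / len(reference)


# Made-up German example: the inflected form "kinder" is missing at first.
reference = ["die", "kinder", "spielen", "im", "garten"]
baseline_vocab = {"die", "kind", "spielen", "im", "garten"}
first_pass = ["die", "kind", "spielen", "im", "garten"]
fallback = ["kinder", "kindes", "gaerten", "spielt", "haus"]

added = select_from_fallback(first_pass, fallback)
rate_before = oov_rate(reference, baseline_vocab)
rate_after = oov_rate(reference, baseline_vocab | added)
```

The sketch compares every hypothesis word with every fallback entry, which is only practical for a toy lexicon; a large fallback lexicon would need some form of indexing or pruning.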