Abstract—The discriminative n-gram modeling approach re-ranks the N-best hypotheses generated during decoding and can effectively improve the performance of large-vocabulary continuous speech recognition (LVCSR). This work recasts the discriminative n-gram model as a pseudo-conventional n-gram model. This recasting allows the power of discriminative n-gram modeling to be conveniently incorporated into a single-pass decoding procedure. We also propose an efficient method for applying the pseudo model to rescore the recognition lattices generated during decoding. Experimental results show that when the test data is similar in nature to the training data, rescoring the recognition lattices with the pseudo model achieves better performance and efficiency than discriminative N-best re-ranking (i.e., re-ranking the N-best hypotheses with the discriminative n-gram model). We demonstrate that in this case, applying the pseudo model in decoding can be even more advantageous. However, when the test data differs in nature from the training data, discriminative N-best re-ranking may offer greater benefits than pseudo-model-based lattice rescoring or decoding. Based on the pseudo-conventional n-gram representation, we also investigate the feasibility of combining discriminative n-gram modeling with other recognition post-processes and demonstrate that cumulative performance improvements can be achieved.

Index Terms—Discriminative n-gram modeling, large-vocabulary continuous speech recognition (LVCSR)