We study a category of robust speech recognition problem in which mismatches exist between training and testing conditions, and no accurate knowledge of the mismatch mechanism is available. The only available information is the test data along with a set of pretrained Gaussian mixture continuous density hidden Markov models (CDHMMs). We investigate the problem from the viewpoint of Bayesian prediction. A simple prior distribution, namely constrained uniform distribution, is adopted to characterize the uncertainty of the mean vectors of the CDHMMs. Two methods, namely a model compensation technique based on Bayesian predictive density and a robust decision strategy called Viterbi Bayesian predictive classification are studied. The proposed methods are compared with the conventional Viterbi decoding algorithm in speaker-independent recognition experiments on isolated digits and TI connected digit strings (TIDTGITS), where the mismatches between training and testing conditions are caused by: (1) additive Gaussian white noise, (2) each of 25 types of actual additive ambient noises, and (3) gender difference. The experimental results show that the adopted prior distribution and the proposed techniques help to improve the performance robustness under the examined mismatch conditions.published_or_final_versio

Hirose, K

Huo, Q

Jiang, H

IEEE Transactions on Speech and Audio Processing

... recognition problem in which mismatches exist between training and testing conditions, and no accurate knowledge of the mismatch mechanism is available. The only available information is the test data along with a set of pretrained Gaussian mixture continuous density hidden Markov models (CDHMM’s). We investigate the problem from the viewpoint of Bayesian predic-tion. A simple prior distribution, namely constrained uniform distribution, is adopted to characterize the uncertainty of the mean vectors of the CDHMM’s. Two methods, namely a model compensation technique based on Bayesian predictive density and a robust decision strategy called Viterbi Bayesian predictive classi-fication are studied. The proposed methods are compared with the conventional Viterbi decoding algorithm in speaker-independent recognition experiments on isolated digits and TI connected digit strings (TIDIGITS), where the mismatches between training and testing conditions are caused by: 1) additive Gaussian white noise, 2) each of 25 types of actual additive ambient noises, and 3) gender difference. The experimental results show that the adopted prior distribution and the proposed techniques help to improve the performance robustness under the examined mismatch conditions

Hui Jiang

Keikichi Hirose

Qiang Huo

CiteSeerX

Robust speech recognition based on a Bayesian prediction approach

HKU Scholars Hub

Abstract—In this paper, we study a category of robust speech recognition problem in which mismatches exist between training and testing conditions, and no accurate knowledge of the mis-match mechanism is available. The only available information is the test data along with a set of pretrained Gaussian mixture continuous density hidden Markov models (CDHMM’s). We investigate the problem from the viewpoint of Bayesian predic-tion. A simple prior distribution, namely constrained uniform distribution, is adopted to characterize the uncertainty of the mean vectors of the CDHMM’s. Two methods, namely a model compensation technique based on Bayesian predictive density and a robust decision strategy called Viterbi Bayesian predictive classi-fication are studied. The proposed methods are compared with the conventional Viterbi decoding algorithm in speaker-independent recognition experiments on isolated digits and TI connected digit strings (TIDIGITS), where the mismatches between training and testing conditions are caused by: 1) additive Gaussian white noise, 2) each of 25 types of actual additive ambient noises, and 3) gender difference. The experimental results show that the adopted prior distribution and the proposed techniques help to improve the performance robustness under the examined mismatch conditions. Index Terms—Bayesian predictive classification, minimax de-cision, plug-in maximum a posteriori decision, predictive density, Viterbi Bayesian predictive classification. I

http://140.112.21.8/previous_version/ASR-Bayesian(SAP-1999).pdf

Robust speech recognition based on a Bayesian prediction approach

Abstract

Similar works

Full text

Available Versions

CiteSeerX

HKU Scholars Hub

CiteSeerX