For voice-controlled car navigation systems, multilinguality is a major challenge. The use case is clear: users drive to other countries and need to enter foreign city names, while they will most likely keep using their native language for all other commands. An important aspect is that the utterances these users produce differ from those of native speakers, because they carry a non-native accent. Our work is motivated by two observations. First, people hear better at low frequencies and know that low frequencies matter more for producing understandable utterances in the foreign language, so they first aim to copy the low-frequency behavior of that language. Second, changes in the mid to high frequencies are caused by small tongue movements, and these subtle changes are hard for non-native speakers to control. Together, these two effects cause non-native speech to differ more strongly from native speech at mid-range frequencies. We therefore analyze whether speech recognition for non-native speakers can be improved by lowering the influence of mid to high frequencies. We achieve this by increasing some of the variances of the Gaussians, which reduces the influence of differences in the corresponding frequency band on the likelihood output of a Gaussian. In this way we can model the selective mismatch between native training data and non-native test data.
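The effect of increasing variances can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single diagonal-covariance Gaussian whose feature dimensions each correspond to one frequency band (for example log filter-bank energies), and the function names, the band indices, and the inflation factor are all hypothetical choices made for illustration.

```python
import math


def diag_gaussian_loglik(x, mean, var):
    """Log-likelihood of x under a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def inflate_variances(var, band, factor):
    """Return a copy of var with the dimensions listed in band scaled by factor.

    band and factor are illustrative assumptions, not values from the paper.
    """
    return [v * factor if i in band else v for i, v in enumerate(var)]


def mismatch_penalty(x_ref, x_test, mean, var):
    """Log-likelihood lost by x_test relative to x_ref under the same Gaussian."""
    return diag_gaussian_loglik(x_ref, mean, var) - diag_gaussian_loglik(x_test, mean, var)


# Toy model: dimensions 0-1 stand for low bands, 2-3 for mid to high bands.
mean = [0.0, 0.0, 0.0, 0.0]
var = [1.0, 1.0, 1.0, 1.0]

native = [0.0, 0.0, 0.0, 0.0]      # matches the model in every band
non_native = [0.0, 0.0, 2.0, 2.0]  # deviates only in the mid to high bands

inflated = inflate_variances(var, band={2, 3}, factor=4.0)

penalty_original = mismatch_penalty(native, non_native, mean, var)
penalty_inflated = mismatch_penalty(native, non_native, mean, inflated)
print(penalty_original, penalty_inflated)
```

In this toy setting the penalty for the mid to high band deviation falls from about 4 to about 1 in log-likelihood, because each squared deviation is divided by a variance four times larger. The low-band dimensions are untouched, so a mismatch there would still be penalized as before.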

This paper was published in CiteSeerX.