ASR-Free Pronunciation Assessment
Most pronunciation assessment methods are based on local features
derived from automatic speech recognition (ASR), e.g., the Goodness of
Pronunciation (GOP) score. In this paper, we investigate an ASR-free scoring
approach that is derived from the marginal distribution of raw speech signals.
The hypothesis is that even without knowledge of the language (and thus without
the ability to recognize its phones or words), we can still judge how good a
pronunciation is by comparing it with speech data from the target language. Our
analysis shows that this new scoring approach provides an interesting
correction for the phone-competition problem of GOP. Experimental results on
the ERJ dataset demonstrate that combining the ASR-free score with GOP
achieves better performance than the GOP baseline.
Comment: submitted to INTERSPEECH 202
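
To make the two kinds of score in the abstract concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's implementation: `gop_score` uses one common posterior-based form of GOP (the target phone's log-posterior minus the best competing log-posterior, averaged over frames), `asr_free_score` assumes per-frame log-likelihoods log p(x) are already available from some generative model trained on target-language speech, and `combined_score` assumes a simple linear interpolation, which the abstract does not specify. All function names and the `weight` parameter are hypothetical.

```python
import math
from typing import Sequence


def gop_score(frame_log_posteriors: Sequence[Sequence[float]], target_phone: int) -> float:
    """Posterior-based GOP variant (assumed form, not taken from the paper).

    For each frame, take log P(target phone | frame) minus the largest
    log-posterior over all phones, then average over the frames.
    The result is 0 when the target phone wins every frame, negative otherwise.
    """
    if not frame_log_posteriors:
        raise ValueError("need at least one frame")
    total = 0.0
    for frame in frame_log_posteriors:
        total += frame[target_phone] - max(frame)
    return total / len(frame_log_posteriors)


def asr_free_score(frame_log_likelihoods: Sequence[float]) -> float:
    """Average per-frame log p(x) under a hypothetical marginal model of speech.

    No phone labels are involved, so the score does not depend on how
    phones compete with one another.
    """
    if not frame_log_likelihoods:
        raise ValueError("need at least one frame")
    return sum(frame_log_likelihoods) / len(frame_log_likelihoods)


def combined_score(gop: float, asr_free: float, weight: float = 0.5) -> float:
    """Linear interpolation of the two scores (an assumed combination rule)."""
    return (1.0 - weight) * gop + weight * asr_free


if __name__ == "__main__":
    # Two frames, two phones; the target phone (index 0) loses the second frame.
    posteriors = [
        [math.log(0.8), math.log(0.2)],
        [math.log(0.25), math.log(0.75)],
    ]
    gop = gop_score(posteriors, target_phone=0)
    free = asr_free_score([-4.0, -6.0])
    print(gop, free, combined_score(gop, free, weight=0.3))
```

In practice the two scores live on different scales, so they would typically be normalized, or `weight` tuned on held-out data, before being combined.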