PROBABILISTIC FEATURE TRANSFORMATION FOR CHANNEL ROBUST SPEAKER VERIFICATION

Abstract

Feature transformation plays an important role in robust speaker verification over telephone networks. This paper compares several feature transformation techniques and evaluates their verification performance and computation time under the 2000 NIST speaker recognition evaluation protocol. The techniques compared include feature mapping (FM), stochastic feature transformation (SFT), and blind stochastic feature transformation (BSFT). The paper proposes a probabilistic feature mapping (PFM) in which the mapped features depend not only on the top-1 decoded Gaussian but also on the posterior probabilities of the other Gaussians in the root model. The paper also proposes speeding up the computation of the PFM and BSFT parameters by considering only the top few Gaussians. Results show that PFM performs slightly better than FM and that the fast approach can reduce computation time substantially. Among the approaches investigated, the fast BSFT is found to have the highest potential for robust speaker verification over telephone networks because it can achieve good performance without any a priori knowledge of the communication channel. Fusing the scores derived from systems using BSFT and PFM is also found to reduce the error rate further.
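The difference between top-1 mapping and posterior-weighted mapping can be illustrated with a minimal sketch. The abstract does not give the mapping equations, so everything below is an assumption made for illustration: diagonal-covariance Gaussians, a channel-dependent model whose components correspond index-by-index to those of a channel-independent root model, a per-Gaussian shift-and-scale mapping, and the hypothetical names `posteriors`, `map_with_gaussian`, and `feature_map`. Here the posteriors are computed under the channel-dependent model, which may differ from the paper's formulation. Setting `top_k=1` gives a hard, FM-style mapping; a larger `top_k` weights the per-Gaussian mappings by renormalised posteriors, in the spirit of PFM restricted to the top few Gaussians.

```python
import numpy as np

def posteriors(x, weights, means, variances):
    """Posterior of each diagonal-covariance Gaussian given one frame x."""
    log_p = (np.log(weights)
             - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
             - 0.5 * np.sum((x - means) ** 2 / variances, axis=1))
    log_p -= log_p.max()              # stabilise before exponentiating
    p = np.exp(log_p)
    return p / p.sum()

def map_with_gaussian(x, i, cd_means, cd_vars, ci_means, ci_vars):
    """Shift and scale x from channel-dependent Gaussian i to its
    channel-independent counterpart (assumed form of the mapping)."""
    scale = np.sqrt(ci_vars[i] / cd_vars[i])
    return (x - cd_means[i]) * scale + ci_means[i]

def feature_map(x, weights, cd_means, cd_vars, ci_means, ci_vars, top_k=1):
    """top_k=1: hard top-1 mapping; top_k>1: posterior-weighted mapping
    over the top_k most likely Gaussians (illustrative sketch only)."""
    post = posteriors(x, weights, cd_means, cd_vars)
    top = np.argsort(post)[::-1][:top_k]      # indices of the top_k Gaussians
    w = post[top] / post[top].sum()           # renormalise over the kept ones
    mapped = np.stack([
        map_with_gaussian(x, i, cd_means, cd_vars, ci_means, ci_vars)
        for i in top
    ])
    return w @ mapped                         # weighted sum of mapped frames
```

Restricting the sum to `top_k` components is what keeps the cost low: only a few per-Gaussian mappings are evaluated per frame instead of one for every Gaussian in the model.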
