    Fast speaker adaptation using non-negative matrix factorization

    This paper describes a new method for fast speaker adaptation in large-vocabulary recognition systems. As in most HMM-based recognizers, the observation densities are modeled as a weighted sum of Gaussian densities. Instead of adapting the means of the Gaussian densities, as is typically done, the method adapts the weights of the Gaussian densities in the states. Applying non-negative matrix factorization (NMF) to these weights yields very fast adaptation. Experiments on the Wall Street Journal benchmark recognition task show relative improvements between 5% and 15%, while the adaptation converges within 0.2 seconds. Analysis of the latent speakers found by NMF shows that they reflect the speaker's gender most prominently, even when vocal tract length normalization is used, and that they reflect the speaker's age more clearly than the speaker's regional influences or dialect. ©2008 IEEE.

    Duchateau J., Leroy T., Demuynck K., Van hamme H., "Fast speaker adaptation using non-negative matrix factorization", Proceedings IEEE international conference on acoustics, speech, and signal processing - ICASSP’2008, pp. 4269-4272, March 30 - April 4, 2008, Las Vegas, Nevada, USA.

    status: published
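The weight-adaptation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each training speaker is represented by a column of stacked, non-negative Gaussian weights, that NMF factors this matrix into a small set of "latent speakers", and that a new speaker is adapted by estimating only the few mixing coefficients over those latent speakers. The function names `nmf` and `adapt`, the matrix layout, and the Frobenius-norm multiplicative updates are all assumptions chosen for brevity; the paper may use a different objective and estimation procedure.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n_weights x n_speakers) as W @ H.

    Uses multiplicative updates for the Frobenius objective (an assumed
    choice for this sketch). Columns of W play the role of latent speakers.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def adapt(W, v_new, n_iter=50, eps=1e-9):
    """Keep the latent speakers W fixed and fit only the coefficients h.

    v_new is a (possibly noisy) weight estimate for the new speaker.
    Returns the adapted weight vector W @ h, which stays non-negative.
    """
    h = np.full(W.shape[1], 1.0 / W.shape[1])
    for _ in range(n_iter):
        h *= (W.T @ v_new) / (W.T @ W @ h + eps)
    return W @ h

# Hypothetical toy data: 40 stacked weights, 12 training speakers, 2 latent speakers.
rng = np.random.default_rng(1)
V = rng.random((40, 2)) @ rng.random((2, 12))
W, H = nmf(V, rank=2)
adapted = adapt(W, V[:, 0])
```

Because only `rank` coefficients are estimated per new speaker, the adaptation step is cheap, which is in the spirit of the fast convergence reported in the abstract. In a real recognizer the adapted weights would also need to be renormalized so that they sum to one within each state; that step is omitted here.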