20 research outputs found

    Utterance Augmentation for Speaker Recognition

    Get PDF
    The speaker recognition problem is to automatically recognize a person from their voice. The training of a speaker recognition model typically requires a very large training corpus, e.g., multiple voice samples from a very large number of individuals. In the diverse domains of application of speaker recognition, it is often impractical to obtain a training corpus of the requisite size. This disclosure describes techniques that augment utterances, e.g., by cutting, splitting, shuffling, etc., such that the need for collections of raw voice samples from individuals is substantially reduced. In effect, the original model works better on the augmented utterances on the target domain

    Combination strategies for a factor analysis phone-conditioned speaker verification system

    Get PDF
    This work aims to take advantage of recent developments in joint factor analysis (JFA) in the context of a phonetically conditioned GMM speaker verification system. Previous work has shown performance advantages through phonetic conditioning, but this has not been shown to date with the JFA framework. Our focus is particularly on strategies for combining the phone-conditioned systems. We show that the classic fusion of the scores is suboptimal when using multiple GMM systems. We investigate several combination strategies in the model space, and demonstrate improvement over score-level combination as well as over a non-phonetic baseline system. This work was conducted during the 2008 CLSP Workshop at Johns Hopkins University

    Feature Warping for Robust Speaker Verification

    Get PDF
    We propose a novel feature mapping approach that is robust to channel mismatch, additive noise and to some extent, non-linear effects attributed to handset transducers. These adverse effects can distort the short-term distribution of the speech features. Some methods have addressed this issue by conditioning the variance of the distribution, but not to the extent of conforming the speech statistics to a target distribution. The proposed target mapping method warps the distribution of a cepstral feature stream to a standardised distribution over a specified time interval. We evaluate a number of the enhancement methods for speaker verification, and compare them against a Gaussian target mapping implementation. Results indicate improvements of the warping technique over a number of methods such as Cepstral Mean Subtraction (CMS), modulation spectrum processing, and short-term windowed CMS and variance normalisation

    Unsupervised Evaluation of Speaker Verification Systems

    No full text
    A method for blind estimation of DET curves for speaker verification systems is proposed. Verification error probabilities are estimated on a database where speaker identities are unknown. The database must provide a set of impostor-only tests as well as a set of mixed impostor and target tests. This method is tested on 9 speaker verification systems that were scored on the NIST 2000 database. Good DET estimates are obtained for systems with low error rates, while poorer estimates are obtained for systems with high error rates

    Within-session variability modelling for factor analysis speaker verification

    Get PDF
    This work presents an extended Joint Factor Analysis model including explicit modelling of unwanted within-session variability. The goals of the proposed extended JFA model are to improve verification performance with short utterances by compensating for the effects of limited or imbalanced phonetic coverage, and to produce a flexible JFA model that is effective over a wide range of utterance lengths without adjusting model parameters such as retraining session subspaces. Experimental results on the 2006 NIST SRE corpus demonstrate the flexibility of the proposed model by providing competitive results over a wide range of utterance lengths without retraining and also yielding modest improvements in a number of conditions over current state-of-the-art
    corecore