Utterance Augmentation for Speaker Recognition

Authors: Chen Zhengying; Chu Andrea; Fang Yeming; Feng Gang; Moreno Mengibar Pedro; Moreno Ignacio Lopez; Pelecanos Jason; Shi Jin; Wang Quan
Publication venue: Technical Disclosure Commons
Publication date: 18/05/2020

The speaker recognition problem is to automatically recognize a person from their voice. The training of a speaker recognition model typically requires a very large training corpus, e.g., multiple voice samples from a very large number of individuals. In the diverse domains of application of speaker recognition, it is often impractical to obtain a training corpus of the requisite size. This disclosure describes techniques that augment utterances, e.g., by cutting, splitting, shuffling, etc., such that the need for collections of raw voice samples from individuals is substantially reduced. In effect, the original model works better on the augmented utterances in the target domain.
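The cutting, splitting, and shuffling operations named in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption for illustration: the function names (`split_segments`, `cut`, `shuffle_segments`), the fixed segment length, and the representation of an utterance as a plain list of samples are all hypothetical and not taken from the disclosure.

```python
import random


def split_segments(samples, segment_len):
    """Split an utterance (a list of samples) into consecutive segments.

    The last segment may be shorter than segment_len.
    """
    return [samples[i:i + segment_len]
            for i in range(0, len(samples), segment_len)]


def cut(samples, length, rng):
    """Return a random contiguous excerpt of the utterance (hypothetical 'cut')."""
    if length >= len(samples):
        return list(samples)
    start = rng.randrange(len(samples) - length + 1)
    return list(samples[start:start + length])


def shuffle_segments(samples, segment_len, rng):
    """Split the utterance, reorder the segments at random, and rejoin them."""
    segments = split_segments(samples, segment_len)
    rng.shuffle(segments)
    return [sample for segment in segments for sample in segment]


# Usage: derive two augmented variants from one toy utterance.
rng = random.Random(0)            # seeded so the example is reproducible
utterance = list(range(10))       # stand-in for audio samples or frames
shuffled = shuffle_segments(utterance, 3, rng)
excerpt = cut(utterance, 4, rng)
```

In practice the segments would more likely be frames or word-level chunks than raw samples; the sketch only shows that each operation yields a new utterance from the same speaker without collecting new recordings.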

Technical Disclosure Commons

Combination strategies for a factor analysis phone-conditioned speaker verification system

Authors: Kajarekar Sachin; Pelecanos Jason; Scheffer Nicholas; Vogt Robert
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2009

This work aims to take advantage of recent developments in joint factor analysis (JFA) in the context of a phonetically conditioned GMM speaker verification system. Previous work has shown performance advantages through phonetic conditioning, but this has not been shown to date with the JFA framework. Our focus is particularly on strategies for combining the phone-conditioned systems. We show that the classic fusion of the scores is suboptimal when using multiple GMM systems. We investigate several combination strategies in the model space, and demonstrate improvement over score-level combination as well as over a non-phonetic baseline system. This work was conducted during the 2008 CLSP Workshop at Johns Hopkins University.

CiteSeerX

Crossref

Queensland University of Technology ePrints Archive

Feature Warping for Robust Speaker Verification

Authors: Pelecanos Jason; Sridharan Sridha
Publication venue: European Speech Communication Association
Publication date: 01/01/2001

We propose a novel feature mapping approach that is robust to channel mismatch, additive noise and, to some extent, non-linear effects attributed to handset transducers. These adverse effects can distort the short-term distribution of the speech features. Some methods have addressed this issue by conditioning the variance of the distribution, but not to the extent of conforming the speech statistics to a target distribution. The proposed target mapping method warps the distribution of a cepstral feature stream to a standardised distribution over a specified time interval. We evaluate a number of the enhancement methods for speaker verification, and compare them against a Gaussian target mapping implementation. Results indicate improvements of the warping technique over a number of methods such as Cepstral Mean Subtraction (CMS), modulation spectrum processing, and short-term windowed CMS and variance normalisation.
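The idea of warping a cepstral feature stream to a Gaussian target over a time interval can be sketched as a rank-based mapping inside a sliding window. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `warp_feature_stream`, the window length, the shortened windows at the stream edges, the tie handling, and the rank-to-probability rule `(n + 0.5 - rank) / n` are all choices made here for the example.

```python
from statistics import NormalDist


def warp_feature_stream(values, window=5):
    """Warp one feature stream (a list of floats) toward a standard normal.

    For each frame, the value is ranked among the frames in a window
    centred on it, and the rank is mapped through the inverse normal CDF.
    """
    std_normal = NormalDist()          # target distribution: mean 0, std 1
    half = window // 2
    warped = []
    for t, x in enumerate(values):
        # Assumption: the window is simply shortened at the stream edges.
        lo = max(0, t - half)
        hi = min(len(values), t + half + 1)
        segment = values[lo:hi]
        n = len(segment)
        # Rank 1 is the largest value in the window; ties share the best rank.
        rank = 1 + sum(1 for v in segment if v > x)
        # The probability stays strictly inside (0, 1), so inv_cdf is defined.
        warped.append(std_normal.inv_cdf((n + 0.5 - rank) / n))
    return warped


# Usage: a stream and a scaled, shifted copy of it warp to the same output.
stream = [0.3, -1.2, 2.5, 0.9, -0.4, 1.7]
distorted = [2.0 * v + 3.0 for v in stream]
same = warp_feature_stream(stream) == warp_feature_stream(distorted)
```

Because only the rank of each value within its window matters, any monotonically increasing distortion of the stream (a gain change or a constant offset, for instance) leaves the warped output unchanged, which is one way to see why such a mapping can be robust to linear channel effects.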

CiteSeerX

Queensland University of Technology ePrints Archive

Rapid Channel Compensation for Speaker Verification in the NIST 2000 Speaker Recognition Evaluation

Authors: Pelecanos Jason; Sridharan Sridha
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2001

Queensland University of Technology ePrints Archive

Unsupervised Evaluation of Speaker Verification Systems

Authors: Brummer Niko; Pelecanos Jason
Publication venue: European Speech Communication Association
Publication date: 01/01/2001

A method for blind estimation of DET curves for speaker verification systems is proposed. Verification error probabilities are estimated on a database where speaker identities are unknown. The database must provide a set of impostor-only tests as well as a set of mixed impostor and target tests. This method is tested on 9 speaker verification systems that were scored on the NIST 2000 database. Good DET estimates are obtained for systems with low error rates, while poorer estimates are obtained for systems with high error rates.

CiteSeerX

Queensland University of Technology ePrints Archive

Revisiting Carl Bildt's Impostor: Would a Speaker Verification System Foil Him?

Authors: Pelecanos Jason; Sullivan Kirk
Publication venue: Springer-Verlag London Ltd
Publication date: 01/01/2001

Queensland University of Technology ePrints Archive

Within-session variability modelling for factor analysis speaker verification

Authors: Kajarekar Sachin; Pelecanos Jason; Scheffer Nicholas; Sridharan Sridha; Vogt Robert
Publication venue: International Speech Communication Association
Publication date: 01/01/2009

This work presents an extended Joint Factor Analysis (JFA) model including explicit modelling of unwanted within-session variability. The goals of the proposed extended JFA model are to improve verification performance with short utterances by compensating for the effects of limited or imbalanced phonetic coverage, and to produce a flexible JFA model that is effective over a wide range of utterance lengths without adjusting model parameters such as retraining session subspaces. Experimental results on the 2006 NIST SRE corpus demonstrate the flexibility of the proposed model by providing competitive results over a wide range of utterance lengths without retraining and also yielding modest improvements in a number of conditions over the current state of the art.

Queensland University of Technology ePrints Archive