Supervector Compression Strategies to Speed up I-Vector System Development
The front-end factor analysis (FEFA), an extension of probabilistic principal
component analysis (PPCA) tailored for use with Gaussian mixture models (GMMs), is
currently the prevalent approach to extract compact utterance-level features
(i-vectors) for automatic speaker verification (ASV) systems. Little research
has been conducted comparing FEFA to the conventional PPCA applied to maximum a
posteriori (MAP) adapted GMM supervectors. We study several alternative
methods, including PPCA, factor analysis (FA), and two supervised approaches,
supervised PPCA (SPPCA) and the recently proposed probabilistic partial least
squares (PPLS), to compress MAP-adapted GMM supervectors. The resulting
i-vectors are used in ASV tasks with a probabilistic linear discriminant
analysis (PLDA) back-end. We experiment on two datasets: the telephone
condition of NIST SRE 2010 and the recent VoxCeleb corpus, collected from
YouTube videos of celebrity interviews recorded under varied acoustic and
technical conditions. The results suggest that, in terms
of ASV accuracy, the supervector compression approaches are on a par with FEFA.
The supervised approaches did not result in improved performance. In comparison
to FEFA, we obtained more than a hundred-fold (100x) speedup in the total
variability model (TVM) training using the PPCA and FA supervector compression
approaches.

Comment: To appear in Speaker Odyssey 2018: The Speaker and Language
Recognition Workshop.
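
For concreteness, below is a minimal Python sketch of the compression step
described in the abstract. It assumes supervectors have already been extracted
from MAP-adapted GMMs; scikit-learn's PCA and FactorAnalysis stand in for the
paper's PPCA and FA front-ends, random data replaces real supervectors, and
cosine scoring replaces the PLDA back-end. All dimensions are illustrative.

    import numpy as np
    from sklearn.decomposition import PCA, FactorAnalysis

    rng = np.random.default_rng(0)

    # Hypothetical stand-in data: 300 utterances represented by MAP-adapted
    # GMM supervectors from a 64-component GMM with 20-dimensional features,
    # i.e. 64 * 20 = 1280-dimensional supervectors.
    supervectors = rng.standard_normal((300, 64 * 20))

    # Compress the supervectors to 200-dimensional i-vector-like embeddings.
    # Classical PCA is the maximum-likelihood limit of PPCA; FactorAnalysis
    # implements FA with per-dimension noise variances.
    dim = 200
    ivectors_pca = PCA(n_components=dim).fit_transform(supervectors)
    ivectors_fa = FactorAnalysis(n_components=dim).fit_transform(supervectors)

    # Cosine similarity as a simple stand-in for the PLDA back-end scoring
    # of an enrollment/test trial.
    def score(enroll, test):
        return float(enroll @ test /
                     (np.linalg.norm(enroll) * np.linalg.norm(test)))

    print(score(ivectors_pca[0], ivectors_pca[1]))

Because the compression operates directly on fixed-length supervectors, the
model fit amounts to a single eigendecomposition or EM run over one data
matrix, which is the source of the training speedup reported above.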