Most state-of-the-art speaker recognition systems use a compact representation of spoken utterances referred to as an i-vector. Since the "standard" i-vector extraction procedure requires large memory structures and is relatively slow, new approaches have recently been proposed that obtain either accurate solutions at the expense of an increased computational load, or fast approximate solutions, which are traded for lower memory costs. We propose a new approach particularly useful for applications that need to minimize their memory requirements. Our solution not only dramatically reduces the memory needs for i-vector extraction, but is also fast and accurate compared to recently proposed approaches. Tested on the female part of the tel-tel extended NIST 2010 evaluation trials, our approach substantially improves performance with respect to the fastest but inaccurate eigen-decomposition approach, while using much less memory than other methods.

Cumani, Sandro

Laface, Pietro

PORTO Publications Open Repository TOrino

Factorized Sub-Space Estimation for Fast and Memory Effective I-vector Extraction

Most of the state-of-the-art speaker recognition systems use a compact representation of spoken utterances referred to as i-vectors. Since the "standard" i-vector extraction procedure requires large memory structures and is relatively slow, new approaches have recently been proposed that are able to obtain either accurate solutions at the expense of an increase of the computational load, or fast approximate solutions, which are traded for lower memory costs. We propose a new approach particularly useful for applications that need to minimize their memory requirements. Our solution not only dramatically reduces the storage needs for i-vector extraction, but is also fast. Tested on the female part of the tel-tel extended NIST 2010 evaluation trials, our approach substantially improves the performance with respect to the fastest but inaccurate eigen-decomposition approach, using much less memory than any other known method.

Cumani, Sandro

Laface, Pietro

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

English

Fast and Memory Effective I-Vector Extraction Using a Factorized Sub-Space

Sandro Cumani

Pietro Laface

Crossref

IEEE/ACM Transactions on Audio Speech and Language Processing

Most of the state-of-the-art speaker recognition systems use a compact representation of spoken utterances referred to as i-vector. Since the "standard" i-vector extraction procedure requires large memory structures and is relatively slow, new approaches have recently been proposed that are able to obtain either accurate solutions at the expense of an increase of the computational load, or fast approximate solutions, which are traded for lower memory costs. We propose a new approach particularly useful for applications that need to minimize their memory requirements. Our solution not only dramatically reduces the memory needs for i-vector extraction, but is also fast and accurate compared to recently proposed approaches. Tested on the female part of the tel-tel extended NIST 2010 evaluation trials, our approach substantially improves the performance with respect to the fastest but inaccurate eigen-decomposition approach, using much less memory than other methods.

https://iris.polito.it/retrieve/handle/11583/2513864/60557/2513864.pdf
