15 research outputs found

Loquendo - Politecnico di Torino’s 2008 NIST Speaker Recognition Evaluation System

Author: CASTALDO F
COLIBRO D
DALMASSO E
LAFACE P.
VAIR C
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Analysis of Large-Scale SVM Training Algorithms for Language and Speaker Recognition

Author: Cumani Sandro
Laface Pietro
Publication venue: Piscataway, N.J. : IEEE
Publication date: 01/01/2012

This paper compares a set of large-scale support vector machine (SVM) training algorithms for language and speaker recognition tasks. We analyze five approaches for training phonetic and acoustic SVM models for language recognition. We compare the performance of these approaches as a function of the training time required by each of them to reach convergence, and we discuss their scalability towards large corpora. Two of these algorithms can be used in speaker recognition to train an SVM that classifies pairs of utterances as either belonging to the same speaker or to two different speakers. Our results show that the accuracy of these algorithms is asymptotically equivalent, but they have different behavior with respect to the time required to converge. Some of these algorithms not only scale linearly with the training set size, but are also able to give their best results after just a few iterations. State-of-the-art performance has been obtained in the female subset of the NIST 2010 Speaker Recognition Evaluation extended core test using a single SVM system

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Comparison of speaker recognition approaches for real applications

Author: Batzu P.D.
Colibro D.
Cumani Sandro
Laface Pietro
Vair C.
Vasilakakis Vasileios
Publication venue: ISCA
Publication date: 01/01/2011

PORTO Publications Open Repository TOrino

Comparison of Large-scale SVM Training Algorithms for Language Recognition

Author: Castaldo Fabio
Colibro D.
Cumani Sandro
Laface Pietro
Vair C.
Publication venue: ISCA
Publication date: 01/01/2010

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Comparison of speaker recognition approaches for real applications

Author: Batzu P.D.
Colibro D.
Cumani S.
Laface P.
Vair C.
Vasilakakis V.
Publication venue: ISCA

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Memory and computation trade-offs for efficient i-vector extraction

Author: Cumani Sandro
Laface Pietro
Publication venue: IEEE - INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication date: 01/01/2013

This work aims at reducing the memory demand of the data structures that are usually pre-computed and stored for fast computation of the i-vectors, a compact representation of spoken utterances that is used by most state-of-the-art speaker recognition systems. We propose two new approaches allowing accurate i-vector extraction but requiring less memory, showing their relations with the standard computation method introduced for eigenvoices, and with the recently proposed fast eigen-decomposition technique. The first approach computes an i-vector in a Variational Bayes (VB) framework by iterating the estimation of one sub-block of i-vector elements at a time, keeping fixed all the others, and can obtain i-vectors as accurate as the ones obtained by the standard technique but requiring only 25% of its memory. The second technique is based on the Conjugate Gradient solution of a linear system, which is accurate and uses even less memory, but is slower than the VB approach. We analyze and compare the time and memory resources required by all these solutions, which are suited to different applications, and we show that it is possible to get accurate results greatly reducing memory demand compared with the standard solution at almost the same speed
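The Conjugate Gradient idea mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard i-vector posterior mean, which solves (I + sum_c N_c T_c' S_c^-1 T_c) w = T' S^-1 f with diagonal covariances, and the function and parameter names (`ivector_cg`, `T`, `sigma_inv`, `N`, `f`) are hypothetical. The memory saving in this sketch comes from never forming or storing the per-component matrices T_c' S_c^-1 T_c; only matrix-vector products with T are used.

```python
import numpy as np

def ivector_cg(T, sigma_inv, N, f, tol=1e-10, max_iter=200):
    """Estimate an i-vector by Conjugate Gradient (illustrative sketch).

    T         : (C*F, M) total variability matrix (assumed layout)
    sigma_inv : (C*F,)   inverse diagonal covariances
    N         : (C,)     zero-order statistics per Gaussian component
    f         : (C*F,)   centered first-order statistics
    """
    CF, M = T.shape
    F = CF // N.shape[0]
    # Occupation count repeated for every feature dimension of each component.
    n_exp = np.repeat(N, F)

    def apply_L(v):
        # Computes (I + T' diag(n_exp * sigma_inv) T) v without building the matrix.
        return v + T.T @ (n_exp * sigma_inv * (T @ v))

    b = T.T @ (sigma_inv * f)
    w = np.zeros(M)
    r = b.copy()          # residual for the zero initial guess
    p = r.copy()          # first search direction
    rs = r @ r
    threshold = tol * np.sqrt(b @ b)
    for _ in range(max_iter):
        if np.sqrt(rs) <= threshold:
            break
        Lp = apply_L(p)
        alpha = rs / (p @ Lp)
        w += alpha * p
        r -= alpha * Lp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return w
```

On a small problem the result can be checked against a direct solve of the explicitly built system; the trade-off, as the abstract notes, is that the iterative solution uses less memory but more time per i-vector.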

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Language Recognition Using Language Factors

Author: CASTALDO F
COLIBRO D
CUMANI S
LAFACE P.
Publication venue: ISCA

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Memory and computation effective approaches for i-vector extraction

Author: Cumani Sandro
Laface Pietro
Vasilakakis Vasileios
Publication venue: ISCA
Publication date: 01/01/2012

This paper focuses on the extraction of i-vectors, a compact representation of spoken utterances that is used by most of the state-of-the-art speaker recognition systems. This work was mainly motivated by the need of reducing the memory demand of the huge data structures that are usually precomputed for fast computation of the i-vectors. We propose a set of new approaches allowing accurate i-vector extraction but requiring less memory, showing their relations with the standard computation method introduced for eigenvoices. We analyze the time and memory resources required by these solutions, which are suited to different fields of application, and we show that it is possible to get accurate results with solutions that reduce both computation time and memory demand compared with the standard solution

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Language Recognition Using Language Factors

Author: Castaldo Fabio
Colibro D.
Cumani Sandro
Laface Pietro
Publication venue: ISCA
Publication date: 01/01/2009

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Memory and computation effective approaches for i-vector extraction

Author: Cumani S.
Laface P.
Vasilakakis V.
Publication venue: ISCA

This paper focuses on the extraction of i-vectors, a compact representation of spoken utterances that is used by most of the state-of-the-art speaker recognition systems. This work was mainly motivated by the need of reducing the memory demand of the huge data structures that are usually precomputed for fast computation of the i-vectors. We propose a set of new approaches allowing accurate i-vector extraction but requiring less memory, showing their relations with the standard computation method introduced for eigenvoices. We analyze the time and memory resources required by these solutions, which are suited to different fields of application, and we show that it is possible to get accurate results with solutions that reduce both computation time and memory demand compared with the standard solution

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)