17 research outputs found

Compensation of Nuisance Factors for Speaker and Language Recognition

Author: CASTALDO F.
COLIBRO D
DALMASSO E
LAFACE P
VAIR C
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

The variability of the channel and environment is one of the most important factors affecting the performance of text-independent speaker verification systems. The best techniques for channel compensation are model-based. Most of them have been proposed for Gaussian mixture models, while in the feature domain blind channel compensation is usually performed. The aim of this work is to explore techniques that allow more accurate intersession compensation in the feature domain. Compensating the features rather than the models has the advantage that the transformed parameters can be used with models of a different nature and complexity and for different tasks. In this paper, we evaluate the effects of the compensation of the intersession variability obtained by means of the channel factors approach. In particular, we compare channel variability modeling in the usual Gaussian mixture model domain with our proposed feature-domain compensation technique. We show that the two approaches lead to similar results on the NIST 2005 Speaker Recognition Evaluation data with a reduced computation cost. We also report the results of a system, based on the intersession compensation technique in the feature space, that was among the best participants in the NIST 2006 Speaker Recognition Evaluation. Moreover, we show how we obtained significant performance improvement in language recognition by estimating and compensating, in the feature domain, the distortions due to interspeaker variability within the same language. Index Terms—Factor anal

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Politecnico di Torino System for the 2007 NIST Language Recognition Evaluation

Author: Castaldo Fabio
Colibro D.
Dalmasso E.
Laface Pietro
Vair C.
Publication venue: ISCA
Publication date: 01/01/2008

PORTO Publications Open Repository TOrino

Language Recognition Using Language Factors

Author: Castaldo Fabio
Colibro D.
Cumani Sandro
Laface Pietro
Publication venue: ISCA
Publication date: 01/01/2009

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Politecnico di Torino System for the 2007 NIST Language Recognition Evaluation

Author: CASTALDO FABIO
COLIBRO D
DALMASSO E
LAFACE Pietro
VAIR C.
Publication venue: ISCA
Publication date: 01/01/2008

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Nuance - Politecnico di Torino's 2012 NIST Speaker Recognition Evaluation System

Author: Colibro D.
Cumani Sandro
Farrell K.
Karvitsky G.
Krause N.
Laface Pietro
Vair C.
Publication venue: International Speech Communication Association
Publication date: 01/01/2013

This paper describes the Nuance-Politecnico di Torino (NPT) speaker recognition system submitted to the NIST SRE12 evaluation campaign. Included are the results of post-evaluation tests, focusing on the analysis of the effects of score normalization and condition-dependent calibration. The submitted system combines the results of five acoustic recognizers, all based on Gaussian Mixture Models (GMMs). Each system has its own front end, with features differing by their type and dimension. We illustrate the process of development data selection and configuration of state-of-the-art technology, which contributed to obtaining good performance in all the test conditions proposed in this evaluation.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Language Recognition Using Language Factors

Author: CASTALDO F
COLIBRO D
CUMANI S
LAFACE P.
Publication venue: ISCA

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Loquendo - Politecnico di Torino’s 2008 NIST Speaker Recognition Evaluation System

Author: CASTALDO F
COLIBRO D
DALMASSO E
LAFACE P.
VAIR C
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Nuance - Politecnico di Torino’s 2012 NIST Speaker Recognition Evaluation System

Author: Colibro D.
Cumani S.
Farrell K.
Karvitsky G.
Krause N.
Laface P.
Vair C.
Publication venue: 'International Speech Communication Association'

This paper describes the Nuance–Politecnico di Torino (NPT) speaker recognition system submitted to the NIST SRE12 evaluation campaign. Included are the results of post-evaluation tests, focusing on the analysis of the effects of score normalization and condition-dependent calibration. The submitted system combines the results of five acoustic recognizers, all based on Gaussian Mixture Models (GMMs). Each system has its own front end, with features differing by their type and dimension. We illustrate the process of development data selection and configuration of state-of-the-art technology, which contributed to obtaining good performance in all the test conditions proposed in this evaluation.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Comparison of Large-scale SVM Training Algorithms for Language Recognition

Author: Castaldo Fabio
Colibro D.
Cumani Sandro
Laface Pietro
Vair C.
Publication venue: ISCA
Publication date: 01/01/2010

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Audio segmentation-by-classification approach based on factor analysis in broadcast news domain

Author: Castán D.
Lleida E.
Miguel A.
Ortega A.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

This paper studies a novel audio segmentation-by-classification approach based on factor analysis. The proposed technique compensates for within-class variability by using class-dependent factor loading matrices and obtains the scores by computing the log-likelihood ratio between the class model and a non-class model over fixed-length windows. Afterwards, these scores are smoothed by different back-end systems to yield longer contiguous segments of the same class. Unlike previous solutions, our proposal does not make use of specific acoustic features and does not need a hierarchical structure. The proposed method is applied to segment and classify audio from TV shows into five different acoustic classes: speech, music, speech with music, speech with noise, and others. Compared with a hierarchical system that uses specific acoustic features, the technique achieves a significant error reduction.

Crossref

Repositorio Universidad de Zaragoza

Springer - Publisher Connector