Noise-robust text-dependent speaker identification using cochlear models

Afshar, Saeed (R17598); Islam, Md. Atiqul (S34468); Monk, Travis (R19418); Schaik, Andre van (R16638); Xu, Ying (R19792)

Authors: Saeed Afshar, Md. Atiqul Islam, Travis Monk, Andre van Schaik, Ying Xu
Publication date: 1 January 2022
Publisher: Acoustical Society of America (ASA)

Abstract

One challenging issue in speaker identification (SID) is to achieve noise-robust performance. Humans can accurately identify speakers, even in noisy environments. We can leverage our knowledge of the function and anatomy of the human auditory pathway to design SID systems that achieve better noise-robust performance than conventional approaches. We propose a text-dependent SID system based on a real-time cochlear model called cascade of asymmetric resonators with fast-acting compression (CARFAC). We investigate the SID performance of CARFAC on signals corrupted by noise of various types and levels. We compare its performance with conventional auditory feature generators including mel-frequency cepstrum coefficients, frequency domain linear predictions, as well as another biologically inspired model called the auditory nerve model. We show that CARFAC outperforms other approaches when signals are corrupted by noise. Our results are consistent across datasets, types and levels of noise, different speaking speeds, and back-end classifiers. We show that the noise-robust SID performance of CARFAC is largely due to its nonlinear processing of auditory input signals. Presumably, the human auditory system achieves noise-robust performance via inherent nonlinearities as well.
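
The abstract refers to signals corrupted by noise at various levels but does not say how the corruption was done. A common way to set a noise level is to scale a noise recording so that the mixture reaches a target signal-to-noise ratio (SNR) in decibels. The sketch below illustrates that general technique only. It is not taken from the paper, and the function names `mix_at_snr` and `measured_snr_db` are hypothetical helpers introduced here.

```python
import math
import random


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so the mixture has the requested SNR in dB.

    Assumption: "noise level" is expressed as an SNR in dB over the whole
    utterance; the paper's actual procedure may differ.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean = sum(x * x for x in clean) / len(clean)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # Target noise power is p_clean / 10^(snr_db / 10); the amplitude gain
    # is the square root of the required power ratio.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [x + gain * n for x, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """Measure the SNR of a mixture, given the clean signal it was built from."""
    residual = [y - x for x, y in zip(clean, noisy)]
    p_clean = sum(x * x for x in clean) / len(clean)
    p_res = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(p_clean / p_res)


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-ins for real recordings: a 440 Hz tone at 16 kHz and white noise.
    tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
    white = [rng.gauss(0.0, 1.0) for _ in range(1600)]
    for level in (20, 10, 0):
        mixture = mix_at_snr(tone, white, level)
        print(level, round(measured_snr_db(tone, mixture), 3))
```

Sweeping `snr_db` over a range of values, and swapping the white noise for other recorded noise types, would give the kind of test conditions the abstract describes.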

