Exploring sub-band cepstral distances for more robust speaker classification

Abstract

This paper presents the first of two-part exploration into the potential of parametric cepstral distance (PCD) as a forensic voice comparison feature, based on Japanese vowel data collected from 306 male native speakers under microphone and mobile transmission conditions. The behaviours of PCDs were closely examined by altering sub-band settings, and we found the behaviour of PCDs to correspond well to what is known about formants, which suggests that PCDs are relatable to articulatory gestures. Comparison between sub-band and full-band PCD revealed that limiting the band range to a specific frequency region makes the feature more robust against channel mismatch, encouraging further examination of this potential feature.The work presented here was partly supported by JSPS KAKENHI Grant Number JP18H01671, JP25350488

    Similar works