Subband modeling for spoofing detection in automatic speaker verification
Spectrograms - time-frequency representations of audio signals - have found
widespread use in neural network-based spoofing detection. While deep models
are trained on the fullband spectrum of the signal, we argue that not all
frequency bands are useful for these tasks. In this paper, we systematically
investigate the impact of different subbands and their importance on replay
spoofing detection on two benchmark datasets: ASVspoof 2017 v2.0 and ASVspoof
2019 PA. We propose a joint subband modelling framework that employs n
different sub-networks to learn subband-specific features. Their outputs are
then combined and passed to a classifier, and the weights of the whole network
are updated during training. Our findings on the ASVspoof 2017 dataset suggest that the
most discriminative information appears to be in the first and the last 1 kHz
frequency bands, and the joint model trained on these two subbands shows the
best performance, outperforming the baselines by a large margin. However, these
findings do not generalise to the ASVspoof 2019 PA dataset. This suggests that
the datasets available for training these models do not reflect real-world
replay conditions, and points to a need for careful design of datasets for
training replay spoofing countermeasures.

Comment: Accepted to the Speaker Odyssey (The Speaker and Language Recognition
Workshop) 2020 conference. 8 page
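
The joint subband framework described in the abstract can be illustrated with a
small forward-pass sketch. This is not the authors' architecture: the
SubNetwork class (mean pooling plus one linear layer), the 16 kHz sampling
rate, the 257-bin spectrogram, and the feature size are all assumptions made
for illustration. Only the overall idea (one sub-network per subband, features
concatenated and passed to a classifier) comes from the abstract. Training is
omitted; in a real framework all weights would be updated together by
backpropagation.

```python
import numpy as np


def subband(spec, sample_rate, f_lo, f_hi):
    """Return the rows of a (freq_bins, frames) spectrogram within [f_lo, f_hi] Hz."""
    n_bins = spec.shape[0]
    # Assumption: bins are linearly spaced from 0 Hz to the Nyquist frequency.
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return spec[mask, :]


class SubNetwork:
    """Hypothetical stand-in for one sub-network: time pooling, linear layer, ReLU."""

    def __init__(self, n_bins, n_features, rng):
        self.W = rng.standard_normal((n_features, n_bins)) * 0.01
        self.b = np.zeros(n_features)

    def __call__(self, band):
        pooled = band.mean(axis=1)  # average over frames
        return np.maximum(0.0, self.W @ pooled + self.b)


def joint_forward(spec, sample_rate, bands, subnets, w_clf, b_clf):
    """Run each sub-network on its subband, concatenate features, and classify."""
    feats = [
        net(subband(spec, sample_rate, lo, hi))
        for net, (lo, hi) in zip(subnets, bands)
    ]
    joint = np.concatenate(feats)
    logit = w_clf @ joint + b_clf
    return 1.0 / (1.0 + np.exp(-logit))  # sigmoid score in (0, 1)


# Example: the first and last 1 kHz bands of an assumed 16 kHz signal.
rng = np.random.default_rng(0)
sample_rate = 16000                      # assumed sampling rate
spec = rng.random((257, 100))            # placeholder spectrogram, not real data
bands = [(0.0, 1000.0), (7000.0, 8000.0)]
subnets = [
    SubNetwork(subband(spec, sample_rate, lo, hi).shape[0], 8, rng)
    for lo, hi in bands
]
w_clf = rng.standard_normal(16) * 0.01   # 2 subbands x 8 features each
score = joint_forward(spec, sample_rate, bands, subnets, w_clf, 0.0)
```

Changing the bands list is enough to compare other subband combinations under
the same sketch; the number of sub-networks n is simply len(bands).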