In this paper, we aim to address the problem of channel robustness in speech
countermeasure (CM) systems, which are used to distinguish synthetic speech
from natural human speech. On the basis of two hypotheses, we propose an
approach for perturbing phase information during the training of time-domain CM
systems. Communication networks often employ lossy compression codecs that
encode only magnitude information, thereby heavily altering phase
information. Also, state-of-the-art CM systems rely on phase information to
identify spoofed speech. Thus, we believe the information loss in the phase
domain induced by lossy compression codecs degrades performance on unseen
channels. We first establish the dependence of time-domain CM systems on
phase information by perturbing phase during evaluation, which causes strong
degradation. We then demonstrate that perturbing phase during training leads
to a significant performance improvement, whereas perturbing magnitude leads to
further degradation.

Comment: 5 pages; Accepted to INTERSPEECH 202
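
To make the idea of phase perturbation concrete, the following is a minimal sketch of one way to alter the phase of a time-domain waveform while leaving its magnitude spectrum unchanged. It is an illustration under stated assumptions, not the procedure used in the paper: the abstract does not specify the perturbation, so the full-signal FFT, the uniform phase noise, the function name `perturb_phase`, and the parameter `max_shift` are all hypothetical choices.

```python
import numpy as np


def perturb_phase(wave, max_shift=np.pi, rng=None):
    """Add random phase offsets per frequency bin, keeping magnitude fixed.

    Hypothetical illustration: uniform noise in [-max_shift, max_shift]
    applied to the FFT of the whole signal (an assumption, not the paper's
    documented method).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = wave.shape[-1]

    # Split the one-sided spectrum into magnitude and phase.
    spec = np.fft.rfft(wave)
    magnitude = np.abs(spec)
    phase = np.angle(spec)

    # Draw one random phase offset per frequency bin.
    noise = rng.uniform(-max_shift, max_shift, size=spec.shape)
    # The DC bin (and the Nyquist bin for even lengths) must stay real for a
    # real-valued signal, so their phase is left untouched.
    noise[..., 0] = 0.0
    if n % 2 == 0:
        noise[..., -1] = 0.0

    # Recombine the original magnitude with the perturbed phase.
    perturbed = magnitude * np.exp(1j * (phase + noise))
    return np.fft.irfft(perturbed, n=n)


# Example: perturb one second of noise sampled at 16 kHz.
x = np.random.default_rng(0).standard_normal(16000)
y = perturb_phase(x, rng=np.random.default_rng(1))
```

In a training pipeline, a function of this kind would typically be applied to each waveform before it is fed to the CM model, so that the model sees the same magnitude content with varying phase.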