Fake audio detection is a growing concern and some relevant datasets have
been designed for research. However, there is no standard public Chinese
dataset under complex conditions. In this paper, we aim to fill this gap and
design a Chinese fake audio detection dataset (CFAD) for studying more
generalized detection methods. Twelve mainstream speech-generation techniques
are used to generate fake audio. To simulate real-life scenarios, three
noise datasets are selected for adding noise at five different signal-to-noise
ratios, and six codecs are considered for audio transcoding (format
conversion). The CFAD dataset can be used not only for fake audio detection but
also for identifying the algorithms that generated fake utterances, for audio
forensics. Baseline results are presented with analysis. The results show that
generalizable fake audio detection remains challenging. The CFAD dataset is
publicly available at: https://zenodo.org/record/8122764.
Comment: FAD renamed as CFA
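
To make the noise-adding step concrete, the following is a minimal sketch of mixing a noise clip into an utterance at a target signal-to-noise ratio. It is not the authors' pipeline: the function name `mix_at_snr`, the tiling of short noise clips, and the SNR values in the demo are all assumptions chosen for illustration, since the abstract does not list the five ratios used.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; not taken from the CFAD code.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat the noise clip if it is shorter than the utterance, then crop.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")

    # SNR_dB = 10 * log10(p_speech / (scale^2 * p_noise)), solved for scale.
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    utterance = np.sin(2 * np.pi * 220.0 * t)   # stand-in for a speech clip
    noise_clip = rng.standard_normal(4000)      # stand-in for a noise clip

    # Illustrative SNR levels only; the dataset's actual values are not given here.
    for snr_db in (0, 5, 10, 15, 20):
        noisy = mix_at_snr(utterance, noise_clip, snr_db)
        print(snr_db, noisy.shape)
```

The scale factor follows directly from the definition of SNR as the ratio of signal power to noise power, so the same function covers any target ratio.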