The Codecfake Dataset and Countermeasures for the Universally Detection
  of Deepfake Audio

Cheng, Haonan; Fu, Ruibo; Liu, Yukun; Lu, Yi; Qi, Xin; Sun, Yi; Tao, Jianhua; Wang, Xiaopeng; Wang, Zhiyong; Wen, Zhengqi; Xie, Yuankun; Ye, Long

Authors: Haonan Cheng, Ruibo Fu, Yukun Liu, Yi Lu, Xin Qi, Yi Sun, Jianhua Tao, Xiaopeng Wang, Zhiyong Wang, Zhengqi Wen, Yuankun Xie, Long Ye
Publication date: 15 May 2024

Abstract

With the proliferation of Audio Language Model (ALM) based deepfake audio, there is an urgent need for generalized detection methods. ALM-based deepfake audio is currently widespread, highly deceptive, and versatile in type, posing a significant challenge to current audio deepfake detection (ADD) models trained solely on vocoded data. To effectively detect ALM-based deepfake audio, we focus on the mechanism of ALM-based audio generation: the conversion from neural codec to waveform. We first construct the Codecfake dataset, an open-source, large-scale dataset focused on ALM-based audio detection that includes 2 languages, over 1M audio samples, and various test conditions. As a countermeasure, to achieve universal detection of deepfake audio and to tackle the domain ascent bias issue of the original SAM, we propose the CSAM strategy to learn domain-balanced and generalized minima. In our experiments, we first demonstrate that an ADD model trained on the Codecfake dataset can effectively detect ALM-based audio. Furthermore, our proposed generalization countermeasure yields the lowest average Equal Error Rate (EER) of 0.616% across all test conditions compared to baseline models. The dataset and associated code are available online.
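
For readers unfamiliar with the metric, the Equal Error Rate reported in the abstract is the operating point at which the false acceptance rate (fake audio accepted as real) equals the false rejection rate (real audio rejected as fake). The following is a minimal sketch of how an EER might be computed from detector scores; it is not the authors' evaluation code. The function name `equal_error_rate`, the score convention (higher score means more likely real), and the label encoding (1 = real, 0 = fake) are assumptions made for illustration.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by scanning every observed score as a threshold.

    Assumed conventions (not taken from the paper):
      - scores: higher means "more likely real (bona fide)"
      - labels: 1 = real, 0 = fake
    """
    real = [s for s, y in zip(scores, labels) if y == 1]
    fake = [s for s, y in zip(scores, labels) if y == 0]
    if not real or not fake:
        raise ValueError("need at least one real and one fake sample")

    best_gap, eer = None, None
    for t in sorted(set(scores)):
        # False acceptance rate: fake samples scored at or above the threshold.
        far = sum(s >= t for s in fake) / len(fake)
        # False rejection rate: real samples scored below the threshold.
        frr = sum(s < t for s in real) / len(real)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores: one real sample (0.4) scores below one fake sample (0.6).
print(equal_error_rate([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

With only a handful of scores the two error rates may never cross exactly, so this sketch returns their average at the closest threshold; evaluation toolkits typically interpolate instead.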

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2405.04880

Last time updated on 08/12/2024