The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023

Cheng, Ming; Jiang, Ning; Li, Ming; Lin, Yuke; Qin, Xiaoyi; Wang, Weiqing; Zhao, Guoqing

Publication date: 15 August 2023

Abstract

This paper describes the DKU-MSXF submission to track 4 of the VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC-23). Our system pipeline consists of voice activity detection, clustering-based diarization, overlapped speech detection, and target-speaker voice activity detection (TSVAD), where the output of each stage is fused from 3 sub-models. Finally, we fuse the different clustering-based and TSVAD-based diarization systems using DOVER-Lap and achieve a diarization error rate (DER) of 4.30%, which ranks first on the track 4 challenge leaderboard.
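
For readers unfamiliar with the metric quoted above, DER is commonly defined as the sum of missed speech, false-alarm speech, and speaker confusion, divided by the total reference speech time. The sketch below is an illustrative frame-level version only, not the challenge's scoring tool: the function name `frame_der` is hypothetical, each frame is represented as a set of active speaker labels, and it assumes the hypothesis labels have already been mapped to the reference labels and that no forgiveness collar is applied.

```python
def frame_der(reference, hypothesis):
    """Frame-level diarization error rate (illustrative sketch).

    reference, hypothesis: equal-length lists of sets, one set of active
    speaker labels per frame. Assumes hypothesis labels are already mapped
    to reference labels and that no collar is applied.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    missed = false_alarm = confusion = total = 0
    for ref, hyp in zip(reference, hypothesis):
        n_ref, n_hyp = len(ref), len(hyp)
        n_correct = len(ref & hyp)  # speakers correctly attributed in this frame
        total += n_ref  # reference speech, counted once per active speaker
        missed += max(0, n_ref - n_hyp)  # reference speakers with no hypothesis
        false_alarm += max(0, n_hyp - n_ref)  # hypothesis speakers beyond reference
        confusion += min(n_ref, n_hyp) - n_correct  # matched slots, wrong speaker

    if total == 0:
        raise ValueError("reference contains no speech")
    return (missed + false_alarm + confusion) / total


# Toy example: one confusion, one miss in an overlapped frame, one false alarm.
reference = [{"A"}, {"A"}, {"A", "B"}, {"B"}, set()]
hypothesis = [{"A"}, {"B"}, {"A"}, {"B"}, {"B"}]
print(f"DER = {frame_der(reference, hypothesis):.2%}")
```

Counting overlapped frames once per active speaker is what makes overlapped speech detection matter for this metric: a system that outputs a single speaker in an overlapped region incurs a miss for every additional reference speaker.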

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2308.07595

Last time updated on 18/08/2023