Focus on the Sound around You: Monaural Target Speaker Extraction via
  Distance and Speaker Information

Chen, Jun; Dinkel, Heinrich; Lin, Jiuxin; Wang, Peng; Wang, Yongqing; Wang, Yujun; Wu, Zhiyong; Yan, Zhiyong; Zhang, Junbo

Publication date: 28 June 2023

Abstract

Target speaker extraction (TSE) has yielded outstanding performance in certain application scenarios for speech enhancement and source separation. However, obtaining auxiliary speaker-related information remains challenging in noisy environments with significant reverberation. Inspired by the recently proposed distance-based sound separation, we propose the near sound (NS) extractor, which leverages distance information for TSE to reliably extract speaker information without requiring prior speaker enrollment, an approach we call speaker embedding self-enrollment (SESE). Full- and sub-band modeling is introduced to improve the NS-Extractor's adaptability to environments with significant reverberation. Experimental results in several cross-dataset settings demonstrate the effectiveness of our improvements and the excellent performance of the proposed NS-Extractor in different application scenarios.

Comment: Accepted by InterSpeech202

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2306.16241

Last time updated on 02/07/2023