Improving Sound Event Detection In Domestic Environments Using Sound Separation
Performing sound event detection on real-world recordings often implies
dealing with overlapping target sound events and non-target sounds, also
referred to as interference or noise. Until now, these problems have mainly
been tackled at the classifier level. We propose to use sound separation as a
pre-processing step for sound event detection. In this paper we start from a
sound separation model trained on the Free Universal Sound Separation dataset
and the DCASE 2020 task 4 sound event detection baseline. We explore different
methods to combine separated sound sources and the original mixture within the
sound event detection system. Furthermore, we investigate the impact of
adapting the sound separation model to the sound event detection data on both
the sound separation and the sound event detection performance.
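One way to combine separated sources with the original mixture, as the abstract describes, is late fusion of frame-level class probabilities. The sketch below is a minimal illustration under assumed shapes and names (`combine_predictions`, the fixed weighting); the paper's actual combination methods may differ.

```python
import numpy as np

def combine_predictions(mixture_probs, source_probs_list, weight=0.5):
    """Late-fusion sketch: average frame-level class probabilities
    from the original mixture with those from separated sources.

    mixture_probs: (frames, classes) array from SED on the mixture.
    source_probs_list: list of (frames, classes) arrays, one per
        separated source.  All names here are illustrative, not the
        paper's actual API.
    """
    source_avg = np.mean(source_probs_list, axis=0)
    return weight * mixture_probs + (1.0 - weight) * source_avg

# Toy usage: 3 frames, 2 classes, 2 separated sources.
mix = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
srcs = [np.array([[0.8, 0.2], [0.4, 0.6], [0.6, 0.4]]),
        np.array([[1.0, 0.0], [0.0, 1.0], [0.4, 0.6]])]
combined = combine_predictions(mix, srcs)
```

Other combination strategies (feeding separated sources as extra input channels, or remixing them before classification) would fit the same interface.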
Sampling-Frequency-Independent Universal Sound Separation
This paper proposes a universal sound separation (USS) method capable of
handling untrained sampling frequencies (SFs). USS aims to separate
arbitrary sources of different types and can be the key technique to realize a
source separator that can be universally used as a preprocessor for any
downstream tasks. To realize a universal source separator, there are two
essential properties: universalities with respect to source types and recording
conditions. The former property has been studied in the USS literature, which
has greatly increased the number of source types that can be handled by a
single neural network. However, the latter property (e.g., SF) has received
less attention despite its necessity. Since the SF varies widely depending on
the downstream tasks, the universal source separator must handle a wide variety
of SFs. In this paper, to encompass the two properties, we propose an
SF-independent (SFI) extension of a computationally efficient USS network,
SuDoRM-RF. The proposed network uses our previously proposed SFI convolutional
layers, which can handle various SFs by generating convolutional kernels in
accordance with an input SF. Experiments show that signal resampling can
degrade USS performance and that the proposed method works more consistently
than signal-resampling-based methods across various SFs.
Comment: Submitted to ICASSP202
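The idea of generating a convolutional kernel in accordance with the input SF can be illustrated by sampling a fixed continuous-time prototype filter at whatever rate the input uses. This is a simplified stand-in for the SFI layer, not the SuDoRM-RF extension itself; the windowed-sinc prototype and every name below are assumptions.

```python
import numpy as np

def make_sfi_kernel(sf, num_taps=33, cutoff_hz=4000.0):
    """Sample a continuous-time lowpass prototype (windowed sinc)
    at sampling frequency `sf` to obtain a discrete kernel.

    The analog filter is fixed; only the discrete kernel changes
    with the SF, which mirrors the SFI idea in spirit.  This is an
    illustrative sketch, not the paper's parameterized layer.
    """
    # Tap positions in seconds, centered around zero.
    t = (np.arange(num_taps) - (num_taps - 1) / 2) / sf
    kernel = 2.0 * cutoff_hz * np.sinc(2.0 * cutoff_hz * t)
    kernel *= np.hamming(num_taps)       # reduce truncation ripple
    return kernel / np.sum(kernel)       # unit DC gain

# The same analog prototype yields different discrete kernels
# at different sampling frequencies.
k16 = make_sfi_kernel(16000.0)
k44 = make_sfi_kernel(44100.0)
```

Because the kernel is regenerated per SF rather than fixed at training time, no signal resampling step is needed before the network.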
Audio Prompt Tuning for Universal Sound Separation
Universal sound separation (USS) is a task to separate arbitrary sounds from
an audio mixture. Existing USS systems are capable of separating arbitrary
sources, given a few examples of the target sources as queries. However,
separating arbitrary sounds with a single system is challenging, and the
robustness is not always guaranteed. In this work, we propose audio prompt
tuning (APT), a simple yet effective approach to enhance existing USS systems.
Specifically, APT improves the separation performance of specific sources
through training a small number of prompt parameters with limited audio
samples, while maintaining the generalization of the USS model by keeping its
parameters frozen. We evaluate the proposed method on MUSDB18 and ESC-50
datasets. Compared with the baseline model, APT improves the
signal-to-distortion ratio by 0.67 dB and 2.06 dB, respectively, when using
the full training set of each dataset. Moreover, APT with only 5 audio samples
even outperforms the baseline system trained on the full ESC-50 training data,
indicating the great potential of few-shot APT.
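The core mechanism described above, training a small number of prompt parameters while the USS model's weights stay frozen, can be sketched as prepending trainable prompt tokens to the audio embedding sequence. Everything below (the linear stand-in for the separation network, the shapes) is illustrative, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "separation model" weights: a stand-in linear layer.
W_frozen = rng.standard_normal((8, 8))

# Small trainable prompt: a few embedding vectors prepended to the
# input sequence.  Only these would receive gradient updates.
prompt = np.zeros((4, 8))                    # 4 prompt tokens, dim 8
audio_embed = rng.standard_normal((16, 8))   # 16 audio frames

def forward(prompt, audio_embed):
    """Prepend prompt tokens to the audio embeddings, then apply the
    frozen layer.  Illustrative only: the real APT system conditions
    a full USS network, not a single matrix multiply."""
    x = np.concatenate([prompt, audio_embed], axis=0)  # (4 + 16, 8)
    return x @ W_frozen

out = forward(prompt, audio_embed)  # shape (20, 8)

# The trainable budget is tiny compared with the frozen model,
# which is what preserves the USS model's generalization.
assert prompt.size < W_frozen.size
```

Training would backpropagate a separation loss into `prompt` alone, leaving `W_frozen` untouched.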
Separating Invisible Sounds Toward Universal Audiovisual Scene-Aware Sound Separation
The audio-visual sound separation field assumes visible sources in videos,
but this excludes invisible sounds beyond the camera's view. Current methods
struggle with such sounds lacking visible cues. This paper introduces a novel
"Audio-Visual Scene-Aware Separation" (AVSA-Sep) framework. It includes a
semantic parser for visible and invisible sounds and a separator for
scene-informed separation. AVSA-Sep successfully separates both sound types,
with joint training and cross-modal alignment enhancing effectiveness.
Comment: Accepted at ICCV 2023 - AV4D, 4 figures, 3 tables