Foreground-Background Ambient Sound Scene Separation
Ambient sound scenes typically comprise multiple short events occurring on
top of a somewhat stationary background. We consider the task of separating
these events from the background, which we call foreground-background ambient
sound scene separation. We propose a deep learning-based separation framework
with a suitable feature normalization scheme and an optional auxiliary network
capturing the background statistics, and we investigate its ability to handle
the great variety of sound classes encountered in ambient sound scenes, which
have often not been seen in training. To do so, we create single-channel
foreground-background mixtures using isolated sounds from the DESED and
Audioset datasets, and we conduct extensive experiments with mixtures of seen
or unseen sound classes at various signal-to-noise ratios. Our experimental
findings demonstrate the generalization ability of the proposed approach.
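
The mixing step described above (combining an isolated foreground event with a background at a chosen signal-to-noise ratio) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `mix_at_snr` is hypothetical, and the sketch assumes that SNR means the ratio of average foreground power to average background power over the whole clip, with both signals already trimmed to the same length.

```python
import numpy as np


def mix_at_snr(foreground, background, snr_db):
    """Scale the foreground so that its power relative to the background
    equals snr_db (in dB), then add the two signals.

    Assumption: SNR is measured as mean foreground power over mean
    background power across the full clip.
    Returns (mixture, scaled_foreground).
    """
    fg = np.asarray(foreground, dtype=np.float64)
    bg = np.asarray(background, dtype=np.float64)
    if fg.shape != bg.shape:
        raise ValueError("foreground and background must have the same shape")

    fg_power = np.mean(fg ** 2)
    bg_power = np.mean(bg ** 2)
    if fg_power == 0.0 or bg_power == 0.0:
        raise ValueError("both signals must have non-zero power")

    # Gain g such that 10*log10(g^2 * fg_power / bg_power) == snr_db.
    gain = np.sqrt(bg_power / fg_power * 10.0 ** (snr_db / 10.0))
    scaled_fg = gain * fg
    return scaled_fg + bg, scaled_fg


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    event = rng.standard_normal(16000)             # stand-in for a short event
    ambience = 0.1 * rng.standard_normal(16000)    # stand-in for a background
    mixture, target = mix_at_snr(event, ambience, snr_db=-5.0)
    measured = 10.0 * np.log10(np.mean(target ** 2) / np.mean(ambience ** 2))
    print(f"measured SNR: {measured:.2f} dB")
```

Returning the scaled foreground alongside the mixture keeps the separation target available for training or evaluation.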
Similarity-and-Independence-Aware Beamformer: Method for Target Source Extraction using Magnitude Spectrogram as Reference
This study presents a novel method for source extraction, referred to as the
similarity-and-independence-aware beamformer (SIBF). The SIBF extracts the
target signal using a rough magnitude spectrogram as the reference signal. The
advantage of the SIBF is that it can obtain a target signal that is more accurate
than the spectrogram generated by target-enhancing methods such as speech
enhancement based on deep neural networks (DNNs). For the extraction, we extend
the framework of deflationary independent component analysis by
considering the similarity between the reference and extracted target, as well
as the mutual independence of all potential sources. To solve the extraction
problem by maximum-likelihood estimation, we introduce two source model types
that can reflect the similarity. The experimental results from the CHiME3
dataset show that the target signal extracted by the SIBF is more accurate than
the reference signal generated by the DNN.
Index Terms: semiblind source separation, similarity-and-independence-aware
beamformer, deflationary independent component analysis, source model
Comment: Accepted in INTERSPEECH 202
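
As background for the abstract above, a beamformer in the short-time Fourier transform domain extracts a single-channel signal by applying a per-frequency linear filter to the multichannel observation. The sketch below shows only that generic application step, y(f, t) = w(f)^H x(f, t). It does not estimate the filter and is not the SIBF itself; the function name `apply_extraction_filter` and the array layouts are assumptions made for illustration.

```python
import numpy as np


def apply_extraction_filter(X, W):
    """Apply a per-frequency linear extraction filter.

    X: complex array of shape (F, T, M), the multichannel STFT
       (F frequency bins, T frames, M microphones). Layout is an assumption.
    W: complex array of shape (F, M), one filter vector per frequency bin.
    Returns Y of shape (F, T) with Y[f, t] = conj(W[f]) . X[f, t].
    """
    X = np.asarray(X)
    W = np.asarray(W)
    if X.ndim != 3 or W.shape != (X.shape[0], X.shape[2]):
        raise ValueError("expected X with shape (F, T, M) and W with shape (F, M)")
    # Sum over the microphone axis with the conjugated filter (w^H x).
    return np.einsum("fm,ftm->ft", W.conj(), X)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    F, T, M = 4, 5, 3
    X = rng.standard_normal((F, T, M)) + 1j * rng.standard_normal((F, T, M))
    # A one-hot filter simply selects the first microphone channel.
    W = np.zeros((F, M), dtype=complex)
    W[:, 0] = 1.0
    Y = apply_extraction_filter(X, W)
    print(np.allclose(Y, X[:, :, 0]))
```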