24 research outputs found

    An overview of informed audio source separation

    Audio source separation consists in recovering different unknown signals, called sources, by filtering their observed mixtures. In music processing, most mixtures are stereophonic songs and the sources are the individual signals played by the instruments, e.g. bass, vocals, guitar, etc. Source separation is often achieved through classical generalized Wiener filtering, which is controlled by parameters such as the power spectrograms and the spatial locations of the sources. For efficient filtering, those parameters need to be available, and their estimation is the main challenge faced by separation algorithms. In the blind scenario, only the mixtures are available and performance strongly depends on the mixtures considered. In recent years, much research has focused on informed separation, which consists in using additional available information about the sources to improve the separation quality. In this paper, we review some recent trends in this direction.

    On the Use of Masking Filters in Sound Source Separation

    Many sound source separation algorithms, such as NMF and related approaches, disregard phase information and operate only on magnitude or power spectrograms. In this context, generalised Wiener filters have been widely used to generate masks which are applied to the original complex-valued spectrogram before inversion to the time domain, as these masks have been shown to give good results. However, these masks may not be optimal from a perceptual point of view. To this end, we propose new families of masks and compare their performance to generalised Wiener filter masks using three different factorisation-based separation algorithms. Further, to date there has been no analysis of how the performance of masking varies with the number of iterations performed when estimating the separated sources. We perform such an analysis and show that when using these masks, running to convergence may not be required in order to obtain good separation performance.

    User Assisted Separation of Repeating Patterns in Time and Frequency using Magnitude Projections

    In this paper, we propose a simple user-assisted method for the recovery of repeating patterns in time and frequency which can occur in audio mixtures. Here, the user selects a region in a log-frequency spectrogram from which they seek to recover the underlying pattern, such as a repeating chord masked by a cough. Cross-correlation is then performed between the selected region and the spectrogram, revealing similar regions. The most similar region is then selected and a variant of the PROJET algorithm, termed PROJET-MAG, is used to extract the time-frequency components common to the two regions, as well as the components which are not common. The results obtained are compared to another user-assisted method based on REPET, and PROJET-MAG is demonstrated to give improved results over this baseline.
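
    The region-matching step described in this abstract can be illustrated with a minimal sketch. It covers only the cross-correlation search, not PROJET-MAG itself. The function name `most_similar_region`, its arguments, and the rule of ignoring candidates that overlap the user's selection are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np
from scipy.signal import correlate2d


def most_similar_region(spec, f0, t0, height, width):
    """Return the (frequency, time) origin of the region of `spec` that
    correlates most strongly with the user-selected patch.

    spec          : 2-D non-negative (log-frequency) magnitude spectrogram
    f0, t0        : top-left corner of the selected region (hypothetical API)
    height, width : size of the selected region in bins and frames
    """
    spec = np.asarray(spec, dtype=float)
    patch = spec[f0:f0 + height, t0:t0 + width]

    # Plain (unnormalised) 2-D cross-correlation; "valid" keeps only the
    # offsets where the patch fits entirely inside the spectrogram.
    scores = correlate2d(spec, patch, mode="valid")

    # Assumption: discard every candidate that overlaps the selection,
    # otherwise the selection would trivially match itself.
    f_lo, f_hi = max(0, f0 - height + 1), min(scores.shape[0], f0 + height)
    t_lo, t_hi = max(0, t0 - width + 1), min(scores.shape[1], t0 + width)
    scores[f_lo:f_hi, t_lo:t_hi] = -np.inf

    f, t = np.unravel_index(np.argmax(scores), scores.shape)
    return int(f), int(t)
```

    Unnormalised correlation favours high-energy regions, so a practical implementation might normalise the scores; the abstract does not say whether the authors do.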

    An interactive audio source separation framework based on non-negative matrix factorization

    Generalized Wiener filtering with fractional power spectrograms

    In recent years, many studies have focused on the single-sensor separation of independent waveforms using so-called soft-masking strategies, where the short-time Fourier transform of the mixture is multiplied element-wise by a ratio of spectrogram models. When the signals are wide-sense stationary, this strategy is theoretically justified as optimal Wiener filtering: the power spectrograms of the sources are supposed to add up to yield the power spectrogram of the mixture. However, experience shows that using fractional spectrograms instead, such as the amplitude, yields good performance in practice, because they experimentally better fit the additivity assumption. To the best of our knowledge, no probabilistic interpretation of this filtering procedure was available to date. In this paper, we show that assuming the additivity of fractional spectrograms for the purpose of building soft-masks can be understood as separating locally stationary alpha-stable harmonizable processes, alpha-harmonizable in short, thus justifying the procedure theoretically.
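
    The soft-masking rule summarised in this abstract can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the function names, the array layout (sources, frequency, time), and the small `eps` guard against division by zero are assumptions. The source models are taken to be magnitude spectrograms, so `alpha=2` corresponds to the classical power-spectrogram Wiener mask and `alpha=1` to the amplitude-based mask mentioned above.

```python
import numpy as np


def fractional_soft_masks(source_models, alpha=1.0, eps=1e-12):
    """Build one soft mask per source as a ratio of fractional spectrograms.

    source_models : array of shape (n_sources, n_freq, n_frames) holding
                    non-negative magnitude models (assumed layout)
    alpha         : exponent applied to the magnitudes before taking ratios
    eps           : assumed guard against division by zero in silent bins
    """
    powered = np.asarray(source_models, dtype=float) ** alpha
    total = powered.sum(axis=0, keepdims=True)
    return powered / (total + eps)


def separate(mixture_stft, source_models, alpha=1.0):
    """Multiply the complex mixture STFT element-wise by each mask."""
    masks = fractional_soft_masks(source_models, alpha)
    return masks * mixture_stft[np.newaxis, ...]
```

    Because the masks sum to one in every time-frequency bin (up to `eps`), the separated estimates add back up to the mixture, whichever `alpha` is used.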