Complex ISNMF: a Phase-Aware Model for Monaural Audio Source Separation
This paper introduces a phase-aware probabilistic model for audio source
separation. Classical source models in the short-time Fourier transform domain
use circularly-symmetric Gaussian or Poisson random variables. This is
equivalent to assuming that the phase of each source is uniformly distributed,
which is not suitable for exploiting the underlying structure of the phase.
Drawing on preliminary works, we introduce here a Bayesian anisotropic Gaussian
source model in which the phase is no longer uniform. Such a model permits us
to favor a phase value that originates from a signal model through a Markov
chain prior structure. The variances of the latent variables are structured with
nonnegative matrix factorization (NMF). The resulting model is called complex
Itakura-Saito NMF (ISNMF) since it generalizes the ISNMF model to the case of
non-isotropic variables. It combines the advantages of ISNMF, which uses a
distortion measure adapted to audio and yields a set of estimates which
preserve the overall energy of the mixture, and of complex NMF, which enables
one to account for some phase constraints. We derive a generalized
expectation-maximization algorithm to estimate the model parameters.
Experiments conducted on a musical source separation task in a semi-informed
setting show that the proposed approach outperforms state-of-the-art
phase-aware separation techniques.
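
To make the energy-preservation property mentioned above concrete, the following is a minimal sketch of the classical (isotropic) ISNMF baseline, not of the complex ISNMF model proposed in the paper. It assumes the standard multiplicative updates for the Itakura-Saito divergence followed by Wiener-like filtering. The function names `isnmf` and `wiener_separate`, the rank, and the random toy spectrogram are illustrative assumptions that do not come from the paper.

```python
import numpy as np

def isnmf(V, rank, n_iter=100, eps=1e-12, seed=0):
    """Approximate a nonnegative power spectrogram V (F x T) by W @ H using
    the standard multiplicative updates for the Itakura-Saito divergence."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    for _ in range(n_iter):
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T)
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat))
    return W, H

def wiener_separate(X, W, H, eps=1e-12):
    """Split the complex mixture STFT X (F x T) into one estimate per NMF
    component. Every estimate reuses the mixture phase, which is exactly the
    limitation of isotropic models that phase-aware models try to remove."""
    V_hat = W @ H + eps
    # masks[k] = W[:, k] * H[k, :] / V_hat, stacked into shape (K, F, T)
    masks = W.T[:, :, None] * H[:, None, :] / V_hat
    return masks * X

# Toy example: a random complex "mixture" standing in for a real STFT (assumption).
rng = np.random.default_rng(1)
X = rng.standard_normal((16, 20)) + 1j * rng.standard_normal((16, 20))
V = np.abs(X) ** 2 + 1e-12           # power spectrogram
W, H = isnmf(V, rank=3, n_iter=50)
S = wiener_separate(X, W, H)         # shape (3, 16, 20)
```

Because the masks sum to one in every time-frequency bin, the component estimates add back up to the mixture, which is the sense in which ISNMF-based estimates are conservative. The sketch leaves the phase untouched; incorporating a phase prior, as the abstract describes, would require the anisotropic model and the generalized EM algorithm from the paper itself.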