Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation
This paper describes an efficient unsupervised learning method for a neural
source separation model that utilizes a probabilistic generative model of
observed multichannel mixtures proposed for blind source separation (BSS). For
this purpose, amortized variational inference (AVI) has been used for directly
solving the inverse problem of BSS with full-rank spatial covariance analysis
(FCA). Although this unsupervised technique, called neural FCA, is in principle
free from the domain mismatch problem, it is computationally demanding because
its spatial model is full rank, the price paid for robustness against
relatively short reverberations. To reduce the model complexity without
sacrificing performance, we propose neural FastFCA based on the
jointly-diagonalizable yet full-rank spatial model. The neural separation model
we introduce for AVI alternates between neural network blocks and single steps
of an efficient iterative algorithm called iterative source steering. This
alternating architecture enables the separation model to quickly separate the
mixture spectrogram by leveraging both the deep neural network and the
multichannel optimization algorithm. The training objective with AVI is derived
to maximize the marginalized likelihood of the observed mixtures.
Experiments on mixtures of two to four sound sources show that neural
FastFCA outperforms conventional BSS methods and reduces the computational time
to about 2% of that of neural FCA.
Comment: 5 pages, 2 figures, accepted to EUSIPCO 202
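
To illustrate why a jointly-diagonalizable yet full-rank spatial model lowers the cost, the following is a minimal NumPy sketch for a single time-frequency bin. It is not the paper's implementation. It assumes each source's spatial covariance matrix has the form R_n = Q^{-1} diag(g_n) Q^{-H} with a shared diagonalizer Q, and all function and variable names (`random_jd_model`, `wiener_full`, `wiener_diag`, `lam`, `g`) are hypothetical. Under that assumption, the multichannel Wiener filter needs no per-bin inversion of the mixture covariance and reduces to elementwise gains in the transformed domain.

```python
import numpy as np


def random_jd_model(n_src, n_ch, rng):
    """Draw a toy jointly-diagonalizable spatial model (illustrative only)."""
    # Q: shared diagonalizer (n_ch x n_ch), assumed invertible.
    Q = rng.standard_normal((n_ch, n_ch)) + 1j * rng.standard_normal((n_ch, n_ch))
    # g[n, m] > 0: diagonal entries of source n in the transformed domain.
    g = rng.uniform(0.1, 1.0, size=(n_src, n_ch))
    return Q, g


def full_scms(Q, g):
    """Expand to full-rank matrices R_n = Q^{-1} diag(g_n) Q^{-H}."""
    Q_inv = np.linalg.inv(Q)
    return np.stack([Q_inv @ np.diag(g_n) @ Q_inv.conj().T for g_n in g])


def wiener_full(x, lam, R):
    """Multichannel Wiener filter with an explicit matrix inversion."""
    Y = np.einsum("n,nij->ij", lam, R)  # mixture covariance sum_n lam_n R_n
    Y_inv = np.linalg.inv(Y)
    return np.stack([lam_n * R_n @ Y_inv @ x for lam_n, R_n in zip(lam, R)])


def wiener_diag(x, lam, Q, g):
    """Same filter, computed elementwise in the diagonalized domain."""
    z = Q @ x                        # transformed observation, shape (n_ch,)
    y = lam @ g                      # diagonal of transformed mixture covariance
    gains = (lam[:, None] * g) / y   # per-source gains, shape (n_src, n_ch)
    # Map each gained vector back with Q^{-1}; result has shape (n_src, n_ch).
    return np.linalg.solve(Q, (gains * z).T).T
```

Both functions return the same source image estimates when the covariances share the diagonalizer Q; the diagonalized version replaces the inversion of the mixture covariance with a division by the vector `y`.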