685 research outputs found
Sparse Gaussian Process Audio Source Separation Using Spectrum Priors in the Time-Domain
Gaussian process (GP) audio source separation is a time-domain approach that
circumvents the inherent phase approximation issue of spectrogram-based
methods. Furthermore, through their kernels, GPs elegantly incorporate prior
knowledge about the sources into the separation model. Despite these compelling
advantages, the computational complexity of GP inference scales cubically with
the number of audio samples. As a result, source separation GP models have been
restricted to the analysis of short audio frames. We introduce an efficient
application of GPs to time-domain audio source separation, without compromising
performance. For this purpose, we used GP regression, together with spectral
mixture kernels, and variational sparse GPs. We compared our method with
LD-PSDTF (positive semi-definite tensor factorization), KL-NMF
(Kullback-Leibler non-negative matrix factorization), and IS-NMF (Itakura-Saito
NMF). Results show that the proposed method outperforms these techniques.
Comment: Paper submitted to the 44th International Conference on Acoustics,
Speech, and Signal Processing, ICASSP 2019. To be held in Brighton, United
Kingdom, between May 12 and May 17, 201
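The idea behind GP source separation can be illustrated with a small sketch. It assumes the mixture is a sum of independent zero-mean GP sources plus white noise, so the posterior mean of each source is its own kernel matrix applied to a shared weight vector. The function names, toy signals, and hyperparameter values are hypothetical, and the sketch uses exact GP regression (cubic cost) rather than the variational sparse approximation the abstract describes.

```python
import numpy as np

def spectral_mixture_kernel(t1, t2, weights, freqs, variances):
    """Spectral mixture kernel: sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi tau mu_q)."""
    tau = t1[:, None] - t2[None, :]
    K = np.zeros_like(tau)
    for w, mu, v in zip(weights, freqs, variances):
        K += w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * tau * mu)
    return K

def separate_sources(y, kernels, noise_var):
    """Posterior mean of each source given the mixture y = sum_i f_i + noise."""
    K_mix = sum(kernels) + noise_var * np.eye(len(y))
    alpha = np.linalg.solve(K_mix, y)      # shared weights (K_1 + ... + noise)^-1 y
    return [K @ alpha for K in kernels]    # E[f_i | y] = K_i alpha

# Toy mixture of two sinusoids (assumed data, not from the paper).
fs = 8000.0
t = np.arange(400) / fs
rng = np.random.default_rng(0)
s1 = np.sin(2 * np.pi * 440.0 * t)
s2 = 0.5 * np.sin(2 * np.pi * 660.0 * t)
noise_var = 1e-2
y = s1 + s2 + np.sqrt(noise_var) * rng.standard_normal(t.size)

# One spectral component per source, centred on its assumed frequency (Hz).
K1 = spectral_mixture_kernel(t, t, [1.0], [440.0], [25.0])
K2 = spectral_mixture_kernel(t, t, [0.25], [660.0], [25.0])
est1, est2 = separate_sources(y, [K1, K2], noise_var)
```

The linear solve is the step whose cost grows cubically with the number of samples, which is why exact inference of this kind is limited to short frames.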
Adaptation of speaker-specific bases in non-negative matrix factorization for single channel speech-music separation
This paper introduces a speaker adaptation algorithm for nonnegative matrix factorization (NMF) models. The proposed algorithm combines Bayesian and subspace model adaptation. The adapted model is used to separate a speech signal from a background music signal in a single-channel recording. Training speech data from multiple speakers is used with NMF to train a set of basis vectors as a general model for speech signals. The probabilistic interpretation of NMF is used to perform Bayesian adaptation, adjusting the general model to the actual properties of the speech signal observed in the mixed signal. The Bayesian-adapted model is then adapted again by a linear transform, which changes the subspace it spans to better match the speech signal in the mixed signal. Experimental results show that combining Bayesian adaptation with linear transform adaptation improves the separation results.
Bayesian separation of spectral sources under non-negativity and full additivity constraints
This paper addresses the problem of separating spectral sources which are
linearly mixed with unknown proportions. The main difficulty of the problem is
to ensure the full additivity (sum-to-one) of the mixing coefficients and
non-negativity of sources and mixing coefficients. A Bayesian estimation
approach based on Gamma priors was recently proposed to handle the
non-negativity constraints in a linear mixture model. However, incorporating
the full additivity constraint requires further developments. This paper
studies a new hierarchical Bayesian model appropriate to the non-negativity and
sum-to-one constraints associated with the regressors and regression coefficients
of linear mixtures. The unknown parameters of this model are estimated
from samples generated by an appropriate Gibbs sampler. The
performance of the proposed algorithm is evaluated through simulations
conducted on synthetic mixture models. The proposed approach is also applied to
the processing of multicomponent chemical mixtures resulting from Raman
spectroscopy.
Comment: v4: minor grammatical changes; Signal Processing, 200
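The constrained linear mixing model described in this abstract can be written out roughly as follows. The symbols are illustrative assumptions rather than the paper's notation: y_i is the i-th observed spectrum, s_j the j-th source spectrum, c_ij the mixing coefficients, M the number of sources, and e_i additive noise.

```latex
\mathbf{y}_i = \sum_{j=1}^{M} c_{ij}\,\mathbf{s}_j + \mathbf{e}_i,
\qquad c_{ij} \ge 0,
\qquad \sum_{j=1}^{M} c_{ij} = 1,
\qquad \mathbf{s}_j \ge \mathbf{0}.
```

The second and third conditions are the non-negativity and full additivity constraints on the mixing coefficients. The last is the non-negativity constraint on the sources.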
Single channel speech music separation using nonnegative matrix factorization and spectral masks
A single-channel speech-music separation algorithm based on nonnegative matrix factorization (NMF) with spectral masks is proposed in this work. The proposed algorithm uses training data of speech and music signals with NMF, followed by masking, to separate the mixed signal. In the training stage, NMF uses the training data to train a set of basis vectors for each source. These bases are trained using NMF in the magnitude spectrum domain. After observing the mixed signal, NMF is used to decompose its magnitude spectra into a linear combination of the trained bases for both sources. The decomposition results are used to build a mask, which explains the contribution of each source in the mixed signal. Experimental results show that using masks after NMF improves the separation process even when calculating NMF with fewer iterations, which yields a faster separation process.
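The train, decompose, and mask pipeline described in this abstract might be sketched as below. This is a minimal illustration under stated assumptions: it uses Euclidean multiplicative updates, random non-negative matrices standing in for magnitude spectrograms, and a soft mask with a hypothetical exponent `p`. None of these choices are confirmed by the abstract, and the function names are invented for the example.

```python
import numpy as np

EPS = 1e-12

def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Factor V ~ W @ H with multiplicative updates; a supplied W is kept fixed."""
    rng = np.random.default_rng(seed)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H

def separate(V_mix, W_speech, W_music, n_iter=200, p=2):
    """Decompose the mixture on the trained bases, then apply a soft mask."""
    W = np.hstack([W_speech, W_music])
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]                 # initial speech estimate
    M = W_music @ H[k:]                  # initial music estimate
    mask = S**p / (S**p + M**p + EPS)    # share of each bin attributed to speech
    return mask * V_mix, (1.0 - mask) * V_mix

# Stand-in "magnitude spectrograms" (frequency bins x frames); assumed data.
rng = np.random.default_rng(0)
V_speech = rng.random((64, 100))
V_music = rng.random((64, 100))
W_speech, _ = nmf(V_speech, rank=8)      # training stage, speech bases
W_music, _ = nmf(V_music, rank=8)        # training stage, music bases
V_mix = rng.random((64, 40)) + rng.random((64, 40))
speech_est, music_est = separate(V_mix, W_speech, W_music)
```

Because the mask only redistributes the observed mixture between the two sources, the two estimates always sum back to the mixture, even when the NMF decomposition itself has not fully converged.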
Knowledge-aided covariance matrix estimation and adaptive detection in compound-Gaussian noise
We address the problem of adaptive detection of a signal of interest embedded in colored noise modeled as a compound-Gaussian process. The covariance matrices of the primary and the secondary data share a common structure while having different power levels. A Bayesian approach is proposed here, where both the power levels and the structure are assumed to be random, with appropriate distributions. Within this framework we propose MMSE and MAP estimators of the covariance structure and apply them to adaptive detection using the NMF test statistic and an optimized GLRT derived herein. Results, including comparisons with existing algorithms, are presented to illustrate the performance of the proposed algorithms. The main result is that the proposed solutions improve on the performance of conventional ones, especially when only a small amount of training data is available.
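In this entry "NMF" most likely denotes the normalized matched filter, the usual meaning in the adaptive detection literature, rather than nonnegative matrix factorization as in the entries above. That reading is an assumption, since the abstract does not expand the acronym. Under it, the statistic can be sketched as follows, with a hypothetical function name, an assumed steering vector `p`, and a covariance structure `sigma` that in practice would be replaced by an estimate.

```python
import numpy as np

def nmf_statistic(z, p, sigma):
    """Normalized matched filter: |p^H S^-1 z|^2 / ((p^H S^-1 p) (z^H S^-1 z))."""
    sigma_inv_z = np.linalg.solve(sigma, z)
    sigma_inv_p = np.linalg.solve(sigma, p)
    num = np.abs(np.vdot(p, sigma_inv_z)) ** 2   # vdot conjugates its first argument
    den = np.real(np.vdot(p, sigma_inv_p)) * np.real(np.vdot(z, sigma_inv_z))
    return num / den

# Toy data (assumed): random Hermitian positive-definite covariance structure.
rng = np.random.default_rng(1)
n = 8
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
sigma = A @ A.conj().T + np.eye(n)
p = np.exp(2j * np.pi * 0.1 * np.arange(n))      # assumed steering vector
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)

t_noise = nmf_statistic(z, p, sigma)             # observation unrelated to p
t_signal = nmf_statistic(3j * p, p, sigma)       # observation aligned with p
```

The statistic lies between 0 and 1 and is unchanged when the observation is rescaled, which is what makes it suitable when the noise power level is unknown and varies between the primary and secondary data.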