
    Sparse Gaussian Process Audio Source Separation Using Spectrum Priors in the Time-Domain

    Gaussian process (GP) audio source separation is a time-domain approach that circumvents the inherent phase approximation issue of spectrogram-based methods. Furthermore, through their kernels, GPs elegantly incorporate prior knowledge about the sources into the separation model. Despite these compelling advantages, the computational complexity of GP inference scales cubically with the number of audio samples. As a result, source separation GP models have been restricted to the analysis of short audio frames. We introduce an efficient application of GPs to time-domain audio source separation, without compromising performance. For this purpose, we use GP regression together with spectral mixture kernels and variational sparse GPs. We compare our method with LD-PSDTF (positive semi-definite tensor factorization), KL-NMF (Kullback-Leibler non-negative matrix factorization), and IS-NMF (Itakura-Saito NMF). Results show that the proposed method outperforms these techniques. Comment: Paper submitted to the 44th International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019. To be held in Brighton, United Kingdom, between May 12 and May 17, 201
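
    The idea of GP regression with spectral mixture kernels can be illustrated on a single short frame. The sketch below is not the paper's method: it uses exact GP inference (the cubic-cost baseline the abstract mentions) rather than variational sparse GPs, and it assumes the commonly used spectral mixture form k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi mu_q tau). The function names, the two-source setup, and all parameter values are hypothetical choices made for illustration.

```python
import math


def sm_kernel(tau, weights, means, variances):
    """Spectral mixture kernel for a time lag tau (seconds).

    Assumed form: sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi mu_q tau),
    with mu_q a centre frequency in Hz and v_q a spectral variance in Hz^2.
    """
    return sum(
        w * math.exp(-2.0 * math.pi ** 2 * tau ** 2 * v)
        * math.cos(2.0 * math.pi * mu * tau)
        for w, mu, v in zip(weights, means, variances)
    )


def gram(times, params):
    """Covariance matrix of one source; params = (weights, means, variances)."""
    return [[sm_kernel(s - t, *params) for t in times] for s in times]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def separate(y, times, params_a, params_b, noise_var):
    """Posterior means of sources a and b given the mixture y = a + b + noise."""
    n = len(y)
    Ka, Kb = gram(times, params_a), gram(times, params_b)
    Ky = [[Ka[i][j] + Kb[i][j] + (noise_var if i == j else 0.0)
           for j in range(n)] for i in range(n)]
    alpha = solve(Ky, y)  # alpha = (Ka + Kb + noise_var * I)^{-1} y
    est_a = [sum(k * a for k, a in zip(row, alpha)) for row in Ka]
    est_b = [sum(k * a for k, a in zip(row, alpha)) for row in Kb]
    return est_a, est_b
```

    Calling separate(y, times, ([1.0], [5.0], [0.01]), ([1.0], [20.0], [0.01]), 0.01) on a 60-sample frame treats the mixture as a 5 Hz source plus a 20 Hz source. The linear solve is the step whose cost grows cubically with the frame length, which is the bottleneck that sparse approximations are meant to relieve.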

    Adaptation of speaker-specific bases in non-negative matrix factorization for single channel speech-music separation

    This paper introduces a speaker adaptation algorithm for nonnegative matrix factorization (NMF) models. The proposed adaptation algorithm is a combination of Bayesian and subspace model adaptation. The adapted model is used to separate a speech signal from a background music signal in a single recording. Training speech data for multiple speakers is used with NMF to train a set of basis vectors as a general model for speech signals. The probabilistic interpretation of NMF is used to achieve Bayesian adaptation, which adjusts the general model to the actual properties of the speech signal observed in the mixed signal. The Bayesian adapted model is then adapted again by a linear transform, which changes the subspace that the model spans to better match the speech signal in the mixed signal. The experimental results show that combining Bayesian with linear transform adaptation improves the separation results.

    Bayesian separation of spectral sources under non-negativity and full additivity constraints

    This paper addresses the problem of separating spectral sources which are linearly mixed with unknown proportions. The main difficulty of the problem is to ensure the full additivity (sum-to-one) of the mixing coefficients and the non-negativity of the sources and mixing coefficients. A Bayesian estimation approach based on Gamma priors was recently proposed to handle the non-negativity constraints in a linear mixture model. However, incorporating the full additivity constraint requires further developments. This paper studies a new hierarchical Bayesian model appropriate to the non-negativity and sum-to-one constraints associated with the regressors and regression coefficients of linear mixtures. The unknown parameters of this model are estimated using samples generated by an appropriate Gibbs sampler. The performance of the proposed algorithm is evaluated through simulations conducted on synthetic mixture models. The proposed approach is also applied to the processing of multicomponent chemical mixtures resulting from Raman spectroscopy. Comment: v4: minor grammatical changes; Signal Processing, 200

    Single channel speech music separation using nonnegative matrix factorization and spectral masks

    A single channel speech-music separation algorithm based on nonnegative matrix factorization (NMF) with spectral masks is proposed in this work. The proposed algorithm uses training data of speech and music signals with nonnegative matrix factorization followed by masking to separate the mixed signal. In the training stage, NMF uses the training data to train a set of basis vectors for each source. These bases are trained using NMF in the magnitude spectrum domain. After observing the mixed signal, NMF is used to decompose its magnitude spectra into a linear combination of the trained bases for both sources. The decomposition results are used to build a mask, which describes the contribution of each source to the mixed signal. Experimental results show that using masks after NMF improves the separation process even when calculating NMF with fewer iterations, which yields a faster separation process.
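
    The train, decompose, and mask pipeline described in this abstract can be sketched on toy magnitude spectra. The abstract does not state the NMF cost function or the exact mask, so this sketch assumes Euclidean multiplicative updates and a ratio mask S / (S + M). All function names are hypothetical, and matrices are plain lists of rows (frequency bins by frames).

```python
import random

EPS = 1e-9  # guards against division by zero in the updates and the mask


def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def _random_matrix(rows, cols, rng):
    return [[rng.random() + EPS for _ in range(cols)] for _ in range(rows)]


def _scale(X, num, den):
    """Multiplicative update: X * num / den, element-wise."""
    return [[x * n / (d + EPS) for x, n, d in zip(xr, nr, dr)]
            for xr, nr, dr in zip(X, num, den)]


def nmf(V, rank=None, W=None, n_iter=200, seed=0):
    """Factor V ~ W H with non-negative entries; a supplied W is kept fixed."""
    rng = random.Random(seed)
    fixed_W = W is not None
    if not fixed_W:
        W = _random_matrix(len(V), rank, rng)
    H = _random_matrix(len(W[0]), len(V[0]), rng)
    for _ in range(n_iter):
        Wt = transpose(W)
        H = _scale(H, matmul(Wt, V), matmul(matmul(Wt, W), H))
        if not fixed_W:
            Ht = transpose(H)
            W = _scale(W, matmul(V, Ht), matmul(W, matmul(H, Ht)))
    return W, H


def train_bases(V_train, rank):
    """Training stage: learn basis vectors for one source."""
    W, _ = nmf(V_train, rank=rank)
    return W


def separate_with_masks(V_mix, W_speech, W_music):
    """Decompose the mixture over both trained bases, then apply a ratio mask."""
    k = len(W_speech[0])
    W = [ws + wm for ws, wm in zip(W_speech, W_music)]  # concatenate columns
    _, H = nmf(V_mix, W=W)  # bases fixed, only activations are estimated
    S = matmul(W_speech, H[:k])  # initial speech estimate
    M = matmul(W_music, H[k:])   # initial music estimate
    mask = [[s / (s + m + EPS) for s, m in zip(sr, mr)] for sr, mr in zip(S, M)]
    speech = [[g * v for g, v in zip(gr, vr)] for gr, vr in zip(mask, V_mix)]
    music = [[(1.0 - g) * v for g, v in zip(gr, vr)] for gr, vr in zip(mask, V_mix)]
    return speech, music
```

    Because the mask only redistributes the observed mixture between the two sources, the speech and music estimates always sum back to the mixture, even if the activations have not fully converged. That property is one plausible reading of why masking could tolerate fewer NMF iterations.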

    Knowledge-aided covariance matrix estimation and adaptive detection in compound-Gaussian noise

    We address the problem of adaptive detection of a signal of interest embedded in colored noise modeled in terms of a compound-Gaussian process. The covariance matrices of the primary and the secondary data share a common structure while having different power levels. A Bayesian approach is proposed here, where both the power levels and the structure are assumed to be random, with some appropriate distributions. Within this framework we propose MMSE and MAP estimators of the covariance structure and their application to adaptive detection using the NMF test statistic and an optimized GLRT herein derived. Results, including comparisons with existing algorithms, are presented to illustrate the performance of the proposed algorithms. The main result is that the solutions presented herein improve performance over conventional ones, especially when only a small amount of training data is available.