685 research outputs found

Sparse Gaussian Process Audio Source Separation Using Spectrum Priors in the Time-Domain

Author: Alvarado Pablo A.
Stowell Dan
Álvarez Mauricio A.
Publication date: 21/11/2018

Gaussian process (GP) audio source separation is a time-domain approach that circumvents the inherent phase approximation issue of spectrogram-based methods. Furthermore, through their kernels, GPs elegantly incorporate prior knowledge about the sources into the separation model. Despite these compelling advantages, the computational complexity of GP inference scales cubically with the number of audio samples. As a result, source separation GP models have been restricted to the analysis of short audio frames. We introduce an efficient application of GPs to time-domain audio source separation, without compromising performance. For this purpose, we used GP regression, together with spectral mixture kernels, and variational sparse GPs. We compared our method with LD-PSDTF (positive semi-definite tensor factorization), KL-NMF (Kullback-Leibler non-negative matrix factorization), and IS-NMF (Itakura-Saito NMF). Results show that the proposed method outperforms these techniques.

Comment: Paper submitted to the 44th International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019. To be held in Brighton, United Kingdom, between May 12 and May 17, 201
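The cubic cost mentioned in this abstract can be illustrated with a minimal sketch of exact GP source separation. This is not the authors' method: it omits the variational sparse approximation entirely, and it assumes the standard one-dimensional spectral mixture kernel form, which may differ from the kernel used in the paper. The names `sm_kernel` and `gp_separate`, and all parameter values, are hypothetical.

```python
import numpy as np

def sm_kernel(t1, t2, weights, freqs, variances):
    """Spectral mixture kernel (assumed standard 1-D form):
    k(tau) = sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi mu_q tau)."""
    tau = t1[:, None] - t2[None, :]
    K = np.zeros_like(tau)
    for w, mu, v in zip(weights, freqs, variances):
        K += w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * mu * tau)
    return K

def gp_separate(y, kernels, noise_var):
    """Posterior mean of each source for y = sum_i f_i + noise,
    with independent zero-mean GP priors f_i ~ GP(0, K_i)."""
    Ky = sum(kernels) + noise_var * np.eye(len(y))
    # Solving this N x N system is the O(N^3) step that limits
    # exact GP separation to short audio frames.
    alpha = np.linalg.solve(Ky, y)
    return [K @ alpha for K in kernels]

# Toy frame: two sinusoids standing in for two sources (illustrative values).
t = np.linspace(0.0, 0.02, 200)
y = np.sin(2 * np.pi * 440.0 * t) + np.sin(2 * np.pi * 660.0 * t)
K1 = sm_kernel(t, t, weights=[1.0], freqs=[440.0], variances=[50.0])
K2 = sm_kernel(t, t, weights=[1.0], freqs=[660.0], variances=[50.0])
sources = gp_separate(y, [K1, K2], noise_var=1e-2)
```

Each kernel encodes a spectrum prior for one source, so the separation happens directly on the waveform with no phase reconstruction step.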

arXiv.org e-Print Archive

Crossref

The University of Manchester - Institutional Repository

Adaptation of speaker-specific bases in non-negative matrix factorization for single channel speech-music separation

Author: Erdoğan Hakan
Grais Emad Mounir
Publication venue: ISCA (International Speech Communication Association)
Publication date: 01/08/2011

This paper introduces a speaker adaptation algorithm for nonnegative matrix factorization (NMF) models. The proposed adaptation algorithm is a combination of Bayesian and subspace model adaptation. The adapted model is used to separate a speech signal from a background music signal in a single recording. Training speech data from multiple speakers is used with NMF to train a set of basis vectors as a general model for speech signals. The probabilistic interpretation of NMF is used to achieve Bayesian adaptation, adjusting the general model to the actual properties of the speech signal observed in the mixed signal. The Bayesian adapted model is then adapted again by a linear transform, which changes the subspace that the model spans to better match the speech signal in the mixed signal. The experimental results show that combining Bayesian with linear transform adaptation improves the separation results.

Sabanci University Research Database

Bayesian separation of spectral sources under non-negativity and full additivity constraints

Author: Cédric Carteret
Nicolas Dobigeon
Saïd Moussaoui
Jean-Yves Tourneret
Publication venue: Elsevier BV
Publication date: 23/09/2009

This paper addresses the problem of separating spectral sources which are linearly mixed with unknown proportions. The main difficulty of the problem is to ensure the full additivity (sum-to-one) of the mixing coefficients and the non-negativity of sources and mixing coefficients. A Bayesian estimation approach based on Gamma priors was recently proposed to handle the non-negativity constraints in a linear mixture model. However, incorporating the full additivity constraint requires further developments. This paper studies a new hierarchical Bayesian model appropriate to the non-negativity and sum-to-one constraints associated with the regressors and regression coefficients of linear mixtures. The unknown parameters of this model are estimated from samples generated by an appropriate Gibbs sampler. The performance of the proposed algorithm is evaluated through simulations conducted on synthetic mixture models. The proposed approach is also applied to the processing of multicomponent chemical mixtures resulting from Raman spectroscopy.

Comment: v4: minor grammatical changes; Signal Processing, 200

arXiv.org e-Print Archive

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL Descartes

Single channel speech music separation using nonnegative matrix factorization and spectral masks

Author: Erdoğan Hakan
Grais Emad Mounir
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2011

A single-channel speech-music separation algorithm based on nonnegative matrix factorization (NMF) with spectral masks is proposed in this work. The proposed algorithm uses training data of speech and music signals with NMF, followed by masking, to separate the mixed signal. In the training stage, NMF uses the training data to train a set of basis vectors for each source. These bases are trained in the magnitude spectrum domain. After observing the mixed signal, NMF is used to decompose its magnitude spectra into a linear combination of the trained bases for both sources. The decomposition results are used to build a mask, which describes the contribution of each source to the mixed signal. Experimental results show that using masks after NMF improves the separation even when NMF is run with fewer iterations, which yields a faster separation process.
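The train, decompose, and mask pipeline described in this abstract can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' implementation: it assumes KL-divergence multiplicative updates and a Wiener-like soft mask with exponent `p`, neither of which the abstract specifies. The function names `nmf` and `separate` are hypothetical, and the inputs are expected to be non-negative magnitude spectrograms (frequency bins by frames).

```python
import numpy as np

def nmf(V, rank, n_iter=200, W=None, seed=0, eps=1e-9):
    """KL-divergence NMF via multiplicative updates, V ~= W @ H.
    If W is supplied it is held fixed and only H is estimated."""
    rng = np.random.default_rng(seed)
    fixed = W is not None
    if not fixed:
        W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T.sum(axis=1, keepdims=True) + eps)
        if not fixed:
            WH = W @ H + eps
            W *= ((V / WH) @ H.T) / (H.sum(axis=1) + eps)
    return W, H

def separate(V_mix, W_speech, W_music, n_iter=100, p=2, eps=1e-9):
    """Decompose the mixture on the concatenated trained bases,
    then apply a soft mask (assumed form) to the mixture magnitudes."""
    W = np.hstack([W_speech, W_music])
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]   # initial speech estimate
    M = W_music @ H[k:]    # initial music estimate
    mask = S**p / (S**p + M**p + eps)
    return mask * V_mix, (1.0 - mask) * V_mix
```

In use, `nmf` would be run once per source on training spectrograms to obtain `W_speech` and `W_music`, and `separate` would then be applied to the mixture spectrogram. Because the two masked outputs always sum to the mixture, the mask redistributes the observed energy rather than relying on the NMF reconstruction being exact, which is one plausible reason fewer NMF iterations can suffice.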

Sabanci University Research Database

Surrey Research Insight

Knowledge-aided covariance matrix estimation and adaptive detection in compound-Gaussian noise

Author: Bandiera Francesco
Besson Olivier
Ricci Giuseppe
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

We address the problem of adaptive detection of a signal of interest embedded in colored noise modeled in terms of a compound-Gaussian process. The covariance matrices of the primary and the secondary data share a common structure while having different power levels. A Bayesian approach is proposed here, where both the power levels and the structure are assumed to be random, with some appropriate distributions. Within this framework we propose MMSE and MAP estimators of the covariance structure and their application to adaptive detection using the NMF test statistic and an optimized GLRT derived herein. Some results, including comparisons with existing algorithms, are presented to illustrate the performance of the proposed algorithms. The main result is that the solutions presented herein improve performance over conventional ones, especially when only a small amount of training data is available.

Open Archive Toulouse Archive Ouverte

Archivio Istituzionale della Ricerca- Università del Salento