2,278 research outputs found

    An adaptive stereo basis method for convolutive blind audio source separation

    NOTICE: this is the author’s version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in PUBLICATION, [71, 10-12, June 2008] DOI:neucom.2007.08.02

    Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function

    This paper addresses the problem of speech separation and enhancement from multichannel convolutive and noisy mixtures, assuming known mixing filters. We propose to perform the speech separation and enhancement task in the short-time Fourier transform domain, using the convolutive transfer function (CTF) approximation. Compared to time-domain filters, the CTF has far fewer taps; consequently, it has fewer near-common zeros among channels and lower computational complexity. The work proposes three speech-source recovery methods, namely: i) the multichannel inverse filtering method, i.e. the multiple input/output inverse theorem (MINT), exploited in the CTF domain, and for the multi-source case, ii) a beamforming-like multichannel inverse filtering method applying single-source MINT and using power minimization, which is suitable whenever the source CTFs are not all known, and iii) a constrained Lasso method, where the sources are recovered by minimizing the $\ell_1$-norm to impose their spectral sparsity, with the constraint that the $\ell_2$-norm fitting cost, between the microphone signals and the mixing model involving the unknown source signals, is less than a tolerance. The noise can be reduced by setting a tolerance onto the noise power. Experiments under various acoustic conditions are carried out to evaluate the three proposed methods. The comparison between them, as well as with the baseline methods, is presented. Comment: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processin
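The constrained Lasso step described in this abstract can be written compactly as below. This is only a sketch of the general form: the symbols are assumptions introduced here for illustration ($\mathbf{x}$ for the stacked microphone STFT coefficients, $\mathbf{H}$ for the known CTF mixing operator, $\mathbf{s}$ for the stacked source STFT coefficients, $\epsilon$ for the tolerance), and the paper's own notation and exact form of the fitting cost may differ.

```latex
\[
  \hat{\mathbf{s}} \;=\; \arg\min_{\mathbf{s}} \; \|\mathbf{s}\|_1
  \quad \text{subject to} \quad
  \|\mathbf{x} - \mathbf{H}\,\mathbf{s}\|_2 \;\le\; \epsilon
\]
```

Under this reading, a larger $\epsilon$ lets more of the microphone signal go unexplained by the sources, which is how a tolerance tied to the noise power would reduce noise in the estimate.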

    Doubly sparse models for multiple filter estimation in sparse echoic environments

    We consider the estimation of multiple time-domain sparse filters from echoic mixtures of several unknown sources, when the sources are sparse in the time-frequency domain. We propose a sparse filter estimation framework consisting of two steps: a) a clustering step that groups, for each source, the time-frequency points of the mixtures where only that source is active; b) a convex optimisation step that estimates the filters based on a time-frequency domain cross-relation. We propose a new wideband formulation of a frequency-domain cross-relation, in addition to the one based on the classical narrowband approximation. The solutions of the convex optimisation problem formed using the cross-relation are characterised. Numerical evaluation shows the benefit of using the wideband cross-relation for sparse echoic filter estimation. Further, the potential of the proposed framework for blind estimation of sparse echoic filters is demonstrated in a controlled experimental setting in which the proposed approach outperforms state-of-the-art blind filter estimation techniques when the filters are sufficiently sparse.
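The cross-relation that this abstract builds on can be illustrated in its basic time-domain form: when a single source s reaches two microphones through filters h1 and h2, the recordings satisfy x1 * h2 = x2 * h1. The sketch below only checks that identity numerically; the filter taps, lengths, and variable names are made up for illustration, and it does not implement the paper's time-frequency formulation or its convex estimation step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single source and two sparse echoic filters (illustrative values).
s = rng.standard_normal(256)
h1 = np.zeros(32)
h1[[0, 7, 19]] = [1.0, 0.5, -0.3]
h2 = np.zeros(32)
h2[[2, 11, 25]] = [0.8, -0.4, 0.2]

# Microphone signals when only this one source is active.
x1 = np.convolve(s, h1)
x2 = np.convolve(s, h2)

# Cross-relation: x1 * h2 equals x2 * h1, since both equal s * h1 * h2.
residual = np.max(np.abs(np.convolve(x1, h2) - np.convolve(x2, h1)))

# A wrong candidate filter pair breaks the relation, which is what makes
# the residual usable as a fitting cost for filter estimation.
h2_wrong = np.roll(h2, 1)
residual_wrong = np.max(np.abs(np.convolve(x1, h2_wrong) - np.convolve(x2, h1)))

print(f"residual with true filters:  {residual:.2e}")
print(f"residual with wrong filters: {residual_wrong:.2e}")
```

The relation holds only where one source dominates, which is why the clustering step in a) comes before the filter estimation in b).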

    The influence of random element displacement on DOA estimates obtained with (Khatri-Rao-)root-MUSIC

    Although a wide range of direction of arrival (DOA) estimation algorithms has been described for a diverse range of array configurations, no specific stochastic analysis framework has been established to assess the probability density function of the error on DOA estimates due to random errors in the array geometry. Therefore, we propose a stochastic collocation method that relies on a generalized polynomial chaos expansion to connect the statistical distribution of random position errors to the resulting distribution of the DOA estimates. We apply this technique to the conventional root-MUSIC and the Khatri-Rao-root-MUSIC methods. Compared with Monte Carlo simulations, this novel approach yields a speedup by a factor of more than 100 in terms of CPU time for a one-dimensional case and by a factor of 56 for a two-dimensional case.
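For orientation, the Monte Carlo baseline mentioned in this abstract can be sketched as follows: perturb the element positions of a uniform linear array, run conventional root-MUSIC, and look at the spread of the DOA estimates. This is a minimal textbook-style sketch, not the paper's stochastic collocation method or its Khatri-Rao variant. The function names, array size, SNR, and error level are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_ula(angles_deg, num_elements, num_snapshots, rng,
                 spacing=0.5, pos_error_std=0.0, snr_db=20.0):
    """Snapshots from a linear array; positions are in wavelengths (assumed model)."""
    positions = spacing * np.arange(num_elements)
    positions = positions + pos_error_std * rng.standard_normal(num_elements)
    sin_t = np.sin(np.radians(np.asarray(angles_deg, dtype=float)))
    A = np.exp(2j * np.pi * np.outer(positions, sin_t))  # steering matrix
    shape_s = (A.shape[1], num_snapshots)
    S = (rng.standard_normal(shape_s) + 1j * rng.standard_normal(shape_s)) / np.sqrt(2)
    shape_w = (num_elements, num_snapshots)
    W = (rng.standard_normal(shape_w) + 1j * rng.standard_normal(shape_w)) / np.sqrt(2)
    return A @ S + 10 ** (-snr_db / 20) * W

def root_music(X, num_sources, spacing=0.5):
    """Conventional root-MUSIC assuming the nominal uniform geometry."""
    M, N = X.shape
    R = X @ X.conj().T / N                      # sample covariance
    _, vecs = np.linalg.eigh(R)                 # eigenvalues in ascending order
    En = vecs[:, : M - num_sources]             # noise subspace
    C = En @ En.conj().T
    # Coefficient of z^l is the sum of the l-th diagonal of C (highest power first).
    coeffs = np.array([np.trace(C, offset=l) for l in range(M - 1, -M, -1)])
    roots = np.roots(coeffs)
    roots = roots[np.abs(roots) < 1.0]          # keep one root of each reciprocal pair
    roots = roots[np.argsort(1.0 - np.abs(roots))][:num_sources]
    sin_t = np.angle(roots) / (2 * np.pi * spacing)
    return np.sort(np.degrees(np.arcsin(np.clip(sin_t, -1.0, 1.0))))

def monte_carlo_std(angles_deg, pos_error_std, trials=100, seed=0):
    """Standard deviation of the DOA estimates over random position errors."""
    rng = np.random.default_rng(seed)
    est = [root_music(simulate_ula(angles_deg, 8, 200, rng,
                                   pos_error_std=pos_error_std), len(angles_deg))
           for _ in range(trials)]
    return np.std(np.array(est), axis=0)

if __name__ == "__main__":
    print("DOA std (deg):", monte_carlo_std([-20.0, 30.0], pos_error_std=0.02))
```

Each trial here requires a full eigendecomposition and polynomial rooting, which is the cost that a collocation scheme with fewer, carefully chosen evaluation points aims to reduce.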