285 research outputs found

    Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise

    Multi-channel dereverberation for speech intelligibility improvement in hearing aid applications

    PSD Estimation and Source Separation in a Noisy Reverberant Environment using a Spherical Microphone Array

    In this paper, we propose an efficient technique for estimating individual power spectral density (PSD) components, i.e., the PSD of each desired sound source as well as of noise and reverberation, in a multi-source reverberant sound scene with coherent background noise. We formulate the problem in the spherical harmonics domain to take advantage of the inherent orthogonality of the spherical harmonics basis functions and extract the PSD components from the cross-correlation between the different sound field modes. We also investigate an implementation issue that occurs at the nulls of the Bessel functions and offer an engineering solution. The performance evaluation takes place in a practical environment with a commercial microphone array in order to measure the robustness of the proposed algorithm against the deviations incurred in practice. We also demonstrate an application of the proposed PSD estimator through a source separation algorithm and compare its performance with a contemporary method in terms of different objective measures.

    Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments

    We address the problem of online localization and tracking of multiple moving speakers in reverberant environments. The paper has the following contributions. We use the direct-path relative transfer function (DP-RTF), an inter-channel feature that encodes acoustic information robust against reverberation, and we propose an online algorithm well suited for estimating DP-RTFs associated with moving audio sources. Another crucial ingredient of the proposed method is its ability to properly assign DP-RTFs to audio-source directions. Towards this goal, we adopt a maximum-likelihood formulation and we propose to use an exponentiated gradient (EG) to efficiently update source-direction estimates starting from their currently available values. The problem of multiple speaker tracking is computationally intractable because the number of possible associations between observed source directions and physical speakers grows exponentially with time. We adopt a Bayesian framework and we propose a variational approximation of the posterior filtering distribution associated with multiple speaker tracking, as well as an efficient variational expectation-maximization (VEM) solver. The proposed online localization and tracking method is thoroughly evaluated using two datasets that contain recordings performed in real environments.
    Comment: IEEE Journal of Selected Topics in Signal Processing, 201