273 research outputs found
Blind MultiChannel Identification and Equalization for Dereverberation and Noise Reduction based on Convolutive Transfer Function
This paper addresses the problems of blind channel identification and
multichannel equalization for speech dereverberation and noise reduction. The
time-domain cross-relation method is not suitable for blind room impulse
response identification, due to the near-common zeros of the long impulse
responses. We extend the cross-relation method to the short-time Fourier
transform (STFT) domain, in which the time-domain impulse responses are
approximately represented by the convolutive transfer functions (CTFs) with
far fewer coefficients. The CTFs suffer from the common zeros caused by the
oversampled STFT. We propose to identify CTFs based on the STFT with
oversampled signals and critically sampled CTFs, which is a good compromise
between the frequency aliasing of the signals and the common zeros problem of
CTFs. In addition, a normalization of the CTFs is proposed to remove the gain
ambiguity across sub-bands. In the STFT domain, the identified CTFs are used for
multichannel equalization, in which the sparsity of speech signals is
exploited. We propose to perform inverse filtering by minimizing the
$\ell_1$-norm of the source signal, with the relaxed $\ell_2$-norm fitting error
between the microphone signals and the convolution of the estimated source
signal and the CTFs used as a constraint. This method is advantageous in that
the noise can be reduced by relaxing the $\ell_2$-norm to a tolerance
corresponding to the noise power, and the tolerance can be automatically set.
The experiments confirm the efficiency of the proposed method even under
conditions with high reverberation levels and intense noise. Comment: 13 pages, 5 figures, 5 table
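
The cross-relation idea that this abstract builds on can be illustrated in the time domain with a toy example. The sketch below is not the paper's STFT-domain method: the signal `source` and the two short filters `h1` and `h2` are made-up values, and `convolve` is a helper written only for this illustration. It shows only the underlying identity: for two noise-free channels x1 = h1 * s and x2 = h2 * s, commutativity of convolution gives x1 * h2 = x2 * h1, which is the relation that blind identification exploits.

```python
def convolve(a, b):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


# Hypothetical source signal and two short "room" filters (toy values).
source = [1.0, -0.5, 0.25, 2.0, -1.0]
h1 = [1.0, 0.6, 0.2]
h2 = [0.8, -0.3, 0.1]

# Noise-free microphone signals.
x1 = convolve(h1, source)
x2 = convolve(h2, source)

# Cross-relation: x1 * h2 and x2 * h1 both equal h1 * h2 * s.
lhs = convolve(x1, h2)
rhs = convolve(x2, h1)
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_gap)  # zero up to floating-point rounding
```

In a blind setting the filters are unknown, so one would instead search for the pair of filters that makes this gap as small as possible given only `x1` and `x2`.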
Multichannel Online Dereverberation based on Spectral Magnitude Inverse Filtering
This paper addresses the problem of multichannel online dereverberation. The
proposed method is carried out in the short-time Fourier transform (STFT)
domain, and for each frequency band independently. In the STFT domain, the
time-domain room impulse response is approximately represented by the
convolutive transfer function (CTF). The multichannel CTFs are adaptively
identified based on the cross-relation method, using the recursive
least-squares criterion. Instead of the complex-valued CTF convolution model, we use a
nonnegative convolution model between the STFT magnitude of the source signal
and the CTF magnitude, which is just a coarse approximation of the former
model, but is shown to be more robust against the CTF perturbations. Based on
this nonnegative model, we propose an online STFT magnitude inverse filtering
method. The inverse filters of the CTF magnitude are formulated based on the
multiple-input/output inverse theorem (MINT), and adaptively estimated based on
the gradient descent criterion. Finally, the inverse filtering is applied to
the STFT magnitude of the microphone signals, obtaining an estimate of the STFT
magnitude of the source signal. Experiments regarding both speech enhancement
and automatic speech recognition are conducted, which demonstrate that the
proposed method can effectively suppress reverberation, even for the difficult
case of a moving speaker. Comment: Paper submitted to IEEE/ACM Transactions on Audio, Speech and
Language Processing. IEEE Signal Processing Letters, 201
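
A minimal batch sketch of MINT-style inverse filtering on nonnegative magnitudes, assuming two channels and toy values, may help make the idea concrete. It is not the paper's online algorithm: the CTF magnitudes in `ctf_mags`, the step size, the iteration count, and the function name `mint_inverse_filters` are all assumptions chosen for this illustration. The inverse filters are found by plain gradient descent so that the sum over channels of (CTF magnitude convolved with inverse filter) approaches a unit impulse.

```python
def convolve(a, b):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def mint_inverse_filters(ctf_mags, inv_len, step=0.5, iters=500):
    """Gradient descent on 0.5 * ||sum_i h_i * g_i - impulse||^2."""
    out_len = len(ctf_mags[0]) + inv_len - 1
    target = [1.0] + [0.0] * (out_len - 1)
    filters = [[0.0] * inv_len for _ in ctf_mags]
    for _ in range(iters):
        # Current overall response of all channels combined.
        total = [0.0] * out_len
        for h, g in zip(ctf_mags, filters):
            for k, v in enumerate(convolve(h, g)):
                total[k] += v
        err = [t - d for t, d in zip(total, target)]
        # Gradient w.r.t. g_i[j] is sum_m err[j + m] * h_i[m].
        for h, g in zip(ctf_mags, filters):
            for j in range(inv_len):
                g[j] -= step * sum(err[j + m] * h[m] for m in range(len(h)))
    return filters


# Hypothetical CTF magnitudes for one frequency band, two microphones.
ctf_mags = [[1.0, 0.5], [0.6, 0.9]]
filters = mint_inverse_filters(ctf_mags, inv_len=1)

# Nonnegative convolution model: microphone magnitude = CTF magnitude * source magnitude.
source_mag = [1.0, 0.0, 2.0, 0.5]
mic_mags = [convolve(h, source_mag) for h in ctf_mags]

# Apply the inverse filters to the microphone magnitudes and sum over channels.
estimate = [0.0] * len(mic_mags[0])
for x, g in zip(mic_mags, filters):
    for k, v in enumerate(convolve(x, g)):
        estimate[k] += v
print(filters)
print(estimate)
```

The step size of 0.5 is stable for these particular toy filters only; other filter values would need a step size matched to their scale, and an online variant would update the filters frame by frame rather than iterating over a fixed problem.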
Inverse filtering and principal component analysis techniques for speech dereverberation
In this work, we present a single-channel approach for early and late reverberation suppression. The approach consists of two stages. The first stage employs an inverse filter to increase the signal-to-reverberant energy ratio. The second stage uses the kernel PCA algorithm to enhance the dereverberated signal by extracting the main non-linear features from the speech signal after inverse filtering. Our approach appears to be efficient mainly in far-field conditions and in highly reverberant environments.
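
As a rough illustration of the kernel PCA step mentioned in this abstract, the sketch below extracts the leading non-linear component from a handful of toy scalar samples. It is not the authors' pipeline: the data in `samples`, the RBF kernel with `gamma = 1.0`, and the use of power iteration in place of a full eigendecomposition are all assumptions made to keep the example short and dependency-free.

```python
import math


def rbf_gram(points, gamma):
    """Gram matrix of the RBF kernel exp(-gamma * (a - b)^2)."""
    return [[math.exp(-gamma * (a - b) ** 2) for b in points] for a in points]


def center_gram(K):
    """Center the Gram matrix in feature space."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]  # K is symmetric, so column means match
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def leading_component(Kc, iters=200):
    """Power iteration for the largest eigenpair of the centered Gram matrix."""
    n = len(Kc)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iters):
        w = [sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigval, v


# Hypothetical samples forming two clusters (stand-ins for signal features).
samples = [0.0, 0.1, 0.2, 3.0, 3.1, 3.2]
Kc = center_gram(rbf_gram(samples, gamma=1.0))
eigval, vec = leading_component(Kc)

# Projection of each training sample onto the first kernel principal component.
features = [math.sqrt(eigval) * x for x in vec]
print(features)
```

With these toy values the two clusters land on opposite sides of zero along the first component, which is the kind of non-linear structure kernel PCA is meant to expose.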