6,916 research outputs found

    Blind MultiChannel Identification and Equalization for Dereverberation and Noise Reduction based on Convolutive Transfer Function

    This paper addresses the problems of blind channel identification and multichannel equalization for speech dereverberation and noise reduction. The time-domain cross-relation method is not suitable for blind room impulse response identification, due to the near-common zeros of the long impulse responses. We extend the cross-relation method to the short-time Fourier transform (STFT) domain, in which the time-domain impulse responses are approximately represented by the convolutive transfer functions (CTFs) with far fewer coefficients. The CTFs suffer from the common zeros caused by the oversampled STFT. We propose to identify CTFs based on the STFT with the oversampled signals and the critically sampled CTFs, which is a good compromise between the frequency aliasing of the signals and the common zeros problem of CTFs. In addition, a normalization of the CTFs is proposed to remove the gain ambiguity across sub-bands. In the STFT domain, the identified CTFs are used for multichannel equalization, in which the sparsity of speech signals is exploited. We propose to perform inverse filtering by minimizing the $\ell_1$-norm of the source signal, using as a constraint the relaxed $\ell_2$-norm fitting error between the microphone signals and the convolution of the estimated source signal with the CTFs. This method is advantageous in that the noise can be reduced by relaxing the $\ell_2$-norm to a tolerance corresponding to the noise power, and the tolerance can be set automatically. The experiments confirm the efficiency of the proposed method even under conditions with high reverberation levels and intense noise. Comment: 13 pages, 5 figures, 5 tables

    A Low-Cost Robust Distributed Linearly Constrained Beamformer for Wireless Acoustic Sensor Networks with Arbitrary Topology

    We propose a new robust distributed linearly constrained beamformer which utilizes a set of linear equality constraints to reduce the cross power spectral density matrix to a block-diagonal form. The proposed beamformer has a convenient objective function for use in arbitrary distributed network topologies while having identical performance to a centralized implementation. Moreover, the new optimization problem is robust to relative acoustic transfer function (RATF) estimation errors and to target activity detection (TAD) errors. Two variants of the proposed beamformer are presented and evaluated in the context of multi-microphone speech enhancement in a wireless acoustic sensor network, and are compared with other state-of-the-art distributed beamformers in terms of communication costs and robustness to RATF estimation errors and TAD errors.

    On the difference-to-sum power ratio of speech and wind noise based on the Corcos model

    The difference-to-sum power ratio was proposed and used to suppress wind noise under specific acoustic conditions. In this contribution, a general formulation of the difference-to-sum power ratio associated with a mixture of speech and wind noise is proposed and analyzed. In particular, it is assumed that the complex coherence of convective turbulence can be modelled by the Corcos model. In contrast to the work in which the power ratio was first presented, the employed Corcos model holds for every possible air stream direction and takes into account the lateral coherence decay rate. The obtained expression is subsequently validated with real data for a dual microphone set-up. Finally, the difference-to-sum power ratio is exploited as a spatial feature to indicate the frame-wise presence of wind noise, obtaining improved detection performance when compared to an existing multi-channel wind noise detection approach. Comment: 5 pages, 3 figures, IEEE-ICSEE Eilat-Israel conference (special session)
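The frame-wise feature described in this abstract can be illustrated with a minimal sketch. It is a simplified time-domain version for two microphone frames, not the paper's Corcos-based formulation; the function name, the `eps` guard, and the toy frames are assumptions made for illustration only.

```python
def difference_to_sum_power_ratio(x1, x2, eps=1e-12):
    """Ratio of the power of (x1 - x2) to the power of (x1 + x2) for one frame.

    x1, x2: equal-length sequences of samples from two closely spaced
    microphones (hypothetical input format). eps avoids division by zero.
    """
    diff_power = sum((a - b) ** 2 for a, b in zip(x1, x2))
    sum_power = sum((a + b) ** 2 for a, b in zip(x1, x2))
    return diff_power / (sum_power + eps)


# Toy frames (assumed data): a fully coherent signal on both microphones,
# and two equal-power signals that are uncorrelated with each other.
coherent = difference_to_sum_power_ratio([1.0, 0.5, -1.0, -0.5],
                                         [1.0, 0.5, -1.0, -0.5])
uncorrelated = difference_to_sum_power_ratio([1.0, 0.0, -1.0, 0.0],
                                             [0.0, 1.0, 0.0, -1.0])
```

In this sketch a signal that is identical at both microphones yields a ratio near 0, while uncorrelated signals of equal power yield a ratio near 1, which is why such a ratio can serve as a per-frame indicator of spatially incoherent noise.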

    FPGA Implementation of Spectral Subtraction for In-Car Speech Enhancement and Recognition

    The use of speech recognition in noisy environments requires speech enhancement algorithms in order to improve recognition performance. Deploying these enhancement techniques requires significant engineering to ensure algorithms are realisable in electronic hardware. This paper describes the design decisions and process to port the popular spectral subtraction algorithm to a Virtex-4 field-programmable gate array (FPGA) device. Resource analysis shows the final design uses only 13% of the total available FPGA resources. Waveforms and spectrograms presented support the validity of the proposed FPGA design.

    Virtual sensors for local, three dimensional, broadband multiple-channel active noise control and the effects on the quiet zones

    In this paper, two state-of-the-art virtual sensor algorithms, i.e. the Remote Microphone Technique (RMT) and the Kalman filter based Virtual Sensing algorithm (KVS), are compared in both state space (SS) and finite impulse response (FIR) implementations. The comparison focuses on the accuracy of the estimated sound pressure signals at the virtual locations and is based on actual measurements in a practical situation. The FIR implementation of the RMT algorithm was found to produce the most reliable results. It is implemented in a local, three-dimensional, real-time, multiple-channel, broadband active noise control system. With this implementation, the benefits and limitations of the RMT-ANC system on the shape and size of the quiet zones are investigated.

    Source localization and denoising: a perspective from the TDOA space

    In this manuscript, we formulate the problem of denoising Time Differences of Arrival (TDOAs) in the TDOA space, i.e. the Euclidean space spanned by TDOA measurements. The method consists of pre-processing the TDOAs with the purpose of reducing the measurement noise. The complete set of TDOAs (i.e., TDOAs computed at all microphone pairs) is known to form a redundant set, which lies on a linear subspace in the TDOA space. Noise, however, prevents TDOAs from lying exactly on this subspace. We therefore show that TDOA denoising can be seen as a projection operation that suppresses the component of the noise that is orthogonal to that linear subspace. We then generalize the projection operator to the cases where the set of TDOAs is incomplete. We analytically show that this operator improves the localization accuracy, and we further confirm that via simulation. Comment: 25 pages, 9 figures
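The projection idea in this abstract can be sketched for the complete-set case. The sketch assumes the sign convention tau_ij = t_i - t_j for i < j and unweighted least squares; the function name and the toy numbers are hypothetical, and the paper's own operator (including its extension to incomplete sets) may differ.

```python
from itertools import combinations


def denoise_tdoas(tdoas, n_mics):
    """Orthogonally project a complete TDOA set onto the consistent subspace.

    tdoas: dict mapping each pair (i, j), i < j, to a measured
    tau_ij = t_i - t_j (assumed convention). All n*(n-1)/2 pairs are required.
    """
    # Least-squares (zero-mean) arrival times: t_i = (1/n) * sum_j tau_ij,
    # using tau_ji = -tau_ij and tau_ii = 0.
    t = [0.0] * n_mics
    for (i, j), tau in tdoas.items():
        t[i] += tau / n_mics
        t[j] -= tau / n_mics
    # Rebuilding every pairwise difference gives the projected TDOA vector.
    return {(i, j): t[i] - t[j] for i, j in combinations(range(n_mics), 2)}


# Toy example (assumed data): four microphones with fixed perturbations.
true_times = [0.0, 1.0, 3.0, 6.0]
pairs = list(combinations(range(4), 2))
clean = {(i, j): true_times[i] - true_times[j] for i, j in pairs}
perturbation = [0.3, -0.2, 0.1, 0.4, -0.5, 0.2]
noisy = {p: clean[p] + e for p, e in zip(pairs, perturbation)}
denoised = denoise_tdoas(noisy, 4)
```

Because consistent TDOAs already lie in the subspace, the projection leaves them unchanged, and for noisy input it can only reduce (never increase) the squared distance to the true TDOA vector.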