Blind MultiChannel Identification and Equalization for Dereverberation and Noise Reduction based on Convolutive Transfer Function
This paper addresses the problems of blind channel identification and
multichannel equalization for speech dereverberation and noise reduction. The
time-domain cross-relation method is not suitable for blind room impulse
response identification, due to the near-common zeros of the long impulse
responses. We extend the cross-relation method to the short-time Fourier
transform (STFT) domain, in which the time-domain impulse responses are
approximately represented by the convolutive transfer functions (CTFs) with
far fewer coefficients. The CTFs suffer from the common zeros caused by the
oversampled STFT. We propose to identify CTFs based on the STFT with the
oversampled signals and the critically sampled CTFs, which is a good compromise
between the frequency aliasing of the signals and the common zeros problem of
CTFs. In addition, a normalization of the CTFs is proposed to remove the gain
ambiguity across sub-bands. In the STFT domain, the identified CTFs are used for
multichannel equalization, in which the sparsity of speech signals is
exploited. We propose to perform inverse filtering by minimizing the
$\ell_1$-norm of the source signal with the relaxed $\ell_2$-norm fitting error
between the microphone signals and the convolution of the estimated source
signal and the CTFs used as a constraint. This method is advantageous in that
the noise can be reduced by relaxing the $\ell_2$-norm to a tolerance
corresponding to the noise power, and the tolerance can be automatically set.
The experiments confirm the efficiency of the proposed method even under
conditions with high reverberation levels and intense noise. Comment: 13 pages, 5 figures, 5 tables
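The inverse-filtering step described in this abstract can be written as a constrained sparse-recovery problem. The notation below is an assumption made for illustration and is not taken from the abstract: $s$ is the vector of source STFT coefficients in one frequency band, $x_i$ and $h_i$ are the signal and identified CTF of microphone $i$ out of $I$, $\star$ is convolution along the frame axis, and $\epsilon$ is the tolerance tied to the noise power.

```latex
\hat{s} \;=\; \arg\min_{s} \; \| s \|_1
\quad \text{subject to} \quad
\sum_{i=1}^{I} \bigl\| x_i - h_i \star s \bigr\|_2^2 \;\le\; \epsilon
```

Setting $\epsilon = 0$ would demand an exact fit to the noisy microphone signals. A positive tolerance leaves room for the noise to be excluded from the estimate, which is the noise-reduction mechanism the abstract refers to.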
A Low-Cost Robust Distributed Linearly Constrained Beamformer for Wireless Acoustic Sensor Networks with Arbitrary Topology
We propose a new robust distributed linearly constrained beamformer which
utilizes a set of linear equality constraints to reduce the cross power
spectral density matrix to a block-diagonal form. The proposed beamformer has a
convenient objective function for use in arbitrary distributed network
topologies while having identical performance to a centralized implementation.
Moreover, the new optimization problem is robust to relative acoustic transfer
function (RATF) estimation errors and to target activity detection (TAD)
errors. Two variants of the proposed beamformer are presented and evaluated in
the context of multi-microphone speech enhancement in a wireless acoustic
sensor network, and are compared with other state-of-the-art distributed
beamformers in terms of communication costs and robustness to RATF estimation
errors and TAD errors.
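For orientation, a minimal sketch of a centralized linearly constrained minimum variance (LCMV) beamformer, the kind of reference a distributed design is compared against. It is not the distributed method of this abstract. The names `lcmv_weights`, `R`, `C`, and `f` are illustrative assumptions, and the data are random placeholders.

```python
import numpy as np

def lcmv_weights(R, C, f):
    """Minimize w^H R w subject to C^H w = f (closed-form LCMV solution)."""
    Rinv_C = np.linalg.solve(R, C)          # R^{-1} C without forming the inverse
    gram = C.conj().T @ Rinv_C              # C^H R^{-1} C
    return Rinv_C @ np.linalg.solve(gram, f)

rng = np.random.default_rng(0)
num_mics = 4

# Placeholder cross power spectral density matrix (Hermitian, positive definite).
snapshots = rng.standard_normal((num_mics, 50)) + 1j * rng.standard_normal((num_mics, 50))
R = snapshots @ snapshots.conj().T / 50 + 1e-3 * np.eye(num_mics)

# Two hypothetical constraint vectors: pass the first source, null the second.
C = rng.standard_normal((num_mics, 2)) + 1j * rng.standard_normal((num_mics, 2))
f = np.array([1.0, 0.0], dtype=complex)

w = lcmv_weights(R, C, f)
```

The resulting weights satisfy the equality constraints `C.conj().T @ w == f` up to numerical precision.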
On the difference-to-sum power ratio of speech and wind noise based on the Corcos model
The difference-to-sum power ratio was proposed and used to suppress wind
noise under specific acoustic conditions. In this contribution, a general
formulation of the difference-to-sum power ratio associated with a mixture of
speech and wind noise is proposed and analyzed. In particular, it is assumed
that the complex coherence of convective turbulence can be modelled by the
Corcos model. In contrast to the work in which the power ratio was first
presented, the employed Corcos model holds for every possible air stream
direction and takes into account the lateral coherence decay rate. The obtained
expression is subsequently validated with real data for a dual microphone
set-up. Finally, the difference-to-sum power ratio is exploited as a spatial
feature to indicate the frame-wise presence of wind noise, obtaining improved
detection performance when compared to an existing multi-channel wind noise
detection approach. Comment: 5 pages, 3 figures, IEEE-ICSEE Eilat-Israel conference (special
session
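A minimal sketch of a difference-to-sum power ratio for a dual-microphone set-up, assuming the ratio is defined as the power of the difference signal divided by the power of the sum signal. The paper may define it per frame and per frequency bin. The function name and the synthetic signals are illustrative assumptions.

```python
import numpy as np

def difference_to_sum_power_ratio(x1, x2, eps=1e-12):
    """Power of (x1 - x2) divided by power of (x1 + x2)."""
    diff_power = np.mean(np.abs(x1 - x2) ** 2)
    sum_power = np.mean(np.abs(x1 + x2) ** 2)
    return diff_power / (sum_power + eps)   # eps guards against division by zero

rng = np.random.default_rng(0)
n = 16000

# Identical signals stand in for a fully coherent, in-phase source.
coherent = rng.standard_normal(n)
ratio_coherent = difference_to_sum_power_ratio(coherent, coherent)

# Independent noise stands in for spatially incoherent turbulence.
ratio_incoherent = difference_to_sum_power_ratio(
    rng.standard_normal(n), rng.standard_normal(n)
)
```

Under these assumptions the coherent case gives a ratio of zero and the incoherent case a ratio near one, which is why the quantity can serve as a frame-wise wind noise indicator.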
FPGA Implementation of Spectral Subtraction for In-Car Speech Enhancement and Recognition
Speech recognition in noisy environments requires speech enhancement algorithms to improve recognition performance. Deploying these enhancement techniques requires significant engineering to ensure the algorithms are realisable in electronic hardware. This paper describes the design decisions and process for porting the popular spectral subtraction algorithm to a Virtex-4 field-programmable gate array (FPGA) device. Resource analysis shows the final design uses only 13% of the total available FPGA resources. The waveforms and spectrograms presented support the validity of the proposed FPGA design.
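A minimal floating-point sketch of magnitude spectral subtraction on a single frame. It is not the FPGA design of this abstract. The function name, the `floor` parameter, and the synthetic tone-plus-noise frame are illustrative assumptions.

```python
import numpy as np

def spectral_subtract(noisy_frame, noise_mag, floor=0.01):
    """Subtract a noise magnitude estimate and resynthesize with the noisy phase."""
    spectrum = np.fft.rfft(noisy_frame)
    mag = np.abs(spectrum)
    phase = np.angle(spectrum)
    # Spectral floor keeps magnitudes non-negative after subtraction.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(noisy_frame))

rng = np.random.default_rng(0)
n = 512
t = np.arange(n)

tone = np.sin(2 * np.pi * 32 * t / n)          # stand-in for speech
noisy = tone + 0.3 * rng.standard_normal(n)

# Noise magnitude estimated from a separate noise-only frame.
noise_mag = np.abs(np.fft.rfft(0.3 * rng.standard_normal(n)))

enhanced = spectral_subtract(noisy, noise_mag)
```

A hardware port would typically replace the floating-point FFT and magnitude computations with fixed-point equivalents, which is where most of the engineering effort mentioned in the abstract would go.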
Virtual sensors for local, three dimensional, broadband multiple-channel active noise control and the effects on the quiet zones
In this paper, two state-of-the-art virtual sensor algorithms, i.e. the Remote Microphone Technique (RMT) and the Kalman filter based Virtual Sensing algorithm (KVS), are compared in both state space (SS) and finite impulse response (FIR) implementations. The comparison focuses on the accuracy of the estimated sound pressure signals at the virtual locations and is based on actual measurements in a practical situation. The FIR implementation of the RMT algorithm was found to produce the most reliable results. It is implemented in a local, three-dimensional, real-time, multiple-channel, broadband active noise control system. With this implementation, the benefits and limitations of the RMT-ANC system with respect to the shape and size of the quiet zones are investigated.
Source localization and denoising: a perspective from the TDOA space
In this manuscript, we formulate the problem of denoising Time Differences of
Arrival (TDOAs) in the TDOA space, i.e. the Euclidean space spanned by TDOA
measurements. The method consists of pre-processing the TDOAs with the purpose
of reducing the measurement noise. The complete set of TDOAs (i.e., TDOAs
computed at all microphone pairs) is known to form a redundant set, which lies
on a linear subspace in the TDOA space. Noise, however, prevents TDOAs from
lying exactly on this subspace. We therefore show that TDOA denoising can be
seen as a projection operation that suppresses the component of the noise that
is orthogonal to that linear subspace. We then extend the projection
operator to cases where the set of TDOAs is incomplete. We
analytically show that this operator improves the localization accuracy, and we
further confirm that via simulation. Comment: 25 pages, 9 figures
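A minimal sketch of the projection idea for a complete TDOA set, assuming each TDOA is the difference of two arrival times, so that the noise-free vector lies in the column space of a pair-difference matrix. The names `pair_matrix` and `denoise_tdoas` and the synthetic arrival times are illustrative assumptions, not the paper's notation.

```python
import numpy as np
from itertools import combinations

def pair_matrix(num_mics):
    """Rows map arrival times t to TDOAs t_i - t_j for every pair i < j."""
    pairs = list(combinations(range(num_mics), 2))
    D = np.zeros((len(pairs), num_mics))
    for row, (i, j) in enumerate(pairs):
        D[row, i], D[row, j] = 1.0, -1.0
    return D

def denoise_tdoas(tdoas, num_mics):
    """Orthogonally project measured TDOAs onto the subspace of consistent TDOAs."""
    D = pair_matrix(num_mics)
    P = D @ np.linalg.pinv(D)               # projector onto the column space of D
    return P @ tdoas

rng = np.random.default_rng(0)
num_mics = 5

arrival_times = rng.uniform(0.0, 0.01, num_mics)        # synthetic, in seconds
clean = pair_matrix(num_mics) @ arrival_times
noisy = clean + 1e-4 * rng.standard_normal(clean.shape)

denoised = denoise_tdoas(noisy, num_mics)
```

Because the projection is orthogonal and the noise-free TDOAs already lie in the subspace, the denoised vector is never farther from the noise-free one than the raw measurement is.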