Comparison of Binaural RTF-Vector-Based Direction of Arrival Estimation Methods Exploiting an External Microphone
In this paper we consider a binaural hearing aid setup, where in addition to
the head-mounted microphones an external microphone is available. For this
setup, we investigate the performance of several relative transfer function
(RTF) vector estimation methods to estimate the direction of arrival (DOA) of
the target speaker in a noisy and reverberant acoustic environment. In
particular, we consider the state-of-the-art covariance whitening (CW) and
covariance subtraction (CS) methods, either incorporating the external
microphone or not, and the recently proposed spatial coherence (SC) method,
requiring the external microphone. To estimate the DOA from the estimated RTF
vector, we propose to minimize the frequency-averaged Hermitian angle between
the estimated head-mounted RTF vector and a database of prototype head-mounted
RTF vectors. Experimental results with stationary and moving speech sources in
a reverberant environment with diffuse-like noise show that the SC method
outperforms the CS method and yields a DOA estimation accuracy similar to that
of the CW method at a lower computational complexity.
Comment: Submitted to EUSIPCO 202
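The DOA selection step described in the abstract above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names `hermitian_angle` and `estimate_doa`, the two-microphone prototype vectors, and the three candidate angles are hypothetical values chosen only for the example.

```python
import math

def hermitian_angle(a, b):
    """Hermitian angle (in radians) between two complex vectors a and b."""
    inner = sum(x.conjugate() * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(abs(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(abs(y) ** 2 for y in b))
    # Clamp to 1.0 to guard against rounding slightly above the valid range.
    cos_angle = min(1.0, abs(inner) / (norm_a * norm_b))
    return math.acos(cos_angle)

def estimate_doa(rtf_est, prototypes):
    """Pick the candidate DOA with the smallest frequency-averaged angle.

    rtf_est:    list over frequency bins of estimated RTF vectors.
    prototypes: dict mapping a candidate DOA (degrees) to a list over
                frequency bins of prototype RTF vectors.
    """
    def mean_angle(proto):
        angles = [hermitian_angle(e, p) for e, p in zip(rtf_est, proto)]
        return sum(angles) / len(angles)

    return min(prototypes, key=lambda doa: mean_angle(prototypes[doa]))

# Hypothetical database: 3 candidate angles, 2 frequency bins, 2 microphones.
prototypes = {
    -90: [[1 + 0j, 0.5 - 0.5j], [1 + 0j, -0.3j]],
    0: [[1 + 0j, 1 + 0j], [1 + 0j, 1 + 0j]],
    90: [[1 + 0j, 0.5 + 0.5j], [1 + 0j, 0.3j]],
}
# A perturbed version of the 90-degree prototype stands in for a noisy estimate.
noisy_rtf = [[1 + 0j, 0.55 + 0.45j], [1 + 0j, 0.1 + 0.35j]]
doa = estimate_doa(noisy_rtf, prototypes)
print(doa)
```

Because the Hermitian angle uses the magnitude of the inner product, it does not change when either vector is multiplied by a complex scalar, so the comparison does not depend on how the RTF vectors are normalized.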
Relative Transfer Function Vector Estimation for Acoustic Sensor Networks Exploiting Covariance Matrix Structure
In many multi-microphone algorithms for noise reduction, an estimate of the
relative transfer function (RTF) vector of the target speaker is required. The
state-of-the-art covariance whitening (CW) method estimates the RTF vector as
the principal eigenvector of the whitened noisy covariance matrix, where
whitening is performed using an estimate of the noise covariance matrix. In
this paper, we consider an acoustic sensor network consisting of multiple
microphone nodes. Assuming uncorrelated noise between the nodes but not within
the nodes, we propose two RTF vector estimation methods that leverage the
block-diagonal structure of the noise covariance matrix. The first method
modifies the CW method by considering only the diagonal blocks of the estimated
noise covariance matrix. In contrast, the second method only considers the
off-diagonal blocks of the noisy covariance matrix, but cannot be solved using
a simple eigenvalue decomposition. When applying the estimated RTF vector in a
minimum variance distortionless response beamformer, simulation results for
real-world recordings in a reverberant environment with multiple noise sources
show that the modified CW method performs slightly better than the CW method in
terms of SNR improvement, while the off-diagonal selection method outperforms a
biased RTF vector estimate obtained as the principal eigenvector of the noisy
covariance matrix.
Comment: Proc. IEEE Workshop on Applications of Signal Processing to Audio and
Acoustics (WASPAA), New Paltz NY, USA, Oct. 202
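The covariance whitening step summarized in the abstract above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the paper's code: the function names `rtf_covariance_whitening` and `keep_diagonal_blocks`, the two-node layout with two microphones each, and all numeric values are hypothetical, and the covariance matrices are constructed exactly rather than estimated from recordings.

```python
import numpy as np

def rtf_covariance_whitening(R_y, R_n, ref=0):
    """Estimate the RTF vector via covariance whitening (CW).

    R_y: noisy covariance matrix, R_n: noise covariance matrix (Hermitian,
    positive definite), ref: index of the reference microphone.
    """
    L = np.linalg.cholesky(R_n)              # R_n = L @ L^H
    L_inv = np.linalg.inv(L)
    R_w = L_inv @ R_y @ L_inv.conj().T       # whitened noisy covariance
    _, eigvecs = np.linalg.eigh(R_w)         # eigenvalues in ascending order
    v = eigvecs[:, -1]                       # principal eigenvector
    a = L @ v                                # undo the whitening
    return a / a[ref]                        # normalize to the reference mic

def keep_diagonal_blocks(R, node_sizes):
    """Zero everything outside the per-node diagonal blocks of R."""
    out = np.zeros_like(R)
    start = 0
    for size in node_sizes:
        stop = start + size
        out[start:stop, start:stop] = R[start:stop, start:stop]
        start = stop
    return out

# Hypothetical scene: 2 nodes x 2 microphones, noise uncorrelated across nodes.
a_true = np.array([1.0, 0.5 - 0.2j, -0.3 + 0.7j, 0.8 + 0.1j])
block = np.array([[1.0, 0.4], [0.4, 1.0]], dtype=complex)
R_n = np.kron(np.eye(2), block)
R_y = 2.0 * np.outer(a_true, a_true.conj()) + R_n

# Variant using only the diagonal blocks of the noise covariance matrix.
a_est = rtf_covariance_whitening(R_y, keep_diagonal_blocks(R_n, [2, 2]))
print(np.round(a_est, 3))
```

In this noiseless-statistics toy case the estimate matches `a_true`; with covariance matrices estimated from finite data, the result would only approximate it.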
RTF-Based Binaural MVDR Beamformer Exploiting an External Microphone in a Diffuse Noise Field
Besides suppressing all undesired sound sources, an important objective of a
binaural noise reduction algorithm for hearing devices is the preservation of
the binaural cues, aiming at preserving the spatial perception of the acoustic
scene. A well-known binaural noise reduction algorithm is the binaural minimum
variance distortionless response beamformer, which can be steered using the
relative transfer function (RTF) vector of the desired source, relating the
acoustic transfer functions between the desired source and all microphones to a
reference microphone. In this paper, we propose a computationally efficient
method to estimate the RTF vector in a diffuse noise field, requiring an
additional microphone that is spatially separated from the head-mounted
microphones. Assuming that the spatial coherence between the noise components
in the head-mounted microphone signals and the additional microphone signal is
zero, we show that an unbiased estimate of the RTF vector can be obtained.
Based on real-world recordings, experimental results for several reverberation
times show that the proposed RTF estimator outperforms the widely used RTF
estimator based on covariance whitening and a simple biased RTF estimator in
terms of noise reduction and binaural cue preservation performance.
Comment: Accepted at ITG Conference on Speech Communication 201
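The idea behind the external-microphone estimator in the abstract above can be illustrated with a small NumPy sketch. It assumes the estimator takes the cross-covariance between the head-mounted microphone signals and the external microphone signal and normalizes it by the reference entry; the function name `rtf_spatial_coherence`, the three head-mounted microphones, and all numeric values are hypothetical, and exact (not estimated) covariance matrices are used.

```python
import numpy as np

def rtf_spatial_coherence(R_y, head_idx, ext_idx, ref=0):
    """RTF estimate from the head-to-external cross-covariance column.

    If the noise in the external microphone is uncorrelated with the noise in
    the head-mounted microphones, this column contains only the speech part.
    """
    r = R_y[head_idx, ext_idx]
    return r / r[ref]

# Hypothetical scene: microphones 0-2 are head-mounted, microphone 3 is external.
a_true = np.array([1.0, 0.8 - 0.3j, 0.5 + 0.4j, 0.6 - 0.6j])
phi_x = 2.0                                   # speech power at the reference mic

R_n = np.zeros((4, 4), dtype=complex)
R_n[:3, :3] = [[1.0, 0.3, 0.1],
               [0.3, 1.0, 0.3],
               [0.1, 0.3, 1.0]]               # correlated head-mounted noise
R_n[3, 3] = 0.5                               # external noise, zero cross terms
R_y = phi_x * np.outer(a_true, a_true.conj()) + R_n

head = [0, 1, 2]
a_sc = rtf_spatial_coherence(R_y, head, ext_idx=3)

# Simple biased estimate for comparison: reference column of R_y itself.
a_biased = R_y[head, 0] / R_y[0, 0]

print(np.round(a_sc, 3))
print(np.round(a_biased, 3))
```

Under the zero-coherence assumption the first estimate recovers the head-mounted part of `a_true`, while the reference-column estimate is pulled away from it by the noise covariance terms.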
Block-Online Multi-Channel Speech Enhancement Using DNN-Supported Relative Transfer Function Estimates
This work addresses the problem of block-online processing for multi-channel
speech enhancement. Such processing is vital in scenarios with moving speakers
and/or when very short utterances are processed, e.g., in voice assistant
scenarios. We consider several variants of a system that performs beamforming
supported by DNN-based voice activity detection (VAD) followed by
post-filtering. The speaker is targeted through estimating relative transfer
functions between microphones. Each block of the input signals is processed
independently in order to make the method applicable in highly dynamic
environments. Owing to the short length of the processed block, the statistics
required by the beamformer are estimated less precisely. The influence of this
inaccuracy is studied and compared to the processing regime when recordings are
treated as one block (batch processing). The experimental evaluation of the
proposed method is performed on the large CHiME-4 datasets and on another
dataset featuring a moving target speaker. The experiments are evaluated in terms
of objective and perceptual criteria (such as signal-to-interference ratio
(SIR) or perceptual evaluation of speech quality (PESQ), respectively).
Moreover, the word error rate (WER) achieved by a baseline automatic speech
recognition system, for which the enhancement method serves as a front end, is
evaluated. The results indicate that the proposed method is robust to short
block lengths. Significant improvements in terms of these criteria and WER are
observed even for a block length of 250 ms.
Comment: 10 pages, 8 figures, 4 tables. Modified version of the article
accepted for publication in IET Signal Processing journal. Original results
unchanged, additional experiments presented, refined discussion and
conclusion
First Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Angular Power Spectrum
We present the angular power spectrum derived from the first-year Wilkinson
Microwave Anisotropy Probe (WMAP) sky maps. We study a variety of power
spectrum estimation methods and data combinations and demonstrate that the
results are robust. The data are modestly contaminated by diffuse Galactic
foreground emission, but we show that a simple Galactic template model is
sufficient to remove the signal. Point sources produce a modest contamination
in the low frequency data. After masking ~700 known bright sources from the
maps, we estimate residual sources contribute ~3500 uK^2 at 41 GHz, and ~130
uK^2 at 94 GHz, to the power spectrum l*(l+1)*C_l/(2*pi) at l=1000. Systematic
errors are negligible compared to the (modest) level of foreground emission.
Our best estimate of the power spectrum is derived from 28 cross-power spectra
of statistically independent channels. The final spectrum is essentially
independent of the noise properties of an individual radiometer. The resulting
spectrum provides a definitive measurement of the CMB power spectrum, with
uncertainties limited by cosmic variance, up to l~350. The spectrum clearly
exhibits a first acoustic peak at l=220 and a second acoustic peak at l~540 and
it provides strong support for adiabatic initial conditions. Kogut et al.
(2003) analyze the C_l^TE power spectrum, and present evidence for a relatively
high optical depth, and an early period of cosmic reionization. Among other
things, this implies that the temperature power spectrum has been suppressed by
~30% on degree angular scales, due to secondary scattering.
Comment: One of thirteen companion papers on first-year WMAP results submitted
to ApJ; 44 pages, 14 figures; a version with higher quality figures is also
available at http://lambda.gsfc.nasa.gov/product/map/map_bibliography.htm
Dual-Channel Speech Enhancement Based on Extended Kalman Filter Relative Transfer Function Estimation
This paper deals with speech enhancement in dual-microphone smartphones using
beamforming along with postfiltering techniques. The performance of these algorithms relies on
a good estimation of the acoustic channel and speech and noise statistics. In this work we present
a speech enhancement system that combines the estimation of the relative transfer function (RTF)
between microphones using an extended Kalman filter framework with a novel speech presence
probability estimator intended to track the noise statistics’ variability. The available dual-channel
information is exploited to obtain more reliable estimates of clean speech statistics. Noise reduction
is further improved by means of postfiltering techniques that take advantage of the speech presence
estimation. Our proposal is evaluated in different reverberant and noisy environments when the
smartphone is used in both close-talk and far-talk positions. The experimental results show that,
compared to other state-of-the-art approaches, our system improves noise reduction and speech
intelligibility while keeping speech distortion low.
Spanish MINECO/FEDER Project TEC2016-80141-P
Spanish Ministry of Education through the National Program FPU under Grant FPU15/0416