449 research outputs found
Covariance Blocking and Whitening Method for Successive Relative Transfer Function Vector Estimation in Multi-Speaker Scenarios
This paper addresses the challenge of estimating the relative transfer
function (RTF) vectors of multiple speakers in a noisy and reverberant
environment. More specifically, we consider a scenario where two speakers
become active successively. In this scenario, the RTF vector of the first speaker
can be estimated in a straightforward way and the main challenge lies in
estimating the RTF vector of the second speaker during segments where both
speakers are simultaneously active. To estimate the RTF vector of the second
speaker, the so-called blind oblique projection (BOP) method determines the
oblique projection operator that optimally blocks the second speaker. Instead
of blocking the second speaker, in this paper we propose a covariance blocking
and whitening (CBW) method, which first blocks the first speaker and applies
whitening using the estimated noise covariance matrix and then estimates the
RTF vector of the second speaker based on a singular value decomposition. When
using the estimated RTF vectors of both speakers in a linearly constrained
minimum variance beamformer, simulation results using real-world recordings for
multiple speaker positions demonstrate that the proposed CBW method outperforms
the conventional BOP and covariance whitening methods in terms of
signal-to-interferer-and-noise ratio improvement.
Comment: IEEE Workshop on Applications of Signal Processing to Audio and
Acoustics (WASPAA), New Paltz, NY, USA, Oct 22-25, 202
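For orientation, the conventional covariance whitening (CW) baseline named in this abstract can be sketched as follows. This is a minimal NumPy sketch of the textbook single-source CW estimator, not the proposed CBW method; the function name `cw_rtf`, its argument names, and the choice of microphone 0 as the reference are assumptions made for illustration.

```python
import numpy as np

def cw_rtf(R_y, R_n, ref=0):
    """Estimate an RTF vector by covariance whitening (illustrative sketch).

    R_y : (M, M) covariance of the noisy microphone signals
    R_n : (M, M) noise covariance (Hermitian, positive definite)
    ref : index of the reference microphone (assumed to be 0 here)
    """
    # Square-root factor of the noise covariance: R_n = L @ L^H
    L = np.linalg.cholesky(R_n)
    L_inv = np.linalg.inv(L)

    # Whiten the noisy covariance so that the noise part becomes identity
    R_w = L_inv @ R_y @ L_inv.conj().T
    R_w = 0.5 * (R_w + R_w.conj().T)  # enforce Hermitian symmetry numerically

    # Principal eigenvector of the whitened covariance (eigh sorts ascending)
    _, eigvecs = np.linalg.eigh(R_w)
    v = eigvecs[:, -1]

    # De-whiten and normalise to the reference microphone
    a = L @ v
    return a / a[ref]
```

With an exact rank-one speech covariance plus the given noise covariance, the estimate coincides with the true transfer vector normalised to its reference entry; with covariances estimated from data it is only an approximation.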
Comparison of Binaural RTF-Vector-Based Direction of Arrival Estimation Methods Exploiting an External Microphone
In this paper we consider a binaural hearing aid setup, where in addition to
the head-mounted microphones an external microphone is available. For this
setup, we investigate the performance of several relative transfer function
(RTF) vector estimation methods to estimate the direction of arrival (DOA) of
the target speaker in a noisy and reverberant acoustic environment. More
specifically, we consider the state-of-the-art covariance whitening (CW) and
covariance subtraction (CS) methods, either incorporating the external
microphone or not, and the recently proposed spatial coherence (SC) method,
requiring the external microphone. To estimate the DOA from the estimated RTF
vector, we propose to minimize the frequency-averaged Hermitian angle between
the estimated head-mounted RTF vector and a database of prototype head-mounted
RTF vectors. Experimental results with stationary and moving speech sources in
a reverberant environment with diffuse-like noise show that the SC method
outperforms the CS method and yields a DOA estimation accuracy similar to that
of the CW method at a lower computational complexity.
Comment: Submitted to EUSIPCO 202
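The DOA selection rule described in this abstract (minimising the frequency-averaged Hermitian angle against a database of prototype RTF vectors) can be sketched as below. The array shapes, the function names `hermitian_angle` and `estimate_doa`, and the layout of the prototype database are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

def hermitian_angle(a, b):
    """Hermitian angle between complex vectors along the last axis."""
    num = np.abs(np.sum(a.conj() * b, axis=-1))
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    # Clip guards against rounding slightly above 1 before arccos
    return np.arccos(np.clip(num / den, 0.0, 1.0))

def estimate_doa(rtf_est, prototypes, angles):
    """Pick the candidate DOA whose prototype RTFs best match the estimate.

    rtf_est    : (F, M) estimated RTF vector per frequency bin
    prototypes : (D, F, M) prototype RTF vectors for D candidate directions
    angles     : (D,) candidate directions, e.g. in degrees
    """
    # Average the Hermitian angle over frequency for every candidate direction
    cost = hermitian_angle(prototypes, rtf_est[None, :, :]).mean(axis=1)
    return angles[int(np.argmin(cost))]
```

Because the Hermitian angle ignores a common complex scaling of either vector, this cost does not depend on how the RTF vectors happen to be normalised.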
RTF-Based Binaural MVDR Beamformer Exploiting an External Microphone in a Diffuse Noise Field
Besides suppressing all undesired sound sources, an important objective of a
binaural noise reduction algorithm for hearing devices is the preservation of
the binaural cues, aiming at preserving the spatial perception of the acoustic
scene. A well-known binaural noise reduction algorithm is the binaural minimum
variance distortionless response beamformer, which can be steered using the
relative transfer function (RTF) vector of the desired source, relating the
acoustic transfer functions between the desired source and all microphones to a
reference microphone. In this paper, we propose a computationally efficient
method to estimate the RTF vector in a diffuse noise field, requiring an
additional microphone that is spatially separated from the head-mounted
microphones. Assuming that the spatial coherence between the noise components
in the head-mounted microphone signals and the additional microphone signal is
zero, we show that an unbiased estimate of the RTF vector can be obtained.
Based on real-world recordings, experimental results for several reverberation
times show that the proposed RTF estimator outperforms the widely used RTF
estimator based on covariance whitening and a simple biased RTF estimator in
terms of noise reduction and binaural cue preservation performance.
Comment: Accepted at ITG Conference on Speech Communication 201
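One way to read the zero-coherence assumption in this abstract: if the noise in the external microphone is uncorrelated with the noise in the head-mounted microphones, the cross-correlation between the two contains only the desired source, so normalising it to the reference microphone yields the RTF vector. The sketch below illustrates that idea together with standard MVDR weights steered by an RTF vector. It is a hedged illustration, not the paper's implementation; `sc_rtf`, `mvdr_weights`, and the convention that the external microphone is the last channel are assumptions.

```python
import numpy as np

def sc_rtf(R_ext, ref=0):
    """RTF estimate from the cross-correlation with an external microphone.

    R_ext : (M+1, M+1) covariance of the stacked signals, with the M
            head-mounted microphones first and the external microphone last
            (this ordering is an assumption of the sketch).
    """
    # E[y_m * conj(y_E)] for every head-mounted microphone m
    cross = R_ext[:-1, -1]
    return cross / cross[ref]

def mvdr_weights(R_n, h):
    """MVDR weights w = R_n^{-1} h / (h^H R_n^{-1} h) for steering vector h."""
    u = np.linalg.solve(R_n, h)
    return u / np.vdot(h, u)  # np.vdot conjugates its first argument
```

Since the RTF vector equals 1 at the reference microphone, the distortionless constraint `w^H h = 1` preserves the desired source as received at that microphone. A binaural beamformer would use one such filter per ear, each with its own reference.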
First Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Angular Power Spectrum
We present the angular power spectrum derived from the first-year Wilkinson
Microwave Anisotropy Probe (WMAP) sky maps. We study a variety of power
spectrum estimation methods and data combinations and demonstrate that the
results are robust. The data are modestly contaminated by diffuse Galactic
foreground emission, but we show that a simple Galactic template model is
sufficient to remove the signal. Point sources produce a modest contamination
in the low frequency data. After masking ~700 known bright sources from the
maps, we estimate residual sources contribute ~3500 uK^2 at 41 GHz, and ~130
uK^2 at 94 GHz, to the power spectrum l*(l+1)*C_l/(2*pi) at l=1000. Systematic
errors are negligible compared to the (modest) level of foreground emission.
Our best estimate of the power spectrum is derived from 28 cross-power spectra
of statistically independent channels. The final spectrum is essentially
independent of the noise properties of an individual radiometer. The resulting
spectrum provides a definitive measurement of the CMB power spectrum, with
uncertainties limited by cosmic variance, up to l~350. The spectrum clearly
exhibits a first acoustic peak at l=220 and a second acoustic peak at l~540 and
it provides strong support for adiabatic initial conditions. Kogut et al.
(2003) analyze the C_l^TE power spectrum, and present evidence for a relatively
high optical depth, and an early period of cosmic reionization. Among other
things, this implies that the temperature power spectrum has been suppressed by
~30% on degree angular scales, due to secondary scattering.
Comment: One of thirteen companion papers on first-year WMAP results submitted
to ApJ; 44 pages, 14 figures; a version with higher quality figures is also
available at http://lambda.gsfc.nasa.gov/product/map/map_bibliography.htm
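The statement that a spectrum built from cross-power spectra of independent channels is essentially independent of individual radiometer noise can be illustrated with a toy calculation. The sketch below is not the WMAP pipeline: it uses real-space samples rather than spherical harmonics, and the function names, sample count, and noise level are arbitrary assumptions. It also includes the conversion to the quantity l(l+1)C_l/(2π) quoted in the abstract.

```python
import numpy as np

def auto_and_cross_power(n_samples=200_000, noise_std=2.0, seed=0):
    """Toy comparison of auto-power and cross-power estimates.

    Two channels observe the same unit-variance signal with independent noise.
    """
    rng = np.random.default_rng(seed)
    signal = rng.standard_normal(n_samples)
    ch1 = signal + noise_std * rng.standard_normal(n_samples)
    ch2 = signal + noise_std * rng.standard_normal(n_samples)

    auto = np.mean(ch1 * ch1)   # biased upward by the channel's noise variance
    cross = np.mean(ch1 * ch2)  # independent noise averages out of the product
    return auto, cross

def band_power(ell, c_ell):
    """Convert C_l to l(l+1)C_l/(2*pi)."""
    return ell * (ell + 1) * c_ell / (2.0 * np.pi)
```

In this toy setup the auto-power estimate approaches the signal variance plus the noise variance, while the cross-power estimate approaches the signal variance alone.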
First Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Data Processing Methods and Systematic Errors Limits
We describe the calibration and data processing methods used to generate
full-sky maps of the cosmic microwave background (CMB) from the first year of
Wilkinson Microwave Anisotropy Probe (WMAP) observations. Detailed limits on
residual systematic errors are assigned based largely on analyses of the flight
data supplemented, where necessary, with results from ground tests. The data
are calibrated in flight using the dipole modulation of the CMB due to the
observatory's motion around the Sun. This constitutes a full-beam calibration
source. An iterative algorithm simultaneously fits the time-ordered data to
obtain calibration parameters and pixelized sky map temperatures. The noise
properties are determined by analyzing the time-ordered data with this sky
signal estimate subtracted. Based on this, we apply a pre-whitening filter to
the time-ordered data to remove a low level of 1/f noise. We infer and correct
for a small ~1% transmission imbalance between the two sky inputs to each
differential radiometer, and we subtract a small sidelobe correction from the
23 GHz (K band) map prior to further analysis. No other systematic error
corrections are applied to the data. Calibration and baseline artifacts,
including the response to environmental perturbations, are negligible.
Systematic uncertainties are comparable to statistical uncertainties in the
characterization of the beam response. Both are accounted for in the covariance
matrix of the window function and are propagated to uncertainties in the final
power spectrum. We characterize the combined upper limits to residual
systematic uncertainties through the pixel covariance matrix.
Comment: One of 13 companion papers on first-year WMAP results submitted to
ApJ; 58 pages with 14 figures; a version with higher quality figures is at
http://lambda.gsfc.nasa.gov/product/map/map_bibliography.htm
Block-Online Multi-Channel Speech Enhancement Using DNN-Supported Relative Transfer Function Estimates
This work addresses the problem of block-online processing for multi-channel
speech enhancement. Such processing is vital in scenarios with moving speakers
and/or when very short utterances are processed, e.g., in voice assistant
scenarios. We consider several variants of a system that performs beamforming
supported by DNN-based voice activity detection (VAD) followed by
post-filtering. The speaker is targeted through estimating relative transfer
functions between microphones. Each block of the input signals is processed
independently in order to make the method applicable in highly dynamic
environments. Owing to the short length of the processed block, the statistics
required by the beamformer are estimated less precisely. The influence of this
inaccuracy is studied and compared to the processing regime when recordings are
treated as one block (batch processing). The experimental evaluation of the
proposed method is performed on the large CHiME-4 datasets and on another
dataset featuring a moving target speaker. The experiments are evaluated in terms
of objective and perceptual criteria (such as signal-to-interference ratio
(SIR) or perceptual evaluation of speech quality (PESQ), respectively).
Moreover, word error rate (WER) achieved by a baseline automatic speech
recognition system is evaluated, for which the enhancement method serves as a
front-end solution. The results indicate that the proposed method is robust
to short lengths of the processed block. Significant improvements
in terms of the criteria and WER are observed even for a block length of 250
ms.
Comment: 10 pages, 8 figures, 4 tables. Modified version of the article
accepted for publication in IET Signal Processing journal. Original results
unchanged, additional experiments presented, refined discussion and
conclusion
Dual-Channel Speech Enhancement Based on Extended Kalman Filter Relative Transfer Function Estimation
This paper deals with speech enhancement in dual-microphone smartphones using
beamforming along with postfiltering techniques. The performance of these algorithms relies on
a good estimation of the acoustic channel and speech and noise statistics. In this work we present
a speech enhancement system that combines the estimation of the relative transfer function (RTF)
between microphones using an extended Kalman filter framework with a novel speech presence
probability estimator intended to track the noise statistics' variability. The available dual-channel
information is exploited to obtain more reliable estimates of clean speech statistics. Noise reduction
is further improved by means of postfiltering techniques that take advantage of the speech presence
estimation. Our proposal is evaluated in different reverberant and noisy environments when the
smartphone is used in both close-talk and far-talk positions. The experimental results show that our
system achieves improvements in terms of noise reduction, low speech distortion and better speech
intelligibility compared to other state-of-the-art approaches.
Spanish MINECO/FEDER Project TEC2016-80141-P; Spanish
Ministry of Education through the National Program FPU under Grant FPU15/0416