Comparison of Binaural RTF-Vector-Based Direction of Arrival Estimation Methods Exploiting an External Microphone
In this paper we consider a binaural hearing aid setup, where in addition to
the head-mounted microphones an external microphone is available. For this
setup, we investigate the performance of several relative transfer function
(RTF) vector estimation methods to estimate the direction of arrival (DOA) of
the target speaker in a noisy and reverberant acoustic environment. In
particular, we consider the state-of-the-art covariance whitening (CW) and
covariance subtraction (CS) methods, either incorporating the external
microphone or not, and the recently proposed spatial coherence (SC) method,
requiring the external microphone. To estimate the DOA from the estimated RTF
vector, we propose to minimize the frequency-averaged Hermitian angle between
the estimated head-mounted RTF vector and a database of prototype head-mounted
RTF vectors. Experimental results with stationary and moving speech sources in
a reverberant environment with diffuse-like noise show that the SC method
outperforms the CS method and yields a DOA estimation accuracy similar to that
of the CW method at a lower computational complexity.
Comment: Submitted to EUSIPCO 202
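The DOA selection step described in this abstract (minimizing the frequency-averaged Hermitian angle between an estimated RTF vector and a database of prototype RTF vectors) can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the function names `hermitian_angle` and `estimate_doa`, the array shapes, and the 5-degree grid are assumptions, and the prototype database here is random data standing in for measured head-mounted RTF vectors.

```python
import numpy as np


def hermitian_angle(a, b):
    """Hermitian angle in radians between complex vectors along the last axis."""
    inner = np.abs(np.sum(np.conj(a) * b, axis=-1))
    norms = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    # Clip guards against rounding slightly above 1 before arccos.
    return np.arccos(np.clip(inner / norms, 0.0, 1.0))


def estimate_doa(rtf_est, prototypes, angles):
    """Pick the candidate angle whose prototype is closest to the estimate.

    rtf_est:    (F, M) estimated RTF vectors, F frequency bins, M microphones
    prototypes: (D, F, M) prototype RTF vectors for D candidate directions
    angles:     (D,) candidate directions in degrees
    """
    # Average the per-frequency Hermitian angle for every candidate direction.
    cost = hermitian_angle(rtf_est[None, :, :], prototypes).mean(axis=-1)
    return angles[np.argmin(cost)], cost


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, F, M = 72, 129, 4  # assumed sizes: 5-degree grid, 129 bins, 4 mics
    angles = np.arange(D) * 5.0
    # Placeholder database: random complex vectors, NOT measured RTFs.
    prototypes = rng.standard_normal((D, F, M)) + 1j * rng.standard_normal((D, F, M))
    noise = rng.standard_normal((F, M)) + 1j * rng.standard_normal((F, M))
    rtf_est = prototypes[10] + 0.05 * noise
    doa, _ = estimate_doa(rtf_est, prototypes, angles)
    print("estimated DOA (degrees):", doa)
```

Because the Hermitian angle ignores a common complex scaling of either vector, the cost does not depend on how the RTF vectors are normalised at each frequency.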
Influence of Lossy Speech Codecs on Hearing-aid, Binaural Sound Source Localisation using DNNs
Hearing aids are typically equipped with multiple microphones to exploit
spatial information for source localisation and speech enhancement. Especially
for hearing aids, good source localisation is important: it not only guides
source separation methods but can also be used to enhance spatial cues,
increasing user awareness of important events in their surroundings. We use a
state-of-the-art deep neural network (DNN) to perform binaural
direction-of-arrival (DoA) estimation, where the DNN uses information from all
microphones at both ears. However, hearing aids have limited bandwidth to
exchange this data. Bluetooth low-energy (BLE) is emerging as an attractive
option to facilitate such data exchange, with the LC3plus codec offering
several bitrate and latency trade-off possibilities. In this paper, we
investigate the effect of such lossy codecs on localisation accuracy.
Specifically, we consider two conditions: processing at one ear vs processing
at a central point, which influences the number of channels that need to be
encoded. Performance is benchmarked against a baseline that allows full
audio exchange, yielding valuable insights into the usage of DNNs under lossy
encoding. We also extend the Pyroomacoustics library to include hearing-device
and head-related transfer functions (HD-HRTFs) to suitably train the networks.
This can also benefit other researchers in the field.