59 research outputs found
A non-intrusive method for estimating binaural speech intelligibility from noise-corrupted signals captured by a pair of microphones
A non-intrusive method is introduced to predict binaural speech intelligibility in noise directly from signals captured using a pair of microphones. The approach combines signal processing techniques for blind source separation and localisation with an intrusive objective intelligibility measure (OIM). Unlike classic intrusive OIMs, the method therefore requires neither a clean reference speech signal nor knowledge of the source locations. The proposed approach can estimate intelligibility in stationary and fluctuating noise when the noise masker is presented as a point or diffuse source and is spatially separated from the target speech source on a horizontal plane. The performance of the proposed method was evaluated in two rooms. When predicting subjective intelligibility measured as word recognition rate, the method showed reasonable predictive accuracy, with correlation coefficients above 0.82, comparable to that of a reference intrusive OIM in most conditions. The proposed approach offers a solution for fast binaural intelligibility prediction and therefore has practical potential for deployment in situations where on-site speech intelligibility is a concern.
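The evaluation above reports correlation coefficients between predicted and subjective intelligibility. The abstract does not say which coefficient was used, so the following is only a minimal sketch assuming the Pearson coefficient; the function name `pearson_r` and the sample scores are hypothetical and are not data from the paper.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation between two equal-length sequences of scores.

    Assumption: the abstract only says "correlation coefficients";
    Pearson's r is used here purely for illustration.
    """
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / len(predicted)
    mean_m = sum(measured) / len(measured)
    # Sum of cross-products and sums of squares of the centred values.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_m)


if __name__ == "__main__":
    # Hypothetical values: predictor outputs vs. word recognition rates.
    predictor_scores = [0.21, 0.35, 0.52, 0.70, 0.88]
    word_recognition = [0.15, 0.40, 0.55, 0.65, 0.90]
    print(f"r = {pearson_r(predictor_scores, word_recognition):.3f}")
```

A rank-based coefficient would be an equally plausible reading of the abstract; the sketch only shows how one such figure is computed from paired scores.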
Unsupervised Uncertainty Measures of Automatic Speech Recognition for Non-intrusive Speech Intelligibility Prediction
Non-intrusive intelligibility prediction is important for its application in
realistic scenarios, where a clean reference signal is difficult to access. The
construction of many non-intrusive predictors requires either ground truth
intelligibility labels or clean reference signals for supervised learning. In
this work, we leverage an unsupervised uncertainty estimation method for
predicting speech intelligibility, which does not require intelligibility
labels or reference signals to train the predictor. Our experiments demonstrate
that the uncertainty from state-of-the-art end-to-end automatic speech
recognition (ASR) models is highly correlated with speech intelligibility. The
proposed method is evaluated on two databases and the results show that the
unsupervised uncertainty measures of ASR models are more correlated with speech
intelligibility from listening results than the predictions made by widely used
intrusive methods.
Comment: Submitted to INTERSPEECH202
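The abstract above does not specify which uncertainty measure is taken from the ASR model. As a hedged illustration only, the sketch below assumes one common choice: the mean entropy of the per-token posterior distributions produced by a decoder. The function names and the toy posteriors are hypothetical and do not come from the paper.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one posterior distribution over tokens."""
    # Terms with zero probability contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_entropy(posteriors):
    """Average per-token entropy over an utterance.

    Assumption: `posteriors` is a list of probability vectors, one per
    decoded token, each summing to 1. Higher values mean the recogniser
    was less certain, which would suggest lower intelligibility.
    """
    if not posteriors:
        raise ValueError("need at least one token posterior")
    return sum(token_entropy(p) for p in posteriors) / len(posteriors)


if __name__ == "__main__":
    # Toy posteriors over a 3-token vocabulary (hypothetical values).
    confident = [[0.98, 0.01, 0.01], [0.01, 0.98, 0.01]]
    uncertain = [[0.40, 0.30, 0.30], [0.34, 0.33, 0.33]]
    print(f"confident utterance: {mean_entropy(confident):.3f} nats")
    print(f"uncertain utterance: {mean_entropy(uncertain):.3f} nats")
```

Because the measure needs only the model's own output distributions, it requires neither intelligibility labels nor a clean reference, which matches the unsupervised setting the abstract describes.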