The impact of exploiting spectro-temporal context in computational speech segregation
The experimental data from the study:
https://asa.scitation.org/doi/10.1121/1.5020273
Group 1 contains results, masks and audio from the models of the 16 GMM component segregation system
Group 2 contains results, masks and audio from the models of the 64 GMM component segregation system
There are three folders:
Audio:
The CLUE sentences that were used for the listener study
IBM = Ideal Binary Mask, UP = UnProcessed, EBM = Estimated Binary Mask.
The IBM and UP are stored in one of the configuration folders (Front-end), that is:
Audio\Group1\Front-end\icra_01_10sec_matched\UP
Audio\Group1\Front-end\icra_01_10sec_matched\IBM
Audio\Group1\Front-end\icra_01_10sec_matched\EBM
Results:
The computed metrics for groups 1 and 2, as well as Word Recognition Scores (WRSs) from the listener study
BinaryMasks:
A priori SNR masks, IBMs and EBMs from groups 1 and 2.
Developed with Matlab R2016a
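For readers unfamiliar with the mask types named above: an ideal binary mask is commonly obtained by comparing the a priori SNR of each time-frequency unit against a local criterion. The following is a minimal Python sketch of that thresholding step only, not the Matlab code shipped with the dataset. The function name, the 0 dB default criterion and the toy SNR values are assumptions made for illustration.

```python
def ideal_binary_mask(snr_db, lc_db=0.0):
    """Return 1 for units whose a priori SNR exceeds the local criterion, else 0.

    snr_db: list of frames, each a list of per-channel SNR values in dB.
    lc_db:  local criterion in dB (0 dB is an assumed default, not the study's value).
    """
    return [[1 if snr > lc_db else 0 for snr in frame] for frame in snr_db]


# Toy a priori SNR values (hypothetical): 2 frames x 3 frequency channels.
snr_db = [[5.0, -3.0, 0.5], [-10.0, 2.0, -0.1]]
mask = ideal_binary_mask(snr_db)
```

An estimated binary mask has the same shape, but its entries come from a trained classifier rather than from the true SNR.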
Prediction of speech intelligibility based on a correlation metric in the envelope power spectrum domain
Speech intelligibility prediction models are a powerful tool for investigating speech perception. Recently, a model termed the correlation-based speech-based envelope power spectrum model (sEPSMcorr) [1] was presented. It is based on the auditory processing of the multi-resolution speech-based Envelope Power Spectrum Model (mr-sEPSM) [2], combined with the correlation back-end of the Short-Time Objective Intelligibility measure (STOI) [3]. The sEPSMcorr can accurately predict normal-hearing (NH) data for a broad range of listening conditions, e.g., additive noise, phase jitter and ideal binary mask processing.
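The correlation back-end mentioned above compares envelopes of clean and processed speech. As a rough illustration of the correlation step alone, the Python sketch below computes a Pearson correlation between two short envelope segments. It is not the sEPSMcorr or STOI implementation: the filterbanks, segmentation and normalization stages are omitted, and the function name and sample values are assumptions.

```python
import math


def envelope_correlation(clean, processed):
    """Pearson correlation between two equal-length envelope segments."""
    n = len(clean)
    mean_c = sum(clean) / n
    mean_p = sum(processed) / n
    dev_c = [c - mean_c for c in clean]
    dev_p = [p - mean_p for p in processed]
    numerator = sum(a * b for a, b in zip(dev_c, dev_p))
    denominator = math.sqrt(sum(a * a for a in dev_c) * sum(b * b for b in dev_p))
    # A constant segment has no fluctuation to correlate; return 0.0 in that case.
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical envelope samples: the processed segment is a scaled copy of the clean one.
score = envelope_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

A value near 1 means the processed envelope follows the clean one closely, while values near 0 indicate that the envelope fluctuations have been lost or replaced.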
Signal-to-Noise-Ratio-Aware Dynamic Range Compression in Hearing Aids
Fast-acting dynamic range compression is a level-dependent amplification scheme which aims to restore audibility for hearing-impaired listeners. However, when applied to noisy speech at positive signal-to-noise ratios (SNRs), the gain function typically changes rapidly over time as it is driven by the short-term fluctuations of the speech signal. This leads to an amplification of the noise components in the speech gaps, which reduces the output SNR and distorts the acoustic properties of the background noise. An adaptive compression scheme is proposed here which utilizes information about the SNR in different frequency channels to adaptively change the characteristics of the compressor. Specifically, fast-acting compression is applied to speech-dominated time-frequency (T-F) units where the SNR is high, while slow-acting compression is used to effectively linearize the processing for noise-dominated T-F units where the SNR is low. A systematic evaluation of this SNR-aware compression scheme showed that the effective compression of speech components embedded in noise was similar to that of a conventional fast-acting system, whereas natural fluctuations in the background noise were preserved in a similar way as when a slow-acting compressor was applied.
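One way to picture the switching idea described above is a level estimator whose release time constant depends on the SNR of the current T-F unit. The Python sketch below shows this for a single frequency channel. It is an illustration under stated assumptions, not the published algorithm: the time constants, the 0 dB SNR threshold, the smoothing in the dB domain and all names are hypothetical.

```python
import math


def smoothing_coeff(time_ms, frame_ms):
    """One-pole smoothing coefficient for a given time constant."""
    return math.exp(-frame_ms / time_ms)


def track_level(levels_db, snrs_db, frame_ms=1.0, attack_ms=5.0,
                fast_release_ms=40.0, slow_release_ms=2000.0,
                snr_threshold_db=0.0):
    """Smooth a level track, releasing fast in high-SNR frames and slowly otherwise.

    All default values are assumed for illustration only.
    """
    estimate = levels_db[0]
    smoothed = []
    for level, snr in zip(levels_db, snrs_db):
        if level > estimate:
            tau = attack_ms            # rising level: attack
        elif snr > snr_threshold_db:
            tau = fast_release_ms      # speech-dominated unit: fast release
        else:
            tau = slow_release_ms      # noise-dominated unit: slow release
        a = smoothing_coeff(tau, frame_ms)
        estimate = a * estimate + (1.0 - a) * level
        smoothed.append(estimate)
    return smoothed


# Hypothetical input: the level drops from 60 dB to 40 dB and stays there for 50 frames.
levels = [60.0] + [40.0] * 50
fast_track = track_level(levels, [10.0] * 51)    # high SNR throughout
slow_track = track_level(levels, [-10.0] * 51)   # low SNR throughout
```

With the slow release, the level estimate (and hence the gain derived from it) barely moves during the 50 frames, which corresponds to near-linear processing of noise-dominated units. With the fast release, the estimate follows the drop closely.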
Effects of slow- and fast-acting compression on hearing impaired listeners’ consonant-vowel identification in interrupted noise
There is conflicting evidence about the relative benefit of slow- and fast-acting compression for speech intelligibility. It has been hypothesized that fast-acting compression improves audibility at low signal-to-noise ratios (SNRs) but may distort the speech envelope at higher SNRs. The present study investigated the effects of compression with a nearly instantaneous attack time but either fast (10 ms) or slow (500 ms) release times on consonant identification in hearing-impaired listeners. Consonant–vowel speech tokens were presented at a range of presentation levels in two conditions: in the presence of interrupted noise and in quiet (with the compressor “shadow-controlled” by the corresponding mixture of speech and noise). These conditions were chosen to disentangle the effects of consonant audibility and noise-induced forward masking on speech intelligibility. A small but systematic intelligibility benefit of fast-acting compression was found in both the quiet and the noisy conditions for the lower speech levels. No detrimental effects of fast-acting compression were observed when the speech level exceeded the level of the noise. These findings suggest that fast-acting compression provides an audibility benefit in fluctuating interferers when compared with slow-acting compression while not substantially affecting the perception of consonants at higher SNRs.