An evaluation of intrusive instrumental intelligibility metrics
Instrumental intelligibility metrics are commonly used as an alternative to listening tests. This paper evaluates 12 monaural intrusive intelligibility metrics: SII, HEGP, CSII, HASPI, NCM, QSTI, STOI, ESTOI, MIKNN, SIMI, SIIB, and …. In addition, this paper investigates the ability of intelligibility metrics to generalize to new types of distortions and analyzes why the top-performing metrics perform well. The intelligibility data were obtained from 11 listening tests described in the literature. The stimuli included Dutch, Danish, and English speech distorted by additive noise, reverberation, competing talkers, pre-processing enhancement, and post-processing enhancement. SIIB and HASPI had the highest performance, achieving average correlations with listening-test scores of … and …, respectively. The high performance of SIIB may, in part, be the result of SIIB's developers having had access to all the intelligibility data considered in the evaluation. The results show that intelligibility metrics tend to perform poorly on data sets that were not used during their development. By modifying the original implementations of SIIB and STOI, the advantage of reducing statistical dependencies between input features is demonstrated. Additionally, the paper presents a new version of SIIB called …, which has performance similar to SIIB and HASPI but takes two orders of magnitude less time to compute.
Comment: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 201
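Evaluations of this kind typically report the correlation between a metric's outputs and listening-test scores across a set of conditions. As a minimal sketch of that step (not the paper's full procedure, and with made-up example numbers), Pearson's correlation coefficient can be computed as:

```python
import numpy as np

def pearson_correlation(metric_scores, listening_scores):
    """Pearson correlation between metric outputs and listening-test scores."""
    x = np.asarray(metric_scores, dtype=float)
    y = np.asarray(listening_scores, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    return float(np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)))

# Hypothetical example: metric predictions vs. word-recognition rates
metric = [0.2, 0.35, 0.5, 0.7, 0.9]
listeners = [0.15, 0.40, 0.55, 0.65, 0.95]
print(pearson_correlation(metric, listeners))
```

In practice such evaluations often also fit a monotonic mapping (e.g. a logistic function) from metric values to intelligibility scores before correlating, which this sketch omits.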
Sensorineural hearing loss enhances auditory sensitivity and temporal integration for amplitude modulation.
Amplitude-modulation detection thresholds (AMDTs) were measured at 40 dB sensation level for listeners with mild-to-moderate sensorineural hearing loss (age: 50-64 yr) for a carrier frequency of 500 Hz and rates of 2 and 20 Hz. The number of modulation cycles, N, varied between two and nine. The data were compared with AMDTs measured for young and older normal-hearing listeners [Wallaert, Moore, and Lorenzi (2016). J. Acoust. Soc. Am. 139, 3088-3096]. As for normal-hearing listeners, AMDTs were lower for the 2-Hz than for the 20-Hz rate, and AMDTs decreased with increasing N. AMDTs were lower for hearing-impaired listeners than for normal-hearing listeners, and the effect of increasing N was greater for hearing-impaired listeners. A computational model based on the modulation-filterbank concept and a template-matching decision strategy was developed to account for the data. The psychophysical and simulation data suggest that the loss of amplitude compression in the impaired cochlea is mainly responsible for the enhanced sensitivity and temporal integration of temporal-envelope cues found for hearing-impaired listeners. The data also suggest that, for AM detection, cochlear damage is associated with increased internal noise, but preserved short-term memory and decision mechanisms.

N.W. was supported by a grant from Neurelec Oticon Medical. C.L. was supported by two grants from ANR (HEARFIN and HEART projects). S.D.E. was supported by Deutsche Forschungsgemeinschaft (DFG) FOR 1732 (TPE). B.C.J.M. was supported by the EPSRC (UK, grant RG78536). This work was also supported by ANR-11-0001-02 PSL* and ANR-10-LABX-0087. We thank Nihaad Paraouty and two anonymous reviewers for helpful comments and suggestions relating to this study.
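The stimuli in such AM-detection experiments are sinusoidally amplitude-modulated tones whose duration is tied to the number of modulation cycles N, so a 2-Hz two-cycle stimulus is ten times longer than a 20-Hz one. A minimal sketch of stimulus generation, with assumed parameter values (modulation depth, sample rate) that are not taken from the study:

```python
import numpy as np

def am_tone(fc=500.0, fm=2.0, m=0.5, n_cycles=2, fs=16000):
    """Sinusoidally amplitude-modulated tone with n_cycles modulation cycles.

    fc: carrier frequency (Hz), fm: modulation rate (Hz),
    m: modulation depth (0..1, assumed value), fs: sample rate (Hz, assumed).
    """
    duration = n_cycles / fm                       # stimulus length follows the AM rate
    t = np.arange(int(round(duration * fs))) / fs
    envelope = 1.0 + m * np.sin(2 * np.pi * fm * t)
    return envelope * np.sin(2 * np.pi * fc * t)

# A 2-Hz, two-cycle stimulus lasts 1 s; at 20 Hz the same N gives only 0.1 s.
x = am_tone(fm=2.0, n_cycles=2)
print(len(x))   # 16000 samples at fs = 16 kHz
```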
Learning static spectral weightings for speech intelligibility enhancement in noise
Near-end speech enhancement works by modifying speech prior to presentation in a noisy environment, typically operating under a constraint of limited or no increase in speech level. One issue is the extent to which near-end enhancement techniques require detailed estimates of the masking environment to function effectively. The current study investigated speech modification strategies based on reallocating energy statically across the spectrum using masker-specific spectral weightings. Weighting patterns were learned offline by maximising a glimpse-based objective intelligibility metric. Keyword scores in sentences in the presence of stationary and fluctuating maskers increased, in some cases by very substantial amounts, following the application of masker- and SNR-specific spectral weighting. A second experiment using generic masker-independent spectral weightings that boosted all frequencies above 1 kHz also led to significant gains in most conditions. These findings indicate that energy-neutral spectral weighting is a highly effective near-end speech enhancement approach that places minimal demands on detailed masker estimation.
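The generic weighting in the second experiment boosts all frequencies above 1 kHz while leaving the overall speech level unchanged. A rough sketch of such an energy-neutral static weighting, assuming an FFT-based implementation and a boost amount that are illustrative rather than the authors' method:

```python
import numpy as np

def apply_spectral_weighting(x, fs=16000, boost_db=6.0, cutoff_hz=1000.0):
    """Energy-neutral static spectral weighting (sketch).

    Boosts all frequencies at or above cutoff_hz by boost_db, then rescales
    the result so total signal energy is unchanged. The 6 dB boost and the
    simple FFT filtering are assumptions for illustration only.
    """
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    gains = np.where(freqs >= cutoff_hz, 10 ** (boost_db / 20.0), 1.0)
    y = np.fft.irfft(spectrum * gains, n=len(x))
    # Renormalise: no net increase in speech level (energy-neutral constraint)
    y *= np.sqrt(np.sum(x ** 2) / np.sum(y ** 2))
    return y
```

A learned masker-specific weighting would replace the binary high-frequency gain with a per-band pattern optimised against a glimpse-based metric, but the energy-renormalisation step is the same.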
Predicting binaural speech intelligibility from signals estimated by a blind source separation algorithm
State-of-the-art binaural objective intelligibility measures (OIMs) require the individual source signals to make intelligibility predictions, limiting their usability in real-time online operation. This limitation may be addressed by a blind source separation (BSS) process, which is able to extract the underlying sources from a mixture. In this study, a speech source is presented with either a stationary noise masker or a fluctuating noise masker whose azimuth varies in a horizontal plane, at two speech-to-noise ratios (SNRs). Three binaural OIMs are used to predict speech intelligibility from the signals separated by a BSS algorithm. The model predictions are compared with listeners' word identification rates in a perceptual listening experiment. The results suggest that, with SNR compensation applied to the BSS-separated speech signal, the OIMs can maintain their predictive power for individual maskers compared to their performance measured on the direct signals. The results also reveal that errors in the SNR of the estimated signals are not the only factor that decreases the predictive accuracy of the OIMs with the separated signals; artefacts or distortions introduced into the estimated signals by the BSS algorithm may also be a concern.
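The SNR compensation step can be pictured as rescaling the separated speech so that its level relative to the separated masker matches a target SNR. A minimal sketch, where the function name and interface are illustrative and not the study's implementation:

```python
import numpy as np

def snr_compensate(speech_est, noise_est, target_snr_db):
    """Scale BSS-separated speech so its estimated speech-to-noise ratio
    matches target_snr_db (illustrative sketch, not the study's code)."""
    ps = np.mean(speech_est ** 2)          # power of separated speech
    pn = np.mean(noise_est ** 2)           # power of separated masker
    current_snr_db = 10.0 * np.log10(ps / pn)
    gain_db = target_snr_db - current_snr_db
    return speech_est * 10 ** (gain_db / 20.0)
```

Note that this only corrects the level error; any artefacts or distortions the BSS algorithm introduces into the separated waveforms remain, which is consistent with the residual prediction errors reported above.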
Auditory sensory saliency as a better predictor of change than sound amplitude in pleasantness assessment of reproduced urban soundscapes
The sonic environment of the urban public space is often experienced while walking through it. Nevertheless, city dwellers are usually not actively listening to the environment when traversing the city. Therefore, sound events that are salient, i.e., stand out from the sonic environment, are the ones that trigger attention and contribute strongly to the perception of the soundscape. In a previously reported audiovisual perception experiment, the pleasantness of a recorded urban sound walk was continuously evaluated by a group of participants. To detect salient events in the soundscape, a biologically-inspired computational model for auditory sensory saliency based on spectrotemporal modulations is proposed. Using the data from a sound walk, the present study validates the hypothesis that salient events detected by the model contribute to changes in soundscape rating and are therefore important when evaluating the urban soundscape. Finally, when using the data from an additional experiment without a strong visual component, the importance of auditory sensory saliency as a predictor for change in pleasantness assessment is found to be even more pronounced.
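As a loose illustration of detecting events that "stand out" of a sonic background, positive spectral flux gives a crude frame-by-frame novelty curve. This is only a simple stand-in: the study's model is biologically inspired and based on spectrotemporal modulations, not spectral flux.

```python
import numpy as np

def spectral_flux_saliency(x, frame=512, hop=256):
    """Crude novelty curve: frame-wise positive spectral flux.

    A stand-in for a saliency detector, NOT the spectrotemporal-modulation
    model used in the study. Peaks mark sudden spectral change (onsets).
    """
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    mags = np.stack([
        np.abs(np.fft.rfft(window * x[i * hop:i * hop + frame]))
        for i in range(n_frames)
    ])
    diff = np.diff(mags, axis=0)                     # frame-to-frame change
    return np.sum(np.maximum(diff, 0.0), axis=1)     # keep increases only
```

A genuine sensory-saliency model would instead filter the spectrogram with banks of spectrotemporal modulation filters and pool their energies, so that slowly varying backgrounds are suppressed and abrupt, structured events dominate.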