3,171 research outputs found

The Lombard intelligibility benefit of native and non-native speech for native and non-native listeners

Authors: Cooke M., Ernestus M., Marcoux K., Tucker B.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2022

Speech produced in noise (Lombard speech) is more intelligible than speech produced in quiet (plain speech). Previous research on the Lombard intelligibility benefit focused almost entirely on how native speakers produce and perceive Lombard speech. In this study, we investigate the size of the Lombard intelligibility benefit of both native (American-English) and non-native (native Dutch) English for native and non-native listeners (Dutch and Spanish). We used a glimpsing metric to measure the energetic masking potential of speech, which predicted that both native and non-native Lombard speech could withstand greater amounts of masking to a similar extent, compared to plain speech. In an intelligibility experiment, native English, Spanish, and Dutch listeners listened to the same words, mixed with noise. While the non-native listeners appeared to benefit more from Lombard speech than the native listeners did, each listener group experienced a similar benefit for native and non-native Lombard speech. Energetic masking, as captured by the glimpsing metric, only accounted for part of the Lombard benefit, indicating that the Lombard intelligibility benefit does not only result from a shift in spectral distribution. Despite subtle native language influences on non-native Lombard speech, both native and non-native speech provide a Lombard benefit.
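
As a rough illustration of the idea behind a glimpsing metric, the sketch below counts the proportion of time-frequency cells in which speech energy exceeds masker energy by a local criterion. The function name, the 3 dB criterion, and the toy energy grids are assumptions made for illustration; the abstract does not specify the metric's details, and a real implementation would typically work on an auditory-model representation, not hand-written grids.

```python
def glimpse_proportion(speech_db, noise_db, criterion_db=3.0):
    """Fraction of time-frequency cells where speech exceeds noise by more than criterion_db.

    speech_db and noise_db are equally shaped lists of rows (one row per time
    frame, one value per frequency band), holding energies in dB.
    """
    total = 0
    glimpsed = 0
    for speech_row, noise_row in zip(speech_db, noise_db):
        for speech_level, noise_level in zip(speech_row, noise_row):
            total += 1
            if speech_level - noise_level > criterion_db:
                glimpsed += 1
    if total == 0:
        raise ValueError("need at least one time-frequency cell")
    return glimpsed / total


# Hypothetical energies (dB), invented purely for illustration.
plain = [[60, 55, 50], [58, 52, 49]]
lombard = [[66, 61, 58], [64, 60, 57]]
noise = [[59, 56, 54], [59, 56, 54]]

print(glimpse_proportion(plain, noise))    # proportion of glimpsed cells, plain speech
print(glimpse_proportion(lombard, noise))  # proportion of glimpsed cells, Lombard speech
```

A higher proportion means more of the speech escapes energetic masking, which is the sense in which such a metric can predict that one speech style withstands more noise than another.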

Radboud Repository

MPG.PuRe

Enhanced amplitude modulations contribute to the Lombard intelligibility benefit: Evidence from the Nijmegen Corpus of Lombard Speech

Authors: Bosker H., Cooke M.
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 03/02/2020

Speakers adjust their voice when talking in noise, which is known as Lombard speech. These acoustic adjustments facilitate speech comprehension in noise relative to plain speech (i.e., speech produced in quiet). However, exactly which characteristics of Lombard speech drive this intelligibility benefit in noise remains unclear. This study assessed the contribution of enhanced amplitude modulations to the Lombard speech intelligibility benefit by demonstrating that (1) native speakers of Dutch in the Nijmegen Corpus of Lombard Speech (NiCLS) produce more pronounced amplitude modulations in noise vs. in quiet; (2) more enhanced amplitude modulations correlate positively with intelligibility in a speech-in-noise perception experiment; (3) transplanting the amplitude modulations from Lombard speech onto plain speech leads to an intelligibility improvement, suggesting that enhanced amplitude modulations in Lombard speech contribute towards intelligibility in noise. Results are discussed in light of recent neurobiological models of speech perception with reference to neural oscillators phase-locking to the amplitude modulations in speech, guiding the processing of speech.

MPG.PuRe

No evidence for a benefit from masker harmonicity in the perception of speech in noise

Authors: Rosen S., Steinmetzger K.
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/02/2023

When assessing the intelligibility of speech embedded in background noise, maskers with a harmonic spectral structure have been found to be much less detrimental to performance than noise-based interferers. While spectral "glimpsing" in between the resolved masker harmonics and reduced envelope modulations of harmonic maskers have been shown to contribute, this effect has primarily been attributed to the proposed ability of the auditory system to cancel harmonic maskers from the signal mixture. Here, speech intelligibility in the presence of harmonic and inharmonic maskers with similar spectral glimpsing opportunities and envelope modulation spectra was assessed to test the theory of harmonic cancellation. Speech reception thresholds obtained from normal-hearing listeners revealed no effect of masker harmonicity, neither for maskers with static nor dynamic pitch contours. The results show that harmonicity, or time-domain periodicity, as such, does not aid the segregation of speech and masker. Contrary to what might be assumed, this also implies that the saliency of the masker pitch did not affect auditory grouping. Instead, the current data suggest that the reduced masking effectiveness of harmonic sounds is due to the regular spacing of their spectral components.

UCL Discovery

Using an intelligibility measure to create noise robust cepstral coefficients for HMM-based speech synthesis

Authors: King S., Valentini-Botinhao C., Yamagishi J.
Publication date: 01/05/2012

Edinburgh Research Explorer

Factors affecting the development of speech recognition in steady and modulated noise

Authors: Emily Buss, John H. Grose, Joseph W. Hall
Publication date: 01/01/2016

This study used a checkerboard-masking paradigm to investigate the development of the speech reception threshold (SRT) for monosyllabic words in synchronously and asynchronously modulated noise. In asynchronous modulation, masker frequencies below 1300 Hz were gated off when frequencies above 1300 Hz were gated on, and vice versa. The goals of the study were to examine development of the ability to use asynchronous spectro-temporal cues for speech recognition and to assess factors related to speech frequency region and audible speech bandwidth. A speech-shaped noise masker was steady or was modulated synchronously or asynchronously across frequency. Target words were presented to 5- to 7-year-old children or to adults. Overall, children showed higher SRTs and smaller masking release than adults. Consideration of the present results along with previous findings supports the idea that children can have particularly poor masked SRTs when the speech and masker spectra differ substantially, and that this may arise due to children requiring a wider speech bandwidth than adults for speech recognition. The results were also consistent with the idea that children are relatively poor in integrating speech cues when the frequency regions with the best signal-to-noise ratios vary across frequency as a function of time.
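
The gating scheme described in this abstract can be sketched as follows. This is only an illustration: the square-wave gating, the 10 Hz rate, and the function name are assumptions, since the abstract gives only the 1300 Hz split between the two masker bands.

```python
def masker_gates(t_seconds, rate_hz=10.0, asynchronous=True):
    """Return (low_on, high_on): whether the masker band below 1300 Hz and the
    band above 1300 Hz are gated on at time t_seconds.

    Assumes square-wave gating with a 50% duty cycle (an illustrative choice).
    """
    phase = (t_seconds * rate_hz) % 1.0  # position within one modulation cycle
    low_on = phase < 0.5
    # Asynchronous ("checkerboard"): the high band is on exactly when the low band is off.
    # Synchronous: both bands are gated together.
    high_on = (not low_on) if asynchronous else low_on
    return low_on, high_on


for t in (0.0, 0.075):
    print(t, masker_gates(t, asynchronous=True), masker_gates(t, asynchronous=False))
```

In the asynchronous case one band is always quiet, so the low-noise region alternates between frequency ranges over time. That is the kind of cue the abstract suggests children integrate less well than adults.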

Carolina Digital Repository

Effect of audibility on spatial release from speech-on-speech masking

Authors: Harvey Dillon, Helen Glyde, Jörg M. Buchholz, Lillian Nielsen, Louise Hickson, Sharon Cameron, Virginia Best
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2015

This study investigated to what extent spatial release from masking (SRM) deficits in hearing-impaired adults may be related to reduced audibility of the test stimuli. Sixteen adults with sensorineural hearing loss and 28 adults with normal hearing were assessed on the Listening in Spatialized Noise–Sentences test, which measures SRM using a symmetric speech-on-speech masking task. Stimuli for the hearing-impaired listeners were delivered using three amplification levels (National Acoustic Laboratories - Revised Profound prescription (NAL-RP), NAL-RP +25%, and NAL-RP +50%), while stimuli for the normal-hearing group were filtered to achieve matched audibility. SRM increased as audibility increased for all participants. Thus, it is concluded that reduced audibility of stimuli may be a significant factor in hearing-impaired adults' reduced SRM even when hearing loss is compensated for with linear gain. However, the SRM achieved by the normal hearers with simulated audibility loss was still significantly greater than that achieved by hearing-impaired listeners, suggesting other factors besides audibility may still play a role.

Crossref

The University of Manchester - Institutional Repository

Macquarie University ResearchOnline

University of Queensland eSpace

A metric for predicting binaural speech intelligibility in stationary noise and competing speech maskers

Authors: Bruno M. Fazenda, Martin Cooke, Trevor J. Cox, Yan Tang
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 21/09/2016

One criterion in the design of binaural sound scenes in audio production is the extent to which the intended speech message is correctly understood. Object-based audio broadcasting systems have permitted sound editors to gain more access to the metadata (e.g., intensity and location) of each sound source, providing better control over speech intelligibility. The current study describes and evaluates a binaural distortion-weighted glimpse proportion metric, BiDWGP, which is motivated by better-ear glimpsing and binaural masking level differences. BiDWGP predicts intelligibility from two alternative input forms: either binaural recordings or monophonic recordings from each sound source along with their locations. Two listening experiments were performed with stationary noise and competing speech, one in the presence of a single masker, the other with multiple maskers, for a variety of spatial configurations. Overall, BiDWGP with both input forms predicts listener keyword scores with correlations of 0.95 and 0.91 for single- and multi-masker conditions, respectively. When considering masker type separately, correlations rise to 0.95 and above for both types of maskers. Predictions using the two input forms are very similar, suggesting that BiDWGP can be applied to the design of sound scenes where only individual sound sources and their locations are available.

University of Salford Institutional Repository

Crossref

Effects of Simulated and Profound Unilateral Sensorineural Hearing Loss on Recognition of Speech in Competing Speech

Authors: Agrawal, Agterberg, Asp, Berger, Berninger, Bernstein, Bronkhorst, Brungart, Buss, Cherry, Chiossoine-Kerdel, Corbin, Culling, Durlach, Dwyer, Edmonds, Festen, Firszt, Füllgrabe, Gallun, Gardner, Gatehouse, Glyde, Grothe, Hagerman, Hawley, Hygge, Ihlefeld, Jakien, Johansson, Kacelnik, Kidd, Kumpik, Marrone, Middlebrooks, Moller, Newman, Pavlovic, Peters, Plomp, Rothpletz, Schneider, Schoenmaker, Schooneveldt, Slattery, Srinivasan, Swaminathan, Tufts, Walton, Wightman, Yost
Publication venue: 'Ovid Technologies (Wolters Kluwer Health)'
Publication date: 01/01/2020

OBJECTIVES: Unilateral hearing loss (UHL) is a condition as common as bilateral hearing loss in adults. Because of the unilaterally reduced audibility associated with UHL, binaural processing of sounds may be disrupted. As a consequence, daily tasks such as listening to speech in a background of spatially distinct competing sounds may be challenging. A growing body of subjective and objective data suggests that spatial hearing is negatively affected by UHL. However, the type and degree of UHL vary considerably in previous studies. The aim here was to determine the effect of a profound sensorineural UHL, and of a simulated UHL, on recognition of speech in competing speech, and the binaural and monaural contributions to spatial release from masking, in a demanding multisource listening environment. DESIGN: Nine subjects (25 to 61 years) with profound sensorineural UHL [mean pure-tone average (PTA) across 0.5, 1, 2, and 4 kHz = 105 dB HL] and normal contralateral hearing (mean PTA = 7.2 dB HL) were included based on the criterion that the target and competing speech were inaudible in the ear with hearing loss. Thirteen subjects with normal hearing (19 to 60 years; mean left PTA = 4.1 dB HL; mean right PTA = 5.5 dB HL) contributed data in normal and simulated "mild-to-moderate" UHL conditions (PTA = 38.6 dB HL). The main outcome measure was the threshold for 40% correct speech recognition in colocated (0°) and spatially and symmetrically separated (±30° and ±150°) competing speech conditions. Spatial release from masking was quantified as the threshold difference between colocated and separated conditions. RESULTS: Thresholds in profound UHL were higher (worse) than normal hearing in separated and colocated conditions, and comparable to simulated UHL. Monaural spatial release from masking, that is, the spatial release achieved by subjects with profound UHL, was significantly different from zero and 49% of the magnitude of the spatial release from masking achieved by subjects with normal hearing. There were subjects with profound UHL who showed negative spatial release, whereas subjects with normal hearing consistently showed positive spatial release from masking in the normal condition. The simulated UHL had a larger effect on the speech recognition threshold for separated than for colocated conditions, resulting in decreased spatial release from masking. The difference in spatial release between normal-hearing and simulated UHL conditions increased with age. CONCLUSIONS: The results demonstrate that while recognition of speech in colocated and separated competing speech is impaired for profound sensorineural UHL, spatial release from masking may be possible when competing speech is symmetrically distributed around the listener. A "mild-to-moderate" simulated UHL decreases spatial release from masking compared with normal-hearing conditions and interacts with age, indicating that small amounts of residual hearing in the UHL ear may be more beneficial for separated than for colocated interferer conditions for young listeners.
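
The abstract's definition of spatial release from masking as a threshold difference can be written out as a small calculation. The function names and all threshold values below are hypothetical and are not taken from the study; the sketch only shows the arithmetic, assuming lower thresholds mean better performance.

```python
def spatial_release(colocated_threshold_db, separated_threshold_db):
    """Spatial release from masking: colocated minus separated threshold (dB).

    Positive values mean the listener did better (lower threshold) when the
    competing speech was spatially separated from the target.
    """
    return colocated_threshold_db - separated_threshold_db


def relative_release(release_db, reference_release_db):
    """Release expressed as a fraction of a reference group's release."""
    if reference_release_db == 0:
        raise ValueError("reference release must be non-zero")
    return release_db / reference_release_db


# Hypothetical thresholds in dB, invented for illustration only.
normal_srm = spatial_release(colocated_threshold_db=-2.0, separated_threshold_db=-12.0)
uhl_srm = spatial_release(colocated_threshold_db=0.0, separated_threshold_db=-4.0)

print(normal_srm, uhl_srm, relative_release(uhl_srm, normal_srm))
```

A negative result from `spatial_release` corresponds to the negative spatial release the abstract reports for some subjects with profound UHL, meaning their threshold was worse in the separated condition than in the colocated one.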

Crossref

Chalmers Research