
    Temporal Fine-Structure Coding and Lateralized Speech Perception in Normal-Hearing and Hearing-Impaired Listeners

    This study investigated the relationship between speech perception performance in spatially complex, lateralized listening scenarios and temporal fine-structure (TFS) coding at low frequencies. Young normal-hearing (NH) listeners and two groups of elderly hearing-impaired (HI) listeners with mild or moderate hearing loss above 1.5 kHz participated in the study. Speech reception thresholds (SRTs) were estimated in the presence of either speech-shaped noise, two-, four-, or eight-talker babble played reversed, or a nonreversed two-talker masker. Target audibility was ensured by applying individualized linear gains to the stimuli, which were presented over headphones. The target and masker streams were lateralized to the same or to opposite sides of the head by introducing 0.7-ms interaural time differences between the ears. TFS coding was assessed by measuring frequency discrimination thresholds and interaural phase difference thresholds at 250 Hz. NH listeners had clearly better SRTs than the HI listeners. However, when maskers were spatially separated from the target, the amount of SRT benefit due to binaural unmasking differed only slightly between the groups. Neither the frequency discrimination thresholds nor the interaural phase difference thresholds correlated with the SRTs or with the amount of masking release due to binaural unmasking. The results suggest that, although HI listeners with normal hearing thresholds below 1.5 kHz experienced difficulties with speech understanding in spatially complex environments, these limitations were unrelated to TFS coding abilities and were only weakly associated with a reduction in binaural-unmasking benefit for spatially separated competing sources.
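    As a rough illustration of the lateralization manipulation described above, the sketch below (not the authors' code; the sampling rate and function names are assumptions) delays one ear's copy of a signal by 0.7 ms:

```python
import numpy as np

def lateralize(signal, itd_ms=0.7, fs=44100):
    """Return a stereo (n, 2) array with the right channel delayed by itd_ms."""
    delay = int(round(itd_ms * 1e-3 * fs))             # ITD in whole samples
    left = np.concatenate([signal, np.zeros(delay)])
    right = np.concatenate([np.zeros(delay), signal])  # delayed copy; image shifts toward the leading ear
    return np.stack([left, right], axis=1)

# Example: lateralize a 250-Hz tone, the frequency used in the binaural tasks.
t = np.arange(0, 0.5, 1 / 44100)
stereo = lateralize(np.sin(2 * np.pi * 250 * t))
```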

    The role of envelope periodicity in the perception of masked speech with simulated and real cochlear implants

    In normal hearing, complex tones with pitch-related periodic envelope modulations are far less effective maskers of speech than aperiodic noise. Here, it is shown that this masker-periodicity benefit is diminished in noise-vocoder simulations of cochlear implants (CIs) and further reduced with real CIs. Nevertheless, both listener groups still benefitted significantly from masker periodicity, despite the lack of salient spectral pitch cues. The main reason for the smaller effect observed in CI users is thought to be an even stronger channel interaction than in the CI simulations, which smears out the random envelope modulations that are characteristic of aperiodic sounds. In contrast, neither interferers that were amplitude-modulated at a rate of 10 Hz nor maskers with envelopes specifically designed to reveal the target speech enabled a masking release in CI users. Hence, even at the high signal-to-noise ratios at which they were tested, CI users can still exploit pitch cues transmitted by the temporal envelope of a non-speech masker, whereas slow amplitude modulations of the masker envelope are no longer helpful.
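    For readers unfamiliar with noise vocoding, the following minimal sketch shows the general technique referred to above; the band count, filter order, and cutoff frequencies are illustrative assumptions, not the parameters of this study:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_bands=8, f_lo=100.0, f_hi=8000.0):
    """Replace each band's fine structure with noise, keeping its envelope."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)  # log-spaced edges; fs must exceed 2 * f_hi
    noise = np.random.randn(len(x))
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))           # temporal envelope of this band
        out += env * sosfiltfilt(sos, noise)  # impose envelope on a noise carrier
    return out / (np.max(np.abs(out)) + 1e-12)
```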

    The role of acoustic periodicity in perceiving speech

    This thesis investigated the role of one important acoustic feature, periodicity, in the perception of speech. In the context of this thesis, periodicity denotes that a speech sound is voiced, giving rise to a sonorous sound quality sharply opposed to that of noisy unvoiced sounds. In a series of behavioural and electroencephalography (EEG) experiments, it was tested how the presence and absence of periodicity in both target speech and background noise affects the ability to understand speech, and its cortical representation. Firstly, in quiet listening conditions, speech with a natural amount of periodicity and completely aperiodic speech were equally intelligible, while completely periodic speech was much harder to understand. In the presence of a masker, however, periodicity in the target speech mattered little. In contrast, listeners benefitted substantially from periodicity in the masker, and this so-called masker-periodicity benefit (MPB) was about twice as large as the fluctuating-masker benefit (FMB) obtained from masker amplitude modulations. Next, cortical EEG responses to the same three target speech conditions were recorded. In an attempt to isolate effects of periodicity and intelligibility, the trials were sorted according to the correctness of the listeners' spoken responses. More periodicity rendered the event-related potentials more negative during the first second after sentence onset, while a slow negativity was observed when the sentences were more intelligible. Additionally, EEG alpha power (7–10 Hz) was markedly increased before the least intelligible sentences. This finding is taken to indicate that the listeners had not been fully focussed on the task before these trials. The same EEG data were also analysed in the frequency domain, which revealed a distinct response pattern, with more theta power (5–6.3 Hz) and a trend for less beta power (11–18 Hz) in the fully periodic condition, but again no differences between the other two conditions. This pattern may indicate that the subjects internally rehearsed the sentences in the periodic condition before they verbally responded. Crucially, EEG power in the delta range (1.7–2.7 Hz) was substantially increased during the second half of intelligible sentences, when compared to their unintelligible counterparts. Lastly, effects of periodicity in the perception of speech in noise were examined in simulations of cochlear implants (CIs). Although both benefits were substantially reduced, the MPB was still about twice as large as the FMB, highlighting the robustness of periodicity cues, even with the limited access to spectral information provided by simulated CIs. On the other hand, the larger absolute reduction of the MPB compared to normal hearing also suggests that the inability to exploit periodicity cues may be an even more important factor in explaining the poor performance of CI users than the inability to benefit from masker fluctuations.
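    The band-power analysis mentioned here can be illustrated with a short sketch using the delta, theta, alpha, and beta ranges quoted above; the 2-s windowing and single-channel layout are assumptions, not the thesis's actual pipeline:

```python
import numpy as np
from scipy.signal import welch

# Frequency ranges taken from the abstract above.
BANDS = {"delta": (1.7, 2.7), "theta": (5.0, 6.3),
         "alpha": (7.0, 10.0), "beta": (11.0, 18.0)}

def band_power(epoch, fs):
    """Mean power spectral density per band for one EEG channel epoch."""
    freqs, psd = welch(epoch, fs=fs, nperseg=int(2 * fs))  # 2-s windows (assumed)
    return {name: psd[(freqs >= lo) & (freqs <= hi)].mean()
            for name, (lo, hi) in BANDS.items()}
```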

    The musician effect: does it persist under degraded pitch conditions of cochlear implant simulations?

    Cochlear implants (CIs) are auditory prostheses that restore hearing via electrical stimulation of the auditory nerve. Compared to normal acoustic hearing, sounds transmitted through the CI are spectro-temporally degraded, causing difficulties in challenging listening tasks such as speech intelligibility in noise and perception of music. In normal hearing (NH), musicians have been shown to perform better than non-musicians in auditory processing and perception, especially for challenging listening tasks. This "musician effect" was attributed to better processing of pitch cues, as well as better overall auditory cognitive functioning in musicians. Does the musician effect persist when pitch cues are degraded, as they would be in signals transmitted through a CI? To answer this question, NH musicians and non-musicians were tested while listening to unprocessed signals or to signals processed by an acoustic CI simulation. The tasks depended increasingly on pitch perception: (1) speech intelligibility (words and sentences) in quiet or in noise, (2) vocal emotion identification, and (3) melodic contour identification (MCI). For speech perception, there was no musician effect with the unprocessed stimuli, and a small musician effect only for word identification in one noise condition in the CI simulation. For emotion identification, there was a small musician effect in both processing conditions; for MCI, there was a large musician effect in both. Overall, the effect grew stronger as the importance of pitch in the listening task increased. This suggests that the musician effect may be rooted more in pitch perception than in a global advantage in cognitive processing (in which case musicians would have performed better in all tasks). The results further suggest that musical training before (and possibly after) implantation might offer some advantage in pitch processing that could partially benefit speech perception, and more strongly benefit emotion and music perception.
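    A melodic contour identification stimulus can be sketched as a short sequence of tones whose pitch rises, falls, or stays flat; the note count, semitone step, and durations below are hypothetical choices, not those of the study:

```python
import numpy as np

def contour(shape="rising", f0=220.0, step_semitones=2, fs=44100, note_dur=0.25):
    """Five-tone melodic contour: pitch rises, falls, or stays flat."""
    steps = {"rising": [0, 1, 2, 3, 4],
             "falling": [4, 3, 2, 1, 0],
             "flat": [0, 0, 0, 0, 0]}[shape]
    t = np.arange(int(fs * note_dur)) / fs
    notes = [np.sin(2 * np.pi * f0 * 2 ** (s * step_semitones / 12) * t)
             for s in steps]
    return np.concatenate(notes)
```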

    Listening to speech in a background of other talkers: effects of talker number and noise vocoding

    Some of the most common interfering background sounds a listener experiences are the sounds of other talkers. In Experiment 1, recognition of natural Institute of Electrical and Electronics Engineers (IEEE) sentences was measured in normal-hearing adults at two fixed signal-to-noise ratios (SNRs) in 16 backgrounds with the same long-term spectrum: unprocessed speech babble (1, 2, 4, 8, and 16 talkers), noise-vocoded versions of the babbles (12 channels), noise modulated with the wide-band envelope of the speech babbles, and unmodulated noise. All talkers were adult males. For a given number of talkers, natural speech was always the most effective masker. The greatest changes in performance occurred as the number of talkers in the maskers increased from 1 to 2 or 4, with small changes thereafter. In Experiment 2, the same targets and maskers (1, 2, and 16 talkers) were used to measure speech reception thresholds (SRTs) adaptively. Periodicity in the target was also manipulated by noise-vocoding, which led to considerably higher SRTs. The greatest masking effect always occurred for the masker type most similar to the target, while the effects of the number of talkers were generally small. Implications are drawn with reference to glimpsing, informational vs. energetic masking, overall SNR, and aspects of periodicity.
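    The "noise modulated with the wide-band envelope of the speech babbles" masker can be sketched as follows; the envelope smoothing cutoff is an assumption, and shaping the noise to match the long-term spectrum is omitted for brevity:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def envelope_modulated_noise(babble, fs, env_cutoff=30.0):
    """Impose the babble's wide-band envelope on random noise."""
    env = np.abs(hilbert(babble))                 # wide-band Hilbert envelope
    sos = butter(2, env_cutoff, btype="lowpass", fs=fs, output="sos")
    env = np.maximum(sosfiltfilt(sos, env), 0.0)  # smooth and keep non-negative
    return env * np.random.randn(len(babble))     # modulation without speech content
```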

    Speech-in-speech perception, non-verbal selective attention, and musical training

    Speech is more difficult to understand when it is presented concurrently with a distractor speech stream. One source of this difficulty is that competing speech can act as an attentional lure, requiring listeners to exert attentional control to ensure that attention does not drift away from the target. Stronger attentional control may enable listeners to more successfully ignore distracting speech, and so individual differences in selective attention may be one factor driving the ability to perceive speech in complex environments. However, the lack of a paradigm for measuring non-verbal sustained selective attention to sound has made this hypothesis difficult to test. Here we find that individuals who are better able to attend to a stream of tones and respond to occasional repeated sequences while ignoring a distractor tone stream are also better able to perceive speech masked by a single distractor talker. We also find that participants who have undergone more musical training show better performance on both verbal and non-verbal selective attention tasks, and that this musician advantage is greater in older participants. This suggests that one source of a potential musician advantage for speech perception in complex environments may be experience or skill in directing and maintaining attention to a single auditory object.
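    The non-verbal attention paradigm described above can be sketched as a tone stream containing an occasional, immediately repeated sequence; all stimulus parameters below are hypothetical, not the study's:

```python
import numpy as np

rng = np.random.default_rng(0)

def tone_stream(n_tones=20, embed_repeat=True, fs=44100, tone_dur=0.1):
    """Random tone sequence; optionally embed an immediately repeated 4-tone run."""
    freqs = rng.choice([400.0, 500.0, 630.0, 800.0, 1000.0], size=n_tones)
    if embed_repeat:
        i = rng.integers(0, n_tones - 8)
        freqs[i + 4:i + 8] = freqs[i:i + 4]  # the repetition listeners must detect
    t = np.arange(int(fs * tone_dur)) / fs
    return np.concatenate([np.sin(2 * np.pi * f * t) for f in freqs])
```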

    Effect of Prolonged Non-Traumatic Noise Exposure on Unvoiced Speech Recognition

    Animal models in the past decade have shown that noise exposure may affect temporal envelope processing at supra-threshold levels while the absolute hearing threshold remains in the normal range. However, human studies have failed to find such effects consistently, owing to poor control of the participants' noise exposure history and limited measurement sensitivity. The current study operationally defined non-traumatic noise exposure (NTNE) as noise exposure at dental schools, because of its distinctive high-pass spectral feature, non-traumatic nature, and systematic exposure schedule across dental students of different years. Temporal envelope processing was examined through recognition of unvoiced speech interrupted by noise or by silence. The results showed that people who had systematic exposure to dental noise performed more poorly on tasks of temporal envelope processing than unexposed people. The effect of high-frequency NTNE on temporal envelope processing was more robust inside than outside the spectral band of dental noise and was more obvious in conditions that required finer temporal resolution (e.g., faster noise modulation rates) than in those requiring coarser temporal resolution (e.g., slower noise modulation rates). Furthermore, there was a significant performance difference between the exposed and the unexposed groups on tasks of spectral envelope processing at low frequency, while the two groups performed similarly on near-threshold tasks. Additional analyses showed that factors such as age, years of musical training, non-dental noise exposure history, and peripheral auditory function could not explain the variance in performance on tasks of temporal or spectral envelope processing. The findings from the current study support the general assumptions from animal models of NTNE: that temporal and spectral envelope processing deficits related to NTNE likely arise at retro-cochlear sites, at supra-threshold levels, and could easily be overlooked by routine clinical audiologic screening.
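    The interrupted-speech manipulation used here can be sketched as periodic gating of the speech waveform, with the gaps either left silent or filled with noise; the 50% duty cycle, gating rate, and noise level below are assumptions:

```python
import numpy as np

def interrupt(speech, fs, rate_hz=4.0, filler="silence"):
    """Gate speech on and off at rate_hz; gaps stay silent or are filled with noise."""
    t = np.arange(len(speech)) / fs
    gate = (np.sin(2 * np.pi * rate_hz * t) >= 0).astype(float)  # 50%-duty square gate
    out = speech * gate
    if filler == "noise":
        out = out + np.random.randn(len(speech)) * speech.std() * (1.0 - gate)
    return out
```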

    Age Effects on Perceptual Organization of Speech in Realistic Environments

    Communication often occurs in environments where background sounds fluctuate and mask portions of the intended message. Listeners use envelope and periodicity cues to group together audible glimpses of speech and fill in missing information. When the background contains other talkers, listeners also use focused attention to select the appropriate target talker and ignore competing talkers. Whereas older adults are known to experience significantly more difficulty with these challenging tasks than younger adults, the sources of these difficulties remain unclear. In this project, three related experiments explored the effects of aging on several aspects of speech understanding in realistic listening environments. Experiments 1 and 2 determined the extent to which aging affects the benefit of envelope and periodicity cues for recognition of short glimpses of speech, phonemic restoration of missing speech segments, and/or segregation of glimpses from a competing talker. Experiment 3 investigated effects of age on the ability to focus attention on an expected voice in a two-talker environment. Twenty younger adults and 20 older adults with normal hearing participated in all three experiments and also completed a battery of cognitive measures to examine contributions from specific cognitive abilities to speech recognition. Keyword recognition and cognitive data were analyzed with an item-level logistic regression based on a generalized linear mixed model. Results indicated that older adults were poorer than younger adults at glimpsing short segments of speech but were able to use envelope and periodicity cues to facilitate phonemic restoration and speech segregation. Whereas older adults performed more poorly than younger adults overall, the groups did not differ in their ability to focus attention on an expected voice. Across all three experiments, older adults were poorer than younger adults at recognizing speech from a female talker, both in quiet and with a competing talker. Results of the cognitive tasks indicated that faster processing speed and better visual-linguistic closure were predictive of better speech understanding. Taken together, these results suggest that age-related declines in speech recognition may be partially explained by difficulty grouping short glimpses of speech into a coherent message, which may be particularly difficult for older adults when the talker is female.
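    An item-level logistic mixed model of keyword recognition can be sketched as below; the data frame, predictors, and the choice of statsmodels' Bayesian binomial mixed GLM are all assumptions (the original analysis may have used a different implementation):

```python
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

# Hypothetical item-level data: one row per keyword, scored correct/incorrect.
df = pd.DataFrame({
    "correct": [1, 0, 1, 1, 0, 1, 0, 1],
    "group":   ["young", "young", "old", "old"] * 2,
    "cond":    ["quiet", "masked"] * 4,
    "subject": ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"],
    "item":    ["i1", "i2", "i1", "i2", "i3", "i4", "i3", "i4"],
})

model = BinomialBayesMixedGLM.from_formula(
    "correct ~ group * cond",                              # fixed effects
    {"subject": "0 + C(subject)", "item": "0 + C(item)"},  # random intercepts
    df)
result = model.fit_vb()  # variational Bayes fit
print(result.summary())
```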

    Exploiting primitive grouping constraints for noise-robust automatic speech recognition: studies with simultaneous speech.

    Significant strides have been made in the field of automatic speech recognition over the past three decades. However, the systems are not robust; their performance degrades in the presence of even moderate amounts of noise. This thesis presents an approach to developing a speech recognition system that takes inspiration from the human approach to speech recognition.