
    Computational modelling of neural mechanisms underlying natural speech perception

    Humans are highly skilled at the analysis of complex auditory scenes. In particular, the human auditory system is remarkably robust to noise and can almost effortlessly isolate the voice of a specific talker from even the busiest of mixtures. The neural mechanisms underlying these remarkable properties, however, remain poorly understood, mainly because of the inherent complexity of speech signals and the intricate, multi-stage processing performed in the human auditory system. Understanding the neural mechanisms of speech perception is of interest for clinical practice, brain-computer interfacing and automatic speech processing systems. In this thesis, we developed computational models characterizing neural speech processing across different stages of the human auditory pathways. In particular, we studied the active role of slow cortical oscillations in speech-in-noise comprehension through a spiking neural network model for encoding spoken sentences. The neural dynamics of the model during noisy speech encoding reflected the speech comprehension of young, normal-hearing adults. The proposed theoretical model was validated by predicting the effects of non-invasive brain stimulation on speech comprehension in an experimental study involving a cohort of volunteers. Moreover, we developed a modelling framework for detecting the early, high-frequency neural response to uninterrupted speech in non-invasive neural recordings. We applied the method to investigate the top-down modulation of this response by the listener's selective attention and by the linguistic properties of different words in a spoken narrative. In both cases, the detected responses, of predominantly subcortical origin, were significantly modulated, supporting a functional role of feedback between higher and lower stages of the auditory pathways in speech perception. The proposed computational models shed light on some of the poorly understood neural mechanisms underlying speech perception, and the developed methods can be readily employed in future studies involving a range of experimental paradigms beyond those considered in this thesis.
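
    As a rough illustration of the detection idea described above (not the thesis's actual method), the sketch below cross-correlates the fundamental waveform of a speech stimulus with a simultaneously recorded EEG channel within a high-frequency band, looking for a correlation peak at a short, subcortically plausible latency. The band limits, lag range and function names are illustrative assumptions.

    import numpy as np
    from scipy.signal import butter, filtfilt

    def f0_response_sketch(f0_waveform, eeg, fs, band=(80.0, 300.0), max_lag_ms=20.0):
        """Cross-correlate the speech fundamental waveform with a scalp
        recording to locate an early, high-frequency response (a sketch,
        not the method used in the thesis)."""
        # Restrict both signals to the band where a subcortical response
        # at the fundamental frequency is expected (assumed 80-300 Hz).
        b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
        x = filtfilt(b, a, f0_waveform)
        y = filtfilt(b, a, eeg)

        # Normalized cross-correlation over short lags; a brainstem-like
        # response should peak at a latency of roughly 5-15 ms.
        max_lag = int(max_lag_ms * 1e-3 * fs)
        lags = np.arange(max_lag + 1)
        xc = np.array([np.corrcoef(x[:len(x) - l], y[l:])[0, 1] for l in lags])
        peak_ms = lags[np.argmax(np.abs(xc))] / fs * 1e3
        return lags / fs * 1e3, xc, peak_ms  # lags (ms), correlation, peak latency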

    Computational modeling of the auditory brainstem response to continuous speech.

    OBJECTIVE: The auditory brainstem response can be recorded non-invasively from scalp electrodes and serves as an important clinical measure of hearing function. We have recently shown how the brainstem response at the fundamental frequency of continuous, non-repetitive speech can be measured, and have used this measure to demonstrate that the response is modulated by selective attention. However, different parts of the speech signal, as well as several parts of the brainstem, contribute to this response. Here we employ a computational model of the brainstem to elucidate the influence of these different factors. APPROACH: We developed a computational model of the auditory brainstem by combining a model of the middle and inner ear with a model of globular bushy cells in the cochlear nuclei and with a phenomenological model of the inferior colliculus. We then employed the model to investigate the neural response to continuous speech at different stages in the brainstem, following the methodology we recently developed for detecting the brainstem response to running speech from scalp recordings, and compared the simulations with recordings from healthy volunteers. MAIN RESULTS: We found that the auditory-nerve fibers, the cochlear nuclei and the inferior colliculus all contributed to the speech-evoked brainstem response, although the dominant contribution came from the inferior colliculus. The delay of the response corresponded to that observed in experiments. We further found that a broad range of harmonics of the fundamental frequency, up to about 8 kHz, contributed to the brainstem response. The response declined with increasing fundamental frequency, although the signal-to-noise ratio was largely unaffected. SIGNIFICANCE: Our results suggest that the scalp-recorded brainstem response at the fundamental frequency of speech originates predominantly in the inferior colliculus. They further show that the response is shaped by a large number of higher harmonics of the fundamental frequency, reflecting highly nonlinear processing in the auditory periphery and illustrating the complexity of the response.
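
    The cascade described in the APPROACH section can be caricatured in a few lines. The sketch below chains a "cochlear" band-pass, a rectify-and-smooth "hair cell" stage, and an "IC-like" band-pass around the fundamental; all filter orders and cutoffs are illustrative assumptions rather than the paper's parameters, and a real implementation would use a full auditory-periphery model.

    import numpy as np
    from scipy.signal import butter, sosfiltfilt

    def toy_brainstem_cascade(speech, fs, f0_band=(80.0, 120.0)):
        """Reduced caricature of the modelled pathway: cochlea -> hair cell
        -> inferior colliculus. Parameters are illustrative assumptions."""
        # 'Cochlear' channel: keep harmonics up to ~8 kHz, the range the
        # abstract reports as contributing (assumes fs well above 16 kHz).
        sos = butter(4, [125.0 / (fs / 2), 8000.0 / (fs / 2)],
                     btype="band", output="sos")
        channel = sosfiltfilt(sos, speech)

        # 'Hair cell': half-wave rectification plus low-pass smoothing, a
        # standard phenomenological stand-in for inner-hair-cell transduction.
        rectified = np.maximum(channel, 0.0)
        sos_lp = butter(2, 1000.0 / (fs / 2), btype="low", output="sos")
        transduced = sosfiltfilt(sos_lp, rectified)

        # 'Inferior colliculus': band-pass around the fundamental frequency,
        # where the scalp-recorded response is measured.
        sos_f0 = butter(2, [f0_band[0] / (fs / 2), f0_band[1] / (fs / 2)],
                        btype="band", output="sos")
        return sosfiltfilt(sos_f0, transduced)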

    Dissociable mechanisms of concurrent speech identification in noise at cortical and subcortical levels.

    When two vowels with different fundamental frequencies (F0s) are presented concurrently, listeners often hear two voices producing different vowels on different pitches. Parsing of this simultaneous speech can also be affected by the signal-to-noise ratio (SNR) in the auditory scene. The extraction and interaction of F0 and SNR cues may occur at multiple levels of the auditory system. The major aims of this dissertation are to elucidate the neural mechanisms and time course of concurrent speech perception in clean and degraded listening conditions, and their behavioral correlates. In two complementary experiments, electrical brain activity (EEG) was recorded at cortical (EEG Study #1) and subcortical (FFR Study #2) levels while participants heard double-vowel stimuli whose F0s differed by zero or four semitones (STs), presented in either clean or noise-degraded (+5 dB SNR) conditions. Behaviorally, listeners were more accurate in identifying both vowels for larger F0 separations (i.e., 4 ST, with pitch cues), and this F0 benefit was more pronounced at more favorable SNRs. Time-frequency analysis of cortical EEG oscillations (i.e., brain rhythms) revealed a dynamic time course for concurrent speech processing that depended on both extrinsic (SNR) and intrinsic (pitch) acoustic factors. Early high-frequency activity reflected pre-perceptual encoding of acoustic features (~200 ms) and the quality (i.e., SNR) of the speech signal (~250-350 ms), whereas later-evolving low-frequency rhythms (~400-500 ms) reflected post-perceptual, cognitive operations that covaried with listening effort and task demands. Analysis of subcortical responses indicated that, while FFRs provided a high-fidelity representation of the double-vowel stimuli and of the spectro-temporal nonlinear properties of the peripheral auditory system, FFR activity largely reflected the neural encoding of stimulus features (exogenous coding) rather than perceptual outcomes, although timbre (F1) could predict identification speed in the noise conditions. Taken together, the results of this dissertation suggest that subcortical auditory processing mostly reflects exogenous (acoustic) feature encoding, in stark contrast to cortical activity, which reflects perceptual and cognitive aspects of concurrent speech perception. By studying multiple brain indices underlying an identical task, these studies provide a more comprehensive window into the hierarchy of brain mechanisms and the time course of concurrent speech processing.
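
    The stimulus manipulation is straightforward to reproduce in outline. A minimal sketch, assuming synthetic harmonic complexes as stand-ins for the actual vowel tokens: the second F0 is obtained from the semitone separation (f0 * 2^(ST/12)) and noise is scaled to the +5 dB SNR used in the study.

    import numpy as np

    def harmonic_complex(f0, fs, dur, n_harm=10):
        """Toy stand-in for a synthetic vowel: a harmonic complex at f0."""
        t = np.arange(int(dur * fs)) / fs
        return sum(np.sin(2 * np.pi * k * f0 * t) for k in range(1, n_harm + 1))

    def double_vowel(f0=100.0, semitones=4, fs=16000, dur=0.5, snr_db=5.0, seed=0):
        """Mix two 'vowels' whose F0s differ by a semitone interval, then
        add noise at the requested SNR (0 vs 4 ST and +5 dB in the study)."""
        rng = np.random.default_rng(seed)
        f0_b = f0 * 2 ** (semitones / 12.0)      # semitones -> frequency ratio
        mix = harmonic_complex(f0, fs, dur) + harmonic_complex(f0_b, fs, dur)
        noise = rng.standard_normal(len(mix))
        # Scale the noise so that 10*log10(P_signal / P_noise) = snr_db.
        p_sig, p_noise = np.mean(mix ** 2), np.mean(noise ** 2)
        noise *= np.sqrt(p_sig / (p_noise * 10 ** (snr_db / 10.0)))
        return mix + noise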

    Hearing It Again and Again: On-Line Subcortical Plasticity in Humans

    Background: Human brainstem activity is sensitive to local sound statistics, as reflected in an enhanced response in repetitive compared to pseudo-random stimulus conditions [1]. Here we probed the short-term time course of this enhancement using a paradigm that assessed how local sound statistics (i.e., repetition within a five-note melody) interact with more global statistics (i.e., repetition of the melody). Methodology/Principal Findings: To test the hypothesis that subcortical repetition enhancement builds over time, we recorded auditory brainstem responses in young adults to a five-note melody containing a repeated note, and monitored how the response changed over the course of 1.5 hours. By comparing response amplitudes over time, we found a robust time-dependent enhancement to the locally repeating note that was superimposed on a weaker enhancement of the globally repeating pattern. Conclusions/Significance: We provide the first demonstration of on-line subcortical plasticity in humans. This complements previous findings that experience-dependent subcortical plasticity can occur on a number of time scales, from life-long experience with music and language to short-term auditory training. Our results suggest that the incoming stimulus stream is constantly being monitored, even when the stimulus is physically invariant and attention is directed elsewhere, to augment the neural response to the most statistically salient features of the ongoing stimulus stream.
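
    One way to quantify such a build-up, sketched below under assumptions about the data layout: average the brainstem responses to a given note in consecutive blocks of trials and track the spectral amplitude at the note's fundamental across the session. A rising curve for the repeated note relative to the others would indicate time-dependent enhancement.

    import numpy as np

    def f0_amplitude(epoch, fs, f0):
        """Spectral amplitude at the note's fundamental via the FFT."""
        spec = np.abs(np.fft.rfft(epoch)) / len(epoch)
        freqs = np.fft.rfftfreq(len(epoch), 1 / fs)
        return spec[np.argmin(np.abs(freqs - f0))]

    def enhancement_time_course(epochs, fs, f0, block_size=500):
        """Track FFR amplitude across a session by averaging epochs in
        consecutive blocks (epochs: n_trials x n_samples, one note)."""
        n_blocks = len(epochs) // block_size
        course = []
        for b in range(n_blocks):
            avg = epochs[b * block_size:(b + 1) * block_size].mean(axis=0)
            course.append(f0_amplitude(avg, fs, f0))
        return np.array(course)  # amplitude per block; a rise = build-up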

    Timing predictability enhances regularity encoding in the human subcortical auditory pathway

    The encoding of temporal regularities is a critical property of the auditory system, as short-term neural representations of environmental statistics serve auditory object formation and the detection of potentially relevant novel stimuli. A putative neural mechanism underlying regularity encoding is repetition suppression, the reduction of neural activity to repeated stimulation. Although repetitive stimulation per se has been shown to reduce auditory neural activity at cortical and subcortical levels in animals and in the human cerebral cortex, other factors, such as timing, may influence the encoding of statistical regularities. This study set out to investigate whether temporal predictability in the ongoing auditory input modulates repetition suppression in subcortical stages of the auditory processing hierarchy. Human auditory frequency-following responses (FFR) were recorded to a repeating consonant-vowel stimulus (/wa/) delivered in temporally predictable and unpredictable conditions. FFR amplitude was attenuated by repetition independently of temporal predictability, yet we observed an accentuated suppression when the incoming stimulation was temporally predictable. These findings support the view that regularity encoding spans the auditory hierarchy and point to temporal predictability as a modulatory factor of regularity encoding in early stages of the auditory pathway.
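
    The two timing conditions reduce to the structure of the stimulus-onset sequence. A minimal sketch with illustrative interval values (not the study's timing parameters): a constant inter-onset interval for the predictable condition versus a uniformly jittered one for the unpredictable condition.

    import numpy as np

    def onset_times(n_stim=1000, ioi=0.3, jitter=0.1, predictable=True, seed=0):
        """Stimulus-onset sequences for the two timing conditions: an
        isochronous (temporally predictable) train versus a jittered one.
        The inter-onset interval and jitter range are illustrative values."""
        rng = np.random.default_rng(seed)
        if predictable:
            iois = np.full(n_stim, ioi)                              # constant interval
        else:
            iois = rng.uniform(ioi - jitter, ioi + jitter, n_stim)   # random interval
        return np.cumsum(iois)                    # onset time of each /wa/ token (s)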

    Investigating the Relationship between Subcortical and Cortical Auditory Processing

    The auditory system is highly integrative, with feedforward and feedback connections from the periphery to the cortex (and the stages in between). To understand how the different levels of the human auditory system interact, it is necessary to measure responses from multiple auditory levels simultaneously. A novel stimulus was paired with electroencephalography (EEG) in 29 young, normal-hearing participants (17-34 years) to examine interactions among stages of the auditory pathway. Temporal regularity was manipulated by continuously accelerating and decelerating the rate of a click-train stimulus (i.e., ~3.5 Hz frequency modulation of the click rate). Adaptation of the brainstem (cochlear nucleus and inferior colliculus) response latencies was observed simultaneously with cortical phase-locking and sustained low-frequency activity to the temporal regularity. However, no correlations were found between subcortical adaptation and cortical regularity responses, suggesting that these phenomena may be independent of one another.
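
    Such a stimulus can be sketched as a click train whose instantaneous rate is sinusoidally modulated at ~3.5 Hz: integrating the instantaneous rate and emitting a click at each integer crossing yields the accelerating and decelerating train. The base rate and modulation depth below are illustrative assumptions.

    import numpy as np

    def fm_click_train(dur=60.0, fs=16000, base_rate=40.0, fm=3.5, depth=0.5):
        """Click train whose instantaneous rate accelerates and decelerates
        sinusoidally (~3.5 Hz modulation of the click rate, as in the study).
        base_rate and depth are illustrative, not the study's parameters."""
        t = np.arange(int(dur * fs)) / fs
        # Instantaneous click rate, modulated around the base rate.
        rate = base_rate * (1.0 + depth * np.sin(2 * np.pi * fm * t))
        # Integrate the rate to get the cumulative expected click count;
        # emit a click each time the integral crosses an integer.
        phase = np.cumsum(rate) / fs
        clicks = np.zeros_like(t)
        clicks[np.searchsorted(phase, np.arange(1, int(phase[-1]) + 1))] = 1.0
        return clicks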

    Blast exposure in the military and its effects on sensory and cognitive auditory processing

    Blast-induced traumatic brain injury (TBI) and hearing loss are two of the most common forms of the "invisible wounds of war" resulting from the United States' Global War on Terror. Several published studies have confirmed recent reports from VA healthcare centers of blast-exposed Service Members complaining of auditory problems despite having hearing that is, for all intents and purposes, normal. Most common among these complaints are problems understanding speech in crowded and noisy situations. We hypothesized that problems with speech comprehension could be the result of either 1) damage to sensory areas in the auditory periphery or 2) blast-induced TBI to cortical networks associated with attention, memory, and other executive functions related to the processing of speech and linguistic information. In Chapter 1 of this thesis, we found that in a population of blast-exposed Veteran Service Members, problems with speech comprehension in noise were due to cognitive deficits likely related to their post-traumatic stress disorder (PTSD) diagnoses. Chapter 2 takes an expanded look at the topics of Chapter 1 with a more comprehensive battery of audiological, electrophysiological, and neuropsychological tests in active-duty Service Members with and without a history of blast exposure. Unlike in the veterans with PTSD, we found subclinical levels of peripheral auditory dysfunction, as well as evidence of compromised neural processing speed, in the blast-exposed group. These deficits were also consistent with poorer performance on a standardized speech-in-noise test and lower self-reported ratings on an abbreviated version of the Speech, Spatial, and Qualities of Hearing (SSQ) questionnaire (Gatehouse and Noble, 2004). In Chapter 3, we modeled outcomes from the SSQ survey using objective measures of hearing function related to audibility, distortion of the neural representation of sound, attention, age, and blast status. We found that, across all subjects, age and high-frequency hearing thresholds predicted survey outcomes related to everyday listening ability. Within non-blast controls, however, measures of attention could differentiate between good and exceptional listening ability. Results from blast-exposed subjects remained inconclusive. Collectively, these findings highlight the need for audiologists to take into account more than audiometric measures alone when diagnosing and treating hearing dysfunction in this unique and specialized patient population.
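
    The Chapter 3 analysis is, in outline, a regression of survey outcomes on objective predictors. A minimal sketch with placeholder data, assuming an ordinary-least-squares model; the variable names mirror the factors listed above, not the study's dataset or its actual statistical model.

    import numpy as np
    import statsmodels.api as sm

    # Hypothetical predictor matrix: one row per subject, with columns for
    # age, high-frequency hearing threshold, an attention score, and blast
    # status. All values here are random placeholders.
    rng = np.random.default_rng(0)
    n = 60
    X = np.column_stack([
        rng.uniform(20, 50, n),          # age (years)
        rng.uniform(0, 40, n),           # high-frequency threshold (dB HL)
        rng.standard_normal(n),          # attention measure (z-scored)
        rng.integers(0, 2, n),           # blast exposure (0 = control, 1 = exposed)
    ])
    ssq = rng.uniform(1, 10, n)          # placeholder SSQ ratings

    model = sm.OLS(ssq, sm.add_constant(X)).fit()
    print(model.summary())               # coefficients show each factor's contribution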