    Functional anatomy of the masking level difference, an fMRI study

    Introduction: Masking level differences (MLDs) are differences in the hearing threshold for the detection of a signal presented in a noise background, where the phase of either the signal or the noise is reversed between ears. We use N0/Nπ to denote noise presented in-phase/out-of-phase between ears and S0/Sπ to denote a 500 Hz sine wave signal presented in-phase/out-of-phase. Signal detection level for the noise/signal combinations N0Sπ and NπS0 is typically 10-20 dB better than for N0S0. All combinations have the same spectrum, level, and duration of both the signal and the noise. Methods: Ten participants (5 female), aged 22-43, with N0Sπ-N0S0 MLDs greater than 10 dB, were imaged using a sparse BOLD fMRI sequence with a 9-second gap (1 second of quiet preceding the stimuli). Band-pass (400-600 Hz) noise and an enveloped signal (0.25-second tone burst, 50% duty cycle) were used to create the stimuli. Brain maps of statistically significant regions were formed from a second-level analysis using SPM5. Results: The contrast NπS0-N0Sπ had significant regions of activation in the right pulvinar, corpus callosum, and insula bilaterally. The left inferior frontal gyrus had significant activation for the contrasts N0Sπ-N0S0 and NπS0-N0S0. The contrast N0S0-N0Sπ revealed a region in the right insula, and the contrast N0S0-NπS0 had a region of significance in the left insula. Conclusion: Our results extend the view that the thalamus acts as a gating mechanism to enable dichotic listening, and suggest that MLD processing is accomplished through thalamic communication with the insulae, which communicate across the corpus callosum to either enhance or diminish the binaural signal (depending on the MLD condition). The audibility improvement of the signal with both MLD conditions is likely reflected by activation in the left inferior frontal gyrus, a late stage in the what/where model of auditory processing. © 2012 Wack et al.
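
    As a rough illustration of how the N0S0, N0Sπ, and NπS0 conditions differ, the sketch below builds left- and right-ear waveforms by inverting the sign of the noise or of the signal in one ear. It is not the stimulus code used in the study. The sampling rate, signal amplitude, the approximation of band-pass noise as a sum of random-phase sinusoids, and the function name `mld_stimulus` are all assumptions made for illustration, and the tone-burst envelope and duty cycle are omitted.

```python
import math
import random


def mld_stimulus(condition, fs=8000, dur=0.25, f_sig=500.0, seed=0):
    """Return (left, right) sample lists for 'N0S0', 'N0Spi', or 'NpiS0'.

    Hypothetical helper: fs, dur, the 0.1 signal amplitude, and the noise
    synthesis are illustrative assumptions, not values from the study.
    """
    if condition not in ("N0S0", "N0Spi", "NpiS0"):
        raise ValueError("unknown condition: " + condition)
    rng = random.Random(seed)
    n = int(fs * dur)
    # Approximate 400-600 Hz band-pass noise as a sum of random-phase sinusoids.
    comps = [(f, rng.uniform(0.0, 2.0 * math.pi)) for f in range(400, 601, 4)]
    noise = [
        sum(math.sin(2 * math.pi * f * t / fs + ph) for f, ph in comps) / len(comps)
        for t in range(n)
    ]
    signal = [0.1 * math.sin(2 * math.pi * f_sig * t / fs) for t in range(n)]
    # A phase reversal of pi is a sign inversion in one ear (here, the right).
    noise_sign = -1.0 if condition.startswith("Npi") else 1.0
    sig_sign = -1.0 if condition.endswith("Spi") else 1.0
    left = [nz + s for nz, s in zip(noise, signal)]
    right = [noise_sign * nz + sig_sign * s for nz, s in zip(noise, signal)]
    return left, right
```

    With these waveforms, subtracting the two ears in N0Sπ cancels the noise and leaves twice the signal, while adding the two ears does the same in NπS0. This is one simple way to see why an interaural phase reversal can make the signal easier to detect.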

    Pitch discrimination in optimal and suboptimal acoustic environments : electroencephalographic, magnetoencephalographic, and behavioral evidence

    Pitch discrimination is a fundamental property of the human auditory system. Our understanding of pitch-discrimination mechanisms is important from both theoretical and clinical perspectives. The discrimination of spectrally complex sounds is crucial in the processing of music and speech. Current methods of cognitive neuroscience can track the brain processes underlying sound processing either with precise temporal (EEG and MEG) or spatial resolution (PET and fMRI). A combination of different techniques is therefore required in contemporary auditory research. One of the problems in comparing the EEG/MEG and fMRI methods, however, is the fMRI acoustic noise. In the present thesis, EEG and MEG in combination with behavioral techniques were used, first, to define the ERP correlates of automatic pitch discrimination across a wide frequency range in adults and neonates and, second, they were used to determine the effect of recorded acoustic fMRI noise on those adult ERP and ERF correlates during passive and active pitch discrimination. Pure tones and complex 3-harmonic sounds served as stimuli in the oddball and matching-to-sample paradigms. The results suggest that pitch discrimination in adults, as reflected by MMN latency, is most accurate in the 1000-2000 Hz frequency range, and that pitch discrimination is facilitated further by adding harmonics to the fundamental frequency. Newborn infants are able to discriminate a 20% frequency change in the 250-4000 Hz frequency range, whereas the discrimination of a 5% frequency change was unconfirmed. Furthermore, the effect of the fMRI gradient noise on the automatic processing of pitch change was more prominent for tones with frequencies exceeding 500 Hz, overlapping with the spectral maximum of the noise. When the fundamental frequency of the tones was lower than the spectral maximum of the noise, fMRI noise had no effect on MMN and P3a, whereas the noise delayed and suppressed N1 and exogenous N2. 
Noise also suppressed the N1 amplitude in a matching-to-sample working memory task. However, the task-related difference observed in the N1 component, suggesting a functional dissociation between the processing of spatial and non-spatial auditory information, was partially preserved in the noise condition. Noise hampered feature coding mechanisms more than it hampered the mechanisms of change detection, involuntary attention, and the segregation of the spatial and non-spatial domains of working memory. The data presented in the thesis can be used to develop clinical ERP-based frequency-discrimination protocols and combined EEG and fMRI experimental paradigms. The ability to tell high and low sounds apart is one of the basic functions of the brain. Without it we could not understand speech or enjoy music. Some patients and very young children cannot report for themselves whether they hear a difference or not, but their brain responses can reveal it. However, not enough is known about the brain functions related to pitch discrimination, even in healthy adults. More research on this topic is therefore needed, using modern brain research methods such as event-related potentials (ERP) and functional magnetic resonance imaging (fMRI). The ERP method reveals when the brain detects a pitch difference, whereas fMRI reveals which brain areas are activated during this function. Combining these two methods can give a more comprehensive picture of the brain functions related to pitch discrimination. The fMRI method has one problem, however: the loud noise produced by the fMRI scanner, which can complicate hearing-related research. This dissertation examines how pitch discrimination can be detected in the brains of adults and newborn infants, and how fMRI scanner noise affects the ERP responses elicited by auditory stimuli. 
The results show that the adult brain can discriminate frequency differences as small as 2.5%, but discrimination occurs faster at frequencies of approximately 1000-2000 Hz than at lower or higher frequencies. The newborn brain discriminated only frequency changes of more than 20%. When fMRI scanner noise was played in the background, it attenuated brain responses to 500-2000 Hz sounds more than responses to other sounds. The noise did not, however, affect the brain responses elicited by sounds below 500 Hz. Regardless of whether background noise was presented, the ERP response elicited by a change in sound-source location was larger than the response elicited by a change in pitch. This dissertation research has shown that pitch discrimination can be studied effectively with the ERP method in both adults and infants. According to the results, the combination of ERP and fMRI methods can be made more effective by taking the effects of fMRI scanner noise on ERP responses into account when designing experiments. The data from the study can be used in designing complex experiments that measure pitch discrimination in, among others, patients and children.

    Investigating the Neural Basis of Audiovisual Speech Perception with Intracranial Recordings in Humans

    Speech is inherently multisensory, containing auditory information from the voice and visual information from the mouth movements of the talker. Hearing the voice is usually sufficient to understand speech; however, in noisy environments or when audition is impaired due to aging or disabilities, seeing mouth movements greatly improves speech perception. Although behavioral studies have firmly established this perceptual benefit, it is still not clear how the brain processes visual information from mouth movements to improve speech perception. To clarify this issue, I studied the neural activity recorded from the brain surfaces of human subjects using intracranial electrodes, a technique known as electrocorticography (ECoG). First, I studied responses to noisy speech in the auditory cortex, specifically in the superior temporal gyrus (STG). Previous studies identified the anterior parts of the STG as unisensory, responding only to auditory stimuli. On the other hand, posterior parts of the STG are known to be multisensory, responding to both auditory and visual stimuli, which makes it a key region for audiovisual speech perception. I examined how these different parts of the STG respond to clear versus noisy speech. I found that noisy speech decreased the amplitude and increased the across-trial variability of the response in the anterior STG. However, possibly due to its multisensory composition, the posterior STG was not as sensitive to auditory noise as the anterior STG and responded similarly to clear and noisy speech. I also found that these two response patterns in the STG were separated by a sharp boundary demarcated by the posterior-most portion of Heschl’s gyrus. Second, I studied responses to silent speech in the visual cortex. 
Previous studies demonstrated that the visual cortex shows response enhancement when the auditory component of speech is noisy or absent; however, it was not clear which regions of the visual cortex specifically show this response enhancement and whether it is a result of top-down modulation from a higher region. To test this, I first mapped the receptive fields of different regions in the visual cortex and then measured their responses to visual (silent) and audiovisual speech stimuli. I found that visual regions that have central receptive fields show greater response enhancement to visual speech, possibly because these regions receive more visual information from mouth movements. I found similar response enhancement to visual speech in the frontal cortex, specifically in the inferior frontal gyrus and the premotor and dorsolateral prefrontal cortices, which have been implicated in speech reading in previous studies. I showed that these frontal regions display strong functional connectivity with visual regions that have central receptive fields during speech perception.

    Stochastic Resonance Modulates Neural Synchronization within and between Cortical Sources

    Neural synchronization is a mechanism whereby functionally specific brain regions establish transient networks for perception, cognition, and action. Direct addition of weak noise (fast random fluctuations) to various neural systems enhances synchronization through the mechanism of stochastic resonance (SR). Moreover, SR also occurs in human perception, cognition, and action. Perception, cognition, and action are closely correlated with, and may depend upon, synchronized oscillations within specialized brain networks. We tested the hypothesis that SR-mediated neural synchronization occurs within and between functionally relevant brain areas and thus could be responsible for behavioral SR. We measured the 40-Hz transient response of the human auditory cortex to brief pure tones. This response arises when the ongoing, random-phase, 40-Hz activity of a group of tuned neurons in the auditory cortex becomes synchronized in response to the onset of an above-threshold sound at its “preferred” frequency. We presented a stream of near-threshold standard sounds in various levels of added broadband noise and measured subjects' 40-Hz response to the standards in a deviant-detection paradigm using high-density EEG. We used independent component analysis and dipole fitting to locate neural sources of the 40-Hz response in bilateral auditory cortex, left posterior cingulate cortex and left superior frontal gyrus. We found that added noise enhanced the 40-Hz response in all these areas. Moreover, added noise also increased the synchronization between these regions in alpha and gamma frequency bands both during and after the 40-Hz response. Our results demonstrate neural SR in several functionally specific brain regions, including areas not traditionally thought to contribute to the auditory 40-Hz transient response. In addition, we demonstrated SR in the synchronization between these brain regions. 
Thus, both intra- and inter-regional synchronization of neural activity are facilitated by the addition of moderate amounts of random noise. Because the noise levels in the brain fluctuate with arousal system activity, particularly across sleep-wake cycles, optimal neural noise levels, and thus SR, could be involved in optimizing the formation of task-relevant brain networks at several scales under normal conditions.
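
    The general idea of stochastic resonance can be shown with a toy threshold detector: a sinusoid too weak to cross the threshold on its own produces no output, a moderate amount of added noise lets the output track the signal, and very strong noise swamps it. The sketch below is a generic textbook-style demonstration, not the EEG analysis described in the abstract. The function name `detector_signal_correlation`, the amplitude, threshold, period, and noise levels are all illustrative assumptions.

```python
import math
import random


def detector_signal_correlation(noise_sd, n=5000, amp=0.8, threshold=1.0, seed=1):
    """Correlation between a subthreshold sinusoid and a noisy threshold detector.

    Hypothetical toy model: all parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    sig = [amp * math.sin(2 * math.pi * t / 50) for t in range(n)]
    # The detector outputs 1 only when signal plus noise exceeds the threshold.
    out = [1.0 if s + rng.gauss(0.0, noise_sd) > threshold else 0.0 for s in sig]
    mean_s = sum(sig) / n
    mean_o = sum(out) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sig, out))
    var_s = sum((s - mean_s) ** 2 for s in sig)
    var_o = sum((o - mean_o) ** 2 for o in out)
    if var_o == 0.0:
        # The detector never fired (or always fired): no information transmitted.
        return 0.0
    return cov / math.sqrt(var_s * var_o)


if __name__ == "__main__":
    for sd in (0.0, 0.5, 20.0):
        print(sd, round(detector_signal_correlation(sd), 3))
```

    In this toy model the correlation is zero without noise, clearly positive at a moderate noise level, and small again when the noise is much larger than the signal, which is the inverted-U pattern usually associated with stochastic resonance.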

    Attentional Modulation of Envelope-Following Responses at Lower (93–109 Hz) but Not Higher (217–233 Hz) Modulation Rates

    Directing attention to sounds of different frequencies allows listeners to perceive a sound of interest, like a talker, in a mixture. Whether cortically generated frequency-specific attention affects responses as low as the auditory brainstem is currently unclear. Participants attended to either a high- or low-frequency tone stream; the two streams were presented simultaneously and tagged with different amplitude modulation (AM) rates. In a replication design, we showed that envelope-following responses (EFRs) were modulated by attention only when the stimulus AM rate was slow enough for the auditory cortex to track, and not for stimuli with faster AM rates, which are thought to reflect ‘purer’ brainstem sources. Thus, we found no evidence of frequency-specific attentional modulation that can be confidently attributed to brainstem generators. The results demonstrate that different neural populations contribute to EFRs at higher and lower rates, compatible with cortical contributions at lower rates. The results further demonstrate that stimulus AM rate can alter conclusions of EFR studies. This work was supported by funding from the Canadian Institutes of Health Research (CIHR; Operating Grant: MOP 133450) and the Natural Sciences and Engineering Research Council of Canada (NSERC; Discovery Grant: 327429-2012). Authors R.P. Carlyon and H.E. Gockel were supported by intramural funding from the Medical Research Council [SUAG/007 RG91365].


    When two vowels with different fundamental frequencies (F0s) are presented concurrently, listeners often hear two voices producing different vowels on different pitches. Parsing of this simultaneous speech can also be affected by the signal-to-noise ratio (SNR) in the auditory scene. The extraction and interaction of F0 and SNR cues may occur at multiple levels of the auditory system. The major aims of this dissertation are to elucidate the neural mechanisms and time course of concurrent speech perception in clean and in degraded listening conditions and its behavioral correlates. In two complementary experiments, electrical brain activity (EEG) was recorded at cortical (EEG Study #1) and subcortical (FFR Study #2) levels while participants heard double-vowel stimuli whose F0s differed by zero or four semitones (STs), presented in either clean or noise-degraded (+5 dB SNR) conditions. Behaviorally, listeners were more accurate in identifying both vowels for larger F0 separations (i.e., 4 ST; with pitch cues), and this F0 benefit was more pronounced at more favorable SNRs. Time-frequency analysis of cortical EEG oscillations (i.e., brain rhythms) revealed a dynamic time course for concurrent speech processing that depended on both extrinsic (SNR) and intrinsic (pitch) acoustic factors. Early high-frequency activity reflected pre-perceptual encoding of acoustic features (~200 ms) and the quality (i.e., SNR) of the speech signal (~250-350 ms), whereas later-evolving low-frequency rhythms (~400-500 ms) reflected post-perceptual, cognitive operations that covaried with listening effort and task demands. 
Analysis of subcortical responses indicated that, while FFRs provided a high-fidelity representation of double-vowel stimuli and the spectro-temporal nonlinear properties of the peripheral auditory system, FFR activity largely reflected the neural encoding of stimulus features (exogenous coding) rather than perceptual outcomes, but timbre (F1) could predict the speed in noise conditions. Taken together, the results of this dissertation suggest that subcortical auditory processing reflects mostly exogenous (acoustic) feature encoding, in stark contrast to cortical activity, which reflects perceptual and cognitive aspects of concurrent speech perception. By studying multiple brain indices underlying an identical task, these studies provide a more comprehensive window into the hierarchy of brain mechanisms and the time course of concurrent speech processing.
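
    The two stimulus manipulations described above can be made concrete with a little arithmetic: a separation of k semitones corresponds to a frequency ratio of 2^(k/12), and a target SNR in dB fixes the ratio of signal to noise RMS amplitude. The sketch below is only a worked example. The function names and the 100 Hz example F0 are assumptions for illustration and are not taken from the dissertation.

```python
import math


def shift_semitones(f0_hz, semitones):
    """Frequency obtained by raising f0_hz by the given number of semitones."""
    return f0_hz * 2.0 ** (semitones / 12.0)


def noise_gain_for_snr(signal_rms, noise_rms, snr_db):
    """Gain to apply to the noise so that
    20 * log10(signal_rms / (gain * noise_rms)) equals snr_db."""
    return signal_rms / (noise_rms * 10.0 ** (snr_db / 20.0))


if __name__ == "__main__":
    f0 = 100.0  # hypothetical example F0 in Hz, not a value from the study
    print(shift_semitones(f0, 0))   # same F0 (0 ST condition)
    print(shift_semitones(f0, 4))   # F0 of the second vowel in a 4 ST condition
    print(noise_gain_for_snr(1.0, 1.0, 5.0))  # noise scaling for +5 dB SNR
```

    For a 100 Hz F0, a four-semitone separation places the second F0 at roughly 126 Hz, and a +5 dB SNR means the noise RMS is about 0.56 times the signal RMS.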
