
    Examining the McGurk illusion using high-field 7 Tesla functional MRI

    In natural communication, speech perception is profoundly influenced by observable mouth movements. The additional visual information can greatly facilitate intelligibility, but incongruent visual information may also lead to novel percepts that match neither the auditory nor the visual information, as evidenced by the McGurk effect. Recent models of audiovisual (AV) speech perception accentuate the role of speech motor areas and of integrative brain sites in the vicinity of the superior temporal sulcus (STS) in speech perception. In this event-related 7 Tesla fMRI study, we used three naturally spoken syllable pairs with matching AV information and one syllable pair designed to elicit the McGurk illusion. The data analysis focused on brain sites involved in the processing and fusion of AV speech and engaged in the analysis of auditory and visual differences within AV presented speech. Successful fusion of AV speech is related to activity within the STS of both hemispheres. Our data support and extend the audio-visual-motor model of speech perception by dissociating areas involved in perceptual fusion from areas more generally related to the processing of AV incongruence.

    An Object-Based Interpretation of Audiovisual Processing

    Visual cues help listeners follow conversation in a complex acoustic environment. Many audiovisual research studies focus on how sensory cues are combined to optimize perception, either by minimizing the uncertainty in the sensory estimate or by maximizing intelligibility, particularly in speech understanding. From an auditory perception perspective, a fundamental question that has not been fully addressed is how visual information aids the ability to select and focus on one auditory object in the presence of competing sounds in a busy auditory scene. In this chapter, audiovisual integration is presented from an object-based attention viewpoint. In particular, it is argued that a stricter delineation of the concepts of multisensory integration versus binding would facilitate a deeper understanding of how information is combined across the senses. Furthermore, using an object-based theoretical framework to distinguish binding as a distinct form of multisensory integration generates testable hypotheses with behavioral predictions that can account for different aspects of multisensory interactions. Classic multisensory illusion paradigms are revisited and discussed in the context of multisensory binding. The chapter also describes multisensory experiments that address how visual stimuli help listeners parse complex auditory scenes. Finally, it concludes with a discussion of the potential mechanisms by which audiovisual processing might resolve competition between concurrent sounds in order to solve the cocktail party problem.

    Investigating the Neural Basis of Audiovisual Speech Perception with Intracranial Recordings in Humans

    Speech is inherently multisensory, containing auditory information from the voice and visual information from the mouth movements of the talker. Hearing the voice is usually sufficient to understand speech; however, in noisy environments or when audition is impaired due to aging or disability, seeing mouth movements greatly improves speech perception. Although behavioral studies have firmly established this perceptual benefit, it is still not clear how the brain processes visual information from mouth movements to improve speech perception. To clarify this issue, I studied neural activity recorded from the brain surface of human subjects using intracranial electrodes, a technique known as electrocorticography (ECoG). First, I studied responses to noisy speech in the auditory cortex, specifically in the superior temporal gyrus (STG). Previous studies identified the anterior parts of the STG as unisensory, responding only to auditory stimuli. On the other hand, posterior parts of the STG are known to be multisensory, responding to both auditory and visual stimuli, which makes the posterior STG a key region for audiovisual speech perception. I examined how these different parts of the STG respond to clear versus noisy speech. I found that noisy speech decreased the amplitude and increased the across-trial variability of the response in the anterior STG. However, possibly due to its multisensory composition, the posterior STG was not as sensitive to auditory noise as the anterior STG and responded similarly to clear and noisy speech. I also found that these two response patterns in the STG were separated by a sharp boundary demarcated by the posterior-most portion of Heschl’s gyrus. Second, I studied responses to silent speech in the visual cortex. Previous studies demonstrated that the visual cortex shows response enhancement when the auditory component of speech is noisy or absent; however, it was not clear which regions of the visual cortex specifically show this enhancement and whether it results from top-down modulation by a higher region. To test this, I first mapped the receptive fields of different regions in the visual cortex and then measured their responses to visual (silent) and audiovisual speech stimuli. I found that visual regions with central receptive fields show greater response enhancement to visual speech, possibly because these regions receive more visual information from mouth movements. I found similar response enhancement to visual speech in the frontal cortex, specifically in the inferior frontal gyrus and the premotor and dorsolateral prefrontal cortices, which have been implicated in speech reading in previous studies. I showed that these frontal regions display strong functional connectivity during speech perception with visual regions that have central receptive fields.
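
    As a rough illustration of the amplitude and across-trial variability measures described above, the sketch below computes both quantities from trial-wise electrode responses. It is not the dissertation's analysis code; the array shapes, electrode counts, and the use of mean high-gamma power as the response measure are assumptions for the example.

```python
# Illustrative sketch (assumed data layout, not the study's pipeline):
# comparing response amplitude and across-trial variability of ECoG
# responses to clear vs. noisy speech.
import numpy as np

def trial_stats(responses):
    """responses: array of shape (n_trials, n_electrodes) holding, e.g.,
    mean high-gamma power in a post-stimulus window for each trial."""
    mean_amp = responses.mean(axis=0)        # response amplitude per electrode
    sd = responses.std(axis=0, ddof=1)       # across-trial variability
    return mean_amp, sd

# Hypothetical data: 60 trials x 32 STG electrodes per condition
rng = np.random.default_rng(0)
clear = rng.normal(loc=2.0, scale=0.4, size=(60, 32))
noisy = rng.normal(loc=1.4, scale=0.7, size=(60, 32))

amp_clear, sd_clear = trial_stats(clear)
amp_noisy, sd_noisy = trial_stats(noisy)

# Pattern described for the anterior STG: lower amplitude and higher
# across-trial variability for noisy speech.
print("mean amplitude (clear, noisy):", amp_clear.mean(), amp_noisy.mean())
print("across-trial SD (clear, noisy):", sd_clear.mean(), sd_noisy.mean())
```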

    Parietal disruption alters audiovisual binding in the sound-induced flash illusion

    Selective attention and multisensory integration are fundamental to perception, but little is known about whether, or under what circumstances, these processes interact to shape conscious awareness. Here, we used transcranial magnetic stimulation (TMS) to investigate the causal role of attention-related brain networks in multisensory integration between visual and auditory stimuli in the sound-induced flash illusion. The flash illusion is a widely studied multisensory phenomenon in which a single flash of light is falsely perceived as multiple flashes in the presence of irrelevant sounds. We investigated the hypothesis that extrastriate regions involved in selective attention, specifically within the right parietal cortex, exert an influence on the multisensory integrative processes that cause the flash illusion. We found that disruption of the right angular gyrus, but not of the adjacent supramarginal gyrus or of a sensory control site, enhanced participants' veridical perception of the multisensory events, thereby reducing their susceptibility to the illusion. Our findings suggest that the same parietal networks that normally act to enhance perception of attended events also play a role in the binding of auditory and visual stimuli in the sound-induced flash illusion.

    Electrophysiological assessment of audiovisual integration in speech perception


    NEURAL CORRELATES OF AUDIOVISUAL SPEECH PERCEPTION IN APHASIA AND HEALTHY AGING

    Understanding speech in face-to-face conversation relies on the integration of multiple sources of information, most importantly the auditory vocal sounds and the visual lip movements. Prior studies of the neural underpinnings of audiovisual integration have provided converging evidence that neurons within the left superior temporal sulcus (STS) form a critical neural hub for the integration of auditory and visual information in speech. While most studies of audiovisual processing focus on neural mechanisms in healthy young adults, we currently know very little about how changes to the brain affect audiovisual integration in speech. To examine this, two particular cases of changing neural structure were investigated. I first conducted a case study with patient SJ, who suffered a stroke that damaged a large portion of her left temporo-parietal cortex, including the left STS. I tested SJ five years after her stroke with behavioral testing and determined that she is able to integrate auditory and visual information in speech. To understand the neural basis of SJ’s intact multisensory integration abilities, I examined her and 23 age-matched controls with functional magnetic resonance imaging (fMRI). SJ had a greater volume of multisensory cortex as well as greater response amplitude in her right STS in response to an audiovisual speech illusion than the age-matched controls. This evidence suggests that SJ’s brain reorganized after her stroke such that the right STS now supports the functions of the stroke-damaged left-sided cortex. Because changes to the brain occur even with healthy aging, I next examined the neural response to audiovisual speech in healthy older adults. Many behavioral studies have noted that older adults show not only performance declines during various sensory and cognitive tasks but also greater variability in performance. I sought to determine whether there is a neural counterpart to this increased behavioral variability. I found that older adults exhibited greater intrasubject variability in their neural responses across trials compared with younger adults. This was true both in individual regions of interest in the multisensory speech perception network and across all brain voxels that responded to speech stimuli. This increase in variability may underlie a decreased ability of the brain to distinguish between similar stimuli (such as the categorical boundaries of speech perception), which could link these findings to declines in speech perception in aging.
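
    The group comparison of intrasubject variability described above can be illustrated with a minimal sketch along the following lines; the data layout, trial and subject counts, and the two-sample t-test are assumptions for the example, not the study's actual analysis pipeline.

```python
# Illustrative sketch: quantifying intrasubject (across-trial) variability of
# trial-wise fMRI response estimates in one region of interest and comparing
# it between hypothetical younger and older groups.
import numpy as np
from scipy import stats

def intrasubject_variability(trial_betas):
    """trial_betas: (n_trials,) trial-wise response estimates for one ROI in
    one subject; variability is the standard deviation across trials."""
    return np.std(trial_betas, ddof=1)

rng = np.random.default_rng(1)
# Hypothetical trial-wise STS responses: 40 trials per subject, 20 subjects per group
young = [intrasubject_variability(rng.normal(1.0, 0.5, 40)) for _ in range(20)]
older = [intrasubject_variability(rng.normal(1.0, 0.8, 40)) for _ in range(20)]

t, p = stats.ttest_ind(older, young)
print(f"older vs. younger intrasubject variability: t = {t:.2f}, p = {p:.3f}")
```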

    Meta-analyses support a taxonomic model for representations of different categories of audio-visual interaction events in the human brain

    Our ability to perceive meaningful action events involving objects, people and other animate agents is characterized in part by an interplay of visual and auditory sensory processing and their cross-modal interactions. However, this multisensory ability can be altered or dysfunctional in some hearing and sighted individuals, and in some clinical populations. The present meta-analysis sought to test current hypotheses regarding neurobiological architectures that may mediate audio-visual multisensory processing. Reported coordinates from 82 neuroimaging studies (137 experiments) that revealed some form of audio-visual interaction in discrete brain regions were compiled, converted to a common coordinate space, and then organized along specific categorical dimensions to generate activation likelihood estimate (ALE) brain maps and various contrasts of those derived maps. The results revealed brain regions (cortical “hubs”) preferentially involved in multisensory processing along different stimulus category dimensions, including (1) living versus non-living audio-visual events, (2) audio-visual events involving vocalizations versus actions by living sources, (3) emotionally valent events, and (4) dynamic-visual versus static-visual audio-visual stimuli. These meta-analysis results are discussed in the context of neurocomputational theories of semantic knowledge representations and perception, and the brain volumes of interest are available for download to facilitate data interpretation for future neuroimaging studies.
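
    For readers unfamiliar with activation likelihood estimation, the toy sketch below illustrates the basic ALE idea of modeling reported foci as 3D Gaussian probability blobs and combining them voxelwise, first within and then across experiments. It is not the software used in this meta-analysis; the grid size, kernel width, and coordinates are assumptions for the example.

```python
# Toy ALE sketch on a small voxel grid (illustration only).
import numpy as np

def gaussian_blob(shape, center, sigma):
    """Probability map for one reported focus, modeled as a 3D Gaussian."""
    zz, yy, xx = np.indices(shape)
    d2 = (xx - center[0])**2 + (yy - center[1])**2 + (zz - center[2])**2
    blob = np.exp(-d2 / (2 * sigma**2))
    return blob / blob.max()

def modeled_activation(shape, foci, sigma=2.0):
    """Per-experiment map: voxelwise union of the probabilities of its foci."""
    ma = np.zeros(shape)
    for focus in foci:
        ma = 1 - (1 - ma) * (1 - gaussian_blob(shape, focus, sigma))
    return ma

def ale_map(shape, experiments, sigma=2.0):
    """ALE map: voxelwise union of modeled-activation maps across experiments."""
    ale = np.zeros(shape)
    for foci in experiments:
        ale = 1 - (1 - ale) * (1 - modeled_activation(shape, foci, sigma))
    return ale

# Two hypothetical experiments reporting audio-visual interaction foci on a 20^3 grid
experiments = [[(10, 10, 10), (14, 9, 11)], [(11, 11, 10)]]
ale = ale_map((20, 20, 20), experiments)
print("peak ALE value:", ale.max())
```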

    No “Self” Advantage for Audiovisual Speech Aftereffects

    Although the default state of the world is that we see and hear other people talking, there is evidence that seeing and hearing ourselves rather than someone else may lead to visual (i.e., lip-read) or auditory “self” advantages. We assessed whether there is a “self” advantage for phonetic recalibration (a lip-read-driven cross-modal learning effect) and selective adaptation (a contrastive effect in the opposite direction of recalibration). We observed both aftereffects as well as an on-line effect of lip-read information on auditory perception (i.e., immediate capture), but there was no evidence for a “self” advantage in any of the tasks (as additionally supported by Bayesian statistics). These findings strengthen the emerging notion that recalibration reflects a general learning mechanism and bolster the argument that adaptation depends on rather low-level auditory/acoustic features of the speech signal. This work was supported by the Severo Ochoa program grant SEV-2015-049 awarded to the BCBL. MB and MP were supported by the Spanish Ministry of Economy and Competitiveness (MINECO, grant PSI2014-51874-P), and MB was also supported by the Netherlands Organization for Scientific Research (NWO, VENI grant 275-89-027).

    Long-Term Consequences of Early Eye Enucleation on Audiovisual Processing

    A growing body of research shows that complete deprivation of the visual system through the loss of both eyes early in life results in changes in the remaining senses. Is the adaptive plasticity observed in the remaining intact senses also found in response to partial sensory deprivation, specifically the loss of one eye early in life? My dissertation examines evidence of adaptive plasticity following the loss of one eye (unilateral enucleation) early in life. Unilateral eye enucleation is a unique model for examining the consequences of the loss of binocularity, since the brain is completely deprived of all visual input from that eye. My dissertation expands our understanding of the long-term effects of losing one eye early in life on the development of audiovisual processing, both behaviourally and in terms of the underlying neural representation. The over-arching goal is to better understand neural plasticity as a result of sensory deprivation. To achieve this, I conducted seven experiments, divided into five experimental chapters, that focus on the behavioural and structural correlates of audiovisual perception in a unique group of adults who lost one eye in the first few years of life. Behavioural data (Chapters II-V) in conjunction with neuroimaging data (Chapter VI) relate the structure and function of the auditory, visual and audiovisual systems in this rare patient group, allowing a more refined understanding of the cross-sensory effects of early sensory deprivation. This information contributes to a better understanding of how audiovisual information is experienced by people with one eye. This group can serve as a model for learning how to accommodate less extreme forms of visual deprivation and to promote overall long-term visual health.