
    Substituting facial movements in singers changes the sounds of musical intervals

    Cross-modal integration is ubiquitous within perception and, in humans, the McGurk effect demonstrates that seeing a person articulate speech can change what we hear into a new auditory percept. It remains unclear whether cross-modal integration of sight and sound generalizes to other visible vocal articulations, such as those made by singers. We surmised that such perceptual integrative effects should extend deeply into music, since its auditory signals contain ample indeterminacy and variability. We show that switching the videos of sung musical intervals systematically changes the estimated distance between the two notes of an interval: pairing the video of a smaller sung interval with a relatively larger auditory interval led to a compression effect on rated intervals, whereas the reverse pairing led to a stretching effect. In addition, after seeing a visually switched video of an equally tempered sung interval and then hearing the same interval played on the piano, the two intervals were often judged to be different even though they differed only in instrument. These findings reveal spontaneous cross-modal integration of vocal sounds and clearly indicate that strong integration of sound and sight can occur beyond the articulations of natural speech.

    Multisensory integration augmenting motor processes among older adults

    Objective: Multisensory integration enhances sensory processing in older adults. This study investigated how this sensory enhancement modulates motor-related processes in healthy older adults. Method: Thirty-one older adults (12 males, mean age 67.7 years) and 29 younger adults serving as controls (16 males, mean age 24.9 years) participated. Participants discriminated spatial information embedded in unisensory (visual or auditory) and multisensory (audiovisual) conditions. Responses, made with movements of the left and right wrists corresponding to the spatial information, were registered on specially designed pads. The electroencephalogram (EEG) markers were the event-related super-additive P2 in the frontal-central region, the stimulus-locked lateralized readiness potential (s-LRP), and the response-locked lateralized readiness potential (r-LRP). Results: Older participants benefited more than controls from the multisensory condition, responding faster and more accurately than in the unisensory conditions. Both groups showed significantly less negative-going s-LRP amplitudes at the central sites in the between-condition contrasts. However, only the older group showed significantly less negative-going, centrally distributed r-LRP amplitudes. More importantly, only the r-LRP amplitude in the audiovisual condition significantly predicted behavioral performance. Conclusion: Audiovisual integration speeds reaction times, and this benefit is associated with modulated motor-related processes among the older participants. The super-additive effects modulate both the motor preparation and motor generation processes; interestingly, only the modulated motor generation process contributes to faster reaction times. Because these effects were observed in older but not younger participants, multisensory integration likely augments motor function in those with age-related neurodegeneration.
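    The "super-additive" criterion invoked above is simply that the response to the audiovisual stimulus exceeds the sum of the two unisensory responses (AV > A + V). The snippet below is a minimal, hypothetical sketch of that criterion applied to ERP component amplitudes; the function name, example values, and units are illustrative assumptions, not the authors' analysis pipeline.

```python
import numpy as np

def superadditivity_index(av, a, v):
    """Super-additivity criterion for an ERP component (illustrative only).

    av, a, v : amplitudes (e.g. of the P2 at frontal-central sites) in the
               audiovisual, auditory-only, and visual-only conditions.
    A positive index means the multisensory response exceeds the sum of
    the unisensory responses (AV > A + V).
    """
    av, a, v = map(np.asarray, (av, a, v))
    return av - (a + v)

# Hypothetical amplitudes (microvolts) for three participants.
idx = superadditivity_index(av=[6.1, 5.4, 7.0], a=[2.8, 2.5, 3.1], v=[2.0, 1.9, 2.6])
print(idx)  # positive values indicate super-additive integration
```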

    An Object-Based Interpretation of Audiovisual Processing

    Visual cues help listeners follow conversation in a complex acoustic environment. Many audiovisual research studies focus on how sensory cues are combined to optimize perception, either in terms of minimizing the uncertainty in the sensory estimate or maximizing intelligibility, particularly in speech understanding. From an auditory perception perspective, a fundamental question that has not been fully addressed is how visual information aids the ability to select and focus on one auditory object in the presence of competing sounds in a busy auditory scene. In this chapter, audiovisual integration is presented from an object-based attention viewpoint. In particular, it is argued that a stricter delineation of the concepts of multisensory integration versus binding would facilitate a deeper understanding of how information is combined across senses. Furthermore, using an object-based theoretical framework to distinguish binding as a distinct form of multisensory integration generates testable hypotheses with behavioral predictions that can account for different aspects of multisensory interactions. In this chapter, classic multisensory illusion paradigms are revisited and discussed in the context of multisensory binding. The chapter also describes multisensory experiments that address how visual stimuli help listeners parse complex auditory scenes. Finally, it concludes with a discussion of the potential mechanisms by which audiovisual processing might resolve competition between concurrent sounds in order to solve the cocktail party problem.

    Koherencja drogowskazem prawdy? Spójność jako źródło błędnych reprezentacji [Coherence as a Signpost to Truth? Consistency as a Source of Erroneous Representations]

    In this article, I attempt to develop and generalize Krystyna Bielecka's coherence-based account of detecting erroneous representations. I try to show that the account presented in the book "Błądzę, więc myślę" (2019) can serve as the opening of a broader story. I consider the hypothesis that coherence plays an essential role in the evaluation of mental representations, and thereby in their acquisition and rejection. The cognitive system actively maximizes the coherence of its representations, operating in accordance with the principle of Coherence as a Signpost to Truth (the KDP principle). Inconsistency-based error detection, the main focus of Krystyna Bielecka's account, is only one form of such use of coherence. I point out several potential counterexamples to the KDP principle: accounts and models of cognitive processes (concerning perception, motor control, episodic memory, and the neutralization of cognitive dissonance) in light of which coherence maximization systematically leads to the production of erroneous representations. I argue that these counterexamples are merely apparent and do not undermine the KDP principle as a foundation of the cognitive system's "epistemic hygiene".

    A Causal Inference Model Explains Perception of the McGurk Effect and Other Incongruent Audiovisual Speech.

    Audiovisual speech integration combines information from auditory speech (the talker's voice) and visual speech (the talker's mouth movements) to improve perceptual accuracy. However, if the auditory and visual speech emanate from different talkers, integration decreases accuracy. Therefore, a key step in audiovisual speech perception is deciding whether auditory and visual speech have the same source, a process known as causal inference. A well-known illusion, the McGurk Effect, consists of incongruent audiovisual syllables, such as auditory "ba" + visual "ga" (AbaVga), that are integrated to produce a fused percept ("da"). This illusion raises two fundamental questions: first, given the incongruence between the auditory and visual syllables in the McGurk stimulus, why are they integrated; and second, why does the McGurk effect not occur for other, very similar syllables (e.g., AgaVba). We describe a simplified model of causal inference in multisensory speech perception (CIMS) that predicts the perception of arbitrary combinations of auditory and visual speech. We applied this model to behavioral data collected from 60 subjects perceiving both McGurk and non-McGurk incongruent speech stimuli. The CIMS model successfully predicted both the audiovisual integration observed for McGurk stimuli and the lack of integration observed for non-McGurk stimuli. An identical model without causal inference failed to accurately predict perception for either form of incongruent speech. The CIMS model uses causal inference to provide a computational framework for studying how the brain performs one of its most important tasks: integrating auditory and visual speech cues to allow us to communicate with others.
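    The abstract describes the CIMS model only at a conceptual level. As a rough illustration of the general idea, the sketch below implements generic Bayesian causal inference over two cues in the style of Körding and colleagues' causal-inference model: it computes the posterior probability that the auditory and visual measurements share a common cause and model-averages a fused and a segregated estimate. The one-dimensional "articulatory feature" axis, parameter names, and default values are assumptions for illustration and do not reproduce the published CIMS model.

```python
import numpy as np

def causal_inference_estimate(x_a, x_v, sigma_a=1.0, sigma_v=1.0,
                              sigma_prior=4.0, p_common=0.5):
    """Generic Bayesian causal inference over two cues (illustrative only).

    x_a, x_v : noisy auditory and visual measurements on a hypothetical
               1-D articulatory feature axis (zero-mean Gaussian prior).
    Returns the posterior probability of a common cause and a
    model-averaged perceptual estimate.
    """
    var_a, var_v, var_p = sigma_a**2, sigma_v**2, sigma_prior**2

    # Likelihood of both measurements under a single common cause.
    var_c1 = var_a * var_v + var_a * var_p + var_v * var_p
    like_c1 = np.exp(-((x_a - x_v)**2 * var_p
                       + x_a**2 * var_v + x_v**2 * var_a) / (2 * var_c1)) \
              / (2 * np.pi * np.sqrt(var_c1))

    # Likelihood under two independent causes (one per modality).
    like_c2 = (np.exp(-x_a**2 / (2 * (var_a + var_p)))
               / np.sqrt(2 * np.pi * (var_a + var_p))
               * np.exp(-x_v**2 / (2 * (var_v + var_p)))
               / np.sqrt(2 * np.pi * (var_v + var_p)))

    # Posterior probability that auditory and visual speech share a source.
    post_common = like_c1 * p_common / (like_c1 * p_common
                                        + like_c2 * (1 - p_common))

    # Reliability-weighted fused estimate vs. auditory-only estimate.
    fused = (x_a / var_a + x_v / var_v) / (1 / var_a + 1 / var_v + 1 / var_p)
    audio_only = x_a * var_p / (var_a + var_p)

    # Model averaging: weight the two estimates by the causal posterior.
    return post_common, post_common * fused + (1 - post_common) * audio_only

if __name__ == "__main__":
    # Small discrepancy: high probability of a common cause, so fusion.
    print(causal_inference_estimate(x_a=0.5, x_v=1.5))
    # Large discrepancy: the cues are attributed to separate causes.
    print(causal_inference_estimate(x_a=0.5, x_v=8.0))
```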

    Investigating the Neural Basis of Audiovisual Speech Perception with Intracranial Recordings in Humans

    Speech is inherently multisensory, containing auditory information from the voice and visual information from the mouth movements of the talker. Hearing the voice is usually sufficient to understand speech; however, in noisy environments or when audition is impaired due to aging or disabilities, seeing mouth movements greatly improves speech perception. Although behavioral studies have well established this perceptual benefit, it is still not clear how the brain processes visual information from mouth movements to improve speech perception. To clarify this issue, I studied neural activity recorded from the brain surface of human subjects using intracranial electrodes, a technique known as electrocorticography (ECoG). First, I studied responses to noisy speech in the auditory cortex, specifically in the superior temporal gyrus (STG). Previous studies identified the anterior parts of the STG as unisensory, responding only to auditory stimuli. On the other hand, the posterior parts of the STG are known to be multisensory, responding to both auditory and visual stimuli, which makes this a key region for audiovisual speech perception. I examined how these different parts of the STG respond to clear versus noisy speech. I found that noisy speech decreased the amplitude and increased the across-trial variability of the response in the anterior STG. However, possibly due to its multisensory composition, the posterior STG was not as sensitive to auditory noise as the anterior STG and responded similarly to clear and noisy speech. I also found that these two response patterns in the STG were separated by a sharp boundary demarcated by the posterior-most portion of Heschl's gyrus. Second, I studied responses to silent speech in the visual cortex. Previous studies demonstrated that the visual cortex shows response enhancement when the auditory component of speech is noisy or absent; however, it was not clear which regions of the visual cortex show this response enhancement and whether it results from top-down modulation by a higher region. To test this, I first mapped the receptive fields of different regions in the visual cortex and then measured their responses to visual (silent) and audiovisual speech stimuli. I found that visual regions with central receptive fields show greater response enhancement to visual speech, possibly because these regions receive more visual information from mouth movements. I found similar response enhancement to visual speech in the frontal cortex, specifically in the inferior frontal gyrus and the premotor and dorsolateral prefrontal cortices, which have been implicated in speech reading in previous studies. I showed that these frontal regions display strong functional connectivity during speech perception with the visual regions that have central receptive fields.

    Ageing and multisensory integration: A review of the evidence, and a computational perspective

    The processing of multisensory signals is crucial for effective interaction with the environment, but our ability to perform this vital function changes as we age. In the first part of this review, we summarise existing research into the effects of healthy ageing on multisensory integration. We note that age differences vary substantially with the paradigms and stimuli used: older adults often receive at least as much benefit (in both accuracy and response times) as younger controls from congruent multisensory stimuli, but are also consistently more negatively impacted by the presence of intersensory conflict. In the second part, we outline a normative Bayesian framework that provides a principled and computationally informed perspective on the key ingredients involved in multisensory perception, and how these are affected by ageing. Applying this framework to the existing literature, we conclude that changes to sensory reliability, prior expectations (together with attentional control), and decisional strategies all contribute to the age differences observed. However, we find no compelling evidence of any age-related changes to the basic inference mechanisms involved in multisensory perception.
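    One of the "key ingredients" highlighted above is sensory reliability. As a minimal sketch of that ingredient only (not the authors' framework), the snippet below shows the standard reliability-weighted (maximum-likelihood) combination rule and how, under the purely hypothetical assumption that ageing doubles auditory noise, the optimal weights shift toward the visual cue; all values are illustrative.

```python
import numpy as np

def ml_combine(x_a, x_v, sigma_a, sigma_v):
    """Reliability-weighted (maximum-likelihood) cue combination.

    Each cue is weighted by its reliability (inverse variance); the combined
    estimate has lower variance than either cue alone.
    """
    w_a = (1 / sigma_a**2) / (1 / sigma_a**2 + 1 / sigma_v**2)
    estimate = w_a * x_a + (1 - w_a) * x_v
    combined_sigma = np.sqrt(1 / (1 / sigma_a**2 + 1 / sigma_v**2))
    return estimate, combined_sigma

# Hypothetical illustration: if ageing doubles auditory noise, the optimal
# observer leans more heavily on the visual cue, and the multisensory
# benefit (variance reduction relative to each unisensory cue) changes.
print("younger:", ml_combine(x_a=1.0, x_v=0.0, sigma_a=1.0, sigma_v=1.0))
print("older:  ", ml_combine(x_a=1.0, x_v=0.0, sigma_a=2.0, sigma_v=1.0))
```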