
    A phonologically congruent sound boosts a visual target into perceptual awareness

    Capacity limitations of attentional resources allow only a fraction of sensory inputs to enter our awareness. Most prominently, in the attentional blink the observer often fails to detect the second of two rapidly successive targets that are presented in a sequence of distractor items. To investigate how auditory inputs enable a visual target to escape the attentional blink, this study presented the visual letter targets T1 and T2 together with phonologically congruent or incongruent spoken letter names. First, a congruent relative to an incongruent sound at T2 rendered visual T2 more visible. Second, this T2 congruency effect was amplified when the sound was congruent at T1, as indicated by a T1 congruency × T2 congruency interaction. Critically, these effects were observed both when the sounds were presented in synchrony with and prior to the visual target letters, suggesting that the sounds may increase visual target identification via multiple mechanisms such as audiovisual priming or decisional interactions. Our results demonstrate that a sound around the time of T2 increases subjects' awareness of the visual target as a function of T1 and T2 congruency. Consistent with Bayesian causal inference, the brain may thus combine (1) prior congruency expectations based on T1 congruency and (2) phonological congruency cues provided by the audiovisual inputs at T2 to infer whether auditory and visual signals emanate from a common source and should hence be integrated for perceptual decisions.
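
    As a minimal illustration of the Bayesian causal inference account sketched above, the Python snippet below computes the posterior probability of a common audiovisual source from a prior congruency expectation (e.g. raised by a congruent sound at T1) and the phonological congruency cue at T2. The prior and likelihood values are illustrative assumptions, not parameters estimated in the study.

        # Sketch of Bayesian causal inference over audiovisual congruency cues.
        # All prior and likelihood values are illustrative assumptions.

        def posterior_common_source(prior_common, p_congr_common, p_congr_indep, congruent):
            """Posterior probability that auditory and visual signals share a common source."""
            if congruent:
                like_common, like_indep = p_congr_common, p_congr_indep
            else:
                like_common, like_indep = 1 - p_congr_common, 1 - p_congr_indep
            evidence_common = like_common * prior_common
            return evidence_common / (evidence_common + like_indep * (1 - prior_common))

        # A congruent sound at T1 could raise the prior for a common source at T2:
        print(posterior_common_source(0.7, 0.9, 0.5, congruent=True))   # ~0.81
        print(posterior_common_source(0.3, 0.9, 0.5, congruent=True))   # ~0.44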

    Metacognition in the audiovisual McGurk illusion: perceptual and causal confidence

    Almost all decisions in everyday life rely on multiple sensory inputs that can come from common or independent causes. These situations invoke perceptual uncertainty about environmental properties and the signals' causal structure. Using the audiovisual McGurk illusion, this study investigated how observers formed perceptual and causal confidence judgements in information integration tasks under causal uncertainty. Observers were presented with spoken syllables, their corresponding articulatory lip movements, or their congruent and McGurk combinations (e.g. auditory B/P with visual G/K). Observers reported their perceived auditory syllable, the causal structure and confidence for each judgement. Observers were more accurate and confident on congruent than on unisensory trials. Their perceptual and causal confidence were tightly related over trials, as predicted by the interactive nature of perceptual and causal inference. Further, observers assigned comparable perceptual and causal confidence to veridical 'G/K' percepts on audiovisual congruent trials and their causal and perceptual metamers on McGurk trials (i.e. illusory 'G/K' percepts). Thus, observers metacognitively evaluate the integrated audiovisual percept with limited access to the conflicting unisensory stimulus components on McGurk trials. Collectively, our results suggest that observers form meaningful perceptual and causal confidence judgements about multisensory scenes that are qualitatively consistent with principles of Bayesian causal inference. This article is part of the theme issue 'Decision and control processes in multisensory perception'.

    Distinct computational principles govern multisensory integration in primary sensory and association cortices

    Human observers typically integrate sensory signals in a statistically optimal fashion into a coherent percept by weighting them in proportion to their reliabilities [1, 2, 3 and 4]. An emerging debate in neuroscience is to what extent multisensory integration emerges already in primary sensory areas or is deferred to higher-order association areas [5, 6, 7, 8 and 9]. This fMRI study used multivariate pattern decoding to characterize the computational principles that define how auditory and visual signals are integrated into spatial representations across the cortical hierarchy. Our results reveal small multisensory influences that were limited to a spatial window of integration in primary sensory areas. By contrast, parietal cortices integrated signals weighted by their sensory reliabilities and task relevance, in line with behavioral performance and principles of statistical optimality. Intriguingly, audiovisual integration in parietal cortices was attenuated for large spatial disparities, when signals were unlikely to originate from a common source. Our results demonstrate that multisensory interactions in primary and association cortices are governed by distinct computational principles. In primary visual cortices, spatial disparity controlled the influence of non-visual signals on the formation of spatial representations, whereas in parietal cortices, it determined the influence of task-irrelevant signals. Critically, only parietal cortices integrated signals weighted by their bottom-up reliabilities and top-down task relevance into multisensory spatial priority maps to guide spatial orienting.
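
    To make the notion of reliability weighting concrete, the Python sketch below implements the standard maximum-likelihood combination rule, in which each cue is weighted by its inverse variance; the numerical values are arbitrary examples rather than data from the study.

        # Reliability-weighted (maximum-likelihood) cue combination.
        # reliability = 1 / variance; weights are proportional to reliabilities.

        def fuse(x_aud, var_aud, x_vis, var_vis):
            r_aud, r_vis = 1.0 / var_aud, 1.0 / var_vis
            w_aud = r_aud / (r_aud + r_vis)
            w_vis = r_vis / (r_aud + r_vis)
            estimate = w_aud * x_aud + w_vis * x_vis
            fused_var = 1.0 / (r_aud + r_vis)   # never larger than either input variance
            return estimate, fused_var

        # A reliable visual cue at 2 deg dominates a noisy auditory cue at 10 deg:
        print(fuse(x_aud=10.0, var_aud=16.0, x_vis=2.0, var_vis=4.0))   # (3.6, 3.2)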

    The curious incident of attention in multisensory integration: bottom-up vs. top-down

    The role attention plays in our experience of a coherent, multisensory world is still controversial. On the one hand, a subset of inputs may be selected for detailed processing and multisensory integration in a top-down manner, i.e., guidance of multisensory integration by attention. On the other hand, stimuli may be integrated in a bottom-up fashion according to low-level properties such as spatial coincidence, thereby capturing attention. Moreover, attention itself is multifaceted and can be described via both top-down and bottom-up mechanisms. Thus, the interaction between attention and multisensory integration is complex and situation-dependent. The authors of this opinion paper are researchers who have contributed to this discussion from behavioural, computational and neurophysiological perspectives. We posed a series of questions, the goal of which was to illustrate the interplay between bottom-up and top-down processes in various multisensory scenarios, in order to clarify the standpoint taken by each author and with the hope of reaching a consensus. Although divergence of viewpoints emerges in the current responses, there is also considerable overlap: in general, it can be concluded that the amount of influence that attention exerts on multisensory integration depends on the current task as well as the prior knowledge and expectations of the observer. Moreover, stimulus properties such as reliability and salience also determine how open the processing is to the influence of attention.

    Conditioned sounds enhance visual processing

    This psychophysics study investigated whether prior auditory conditioning influences how a sound interacts with visual perception. In the conditioning phase, subjects were presented with three pure tones (conditioned stimuli, CS) that were paired with positive, negative or neutral unconditioned stimuli. As unconditioned reinforcers we employed pictures (highly pleasant, unpleasant and neutral) or monetary outcomes (+50 euro cents, −50 cents, 0 cents). In the subsequent visual selective attention paradigm, subjects were presented with near-threshold Gabors displayed in their left or right hemifield. Critically, the Gabors were presented in synchrony with one of the conditioned sounds, and subjects discriminated in which hemifield the Gabors appeared. Participants determined the location more accurately when the Gabors were presented in synchrony with positive relative to neutral sounds, irrespective of reinforcer type. Thus, previously rewarded relative to neutral sounds increased the bottom-up salience of the visual Gabors. Our results are the first demonstration that prior auditory conditioning is a potent mechanism to modulate the effect of sounds on visual perception.

    Attention controls multisensory perception via two distinct mechanisms at different levels of the cortical hierarchy

    To form a percept of the multisensory world, the brain needs to integrate signals from common sources weighted by their reliabilities and segregate those from independent sources. Previously, we have shown that anterior parietal cortices combine sensory signals into representations that take into account the signals’ causal structure (i.e., common versus independent sources) and their sensory reliabilities as predicted by Bayesian causal inference. The current study asks to what extent and how attentional mechanisms can actively control how sensory signals are combined for perceptual inference. In a pre- and postcueing paradigm, we presented observers with audiovisual signals at variable spatial disparities. Observers were precued to attend to auditory or visual modalities prior to stimulus presentation and postcued to report their perceived auditory or visual location. Combining psychophysics, functional magnetic resonance imaging (fMRI), and Bayesian modelling, we demonstrate that the brain moulds multisensory inference via two distinct mechanisms. Prestimulus attention to vision enhances the reliability and influence of visual inputs on spatial representations in visual and posterior parietal cortices. Poststimulus report determines how parietal cortices flexibly combine sensory estimates into spatial representations consistent with Bayesian causal inference. Our results show that distinct neural mechanisms control how signals are combined for perceptual inference at different levels of the cortical hierarchy.
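
    For readers unfamiliar with the modelling framework, the Python sketch below shows one common read-out of Bayesian causal inference (model averaging), in which the reported spatial estimate mixes the reliability-weighted fused estimate and the unisensory estimate according to the posterior probability of a common source. The function and values are illustrative assumptions, not the fitted model from this study; in the full model the common-source probability is itself inferred from the spatial disparity, the sensory reliabilities and a prior.

        # Bayesian causal inference read-out by model averaging (illustrative sketch).

        def bci_estimate(x_target, x_other, var_target, var_other, p_common):
            """Spatial estimate for the reported (task-relevant) modality."""
            r_t, r_o = 1.0 / var_target, 1.0 / var_other
            fused = (r_t * x_target + r_o * x_other) / (r_t + r_o)   # common-source estimate
            segregated = x_target                                    # independent-sources estimate
            return p_common * fused + (1.0 - p_common) * segregated

        # With a large audiovisual disparity, p_common is low and the estimate
        # stays close to the task-relevant cue:
        print(bci_estimate(x_target=0.0, x_other=12.0, var_target=4.0, var_other=4.0, p_common=0.2))   # 1.2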

    Audiovisual Moments in Time: A large-scale annotated dataset of audiovisual actions

    We present Audiovisual Moments in Time (AVMIT), a large-scale dataset of audiovisual action events. In an extensive annotation task, 11 participants labelled a subset of 3-second audiovisual videos from the Moments in Time dataset (MIT). For each trial, participants assessed whether the labelled audiovisual action event was present and whether it was the most prominent feature of the video. The dataset includes the annotation of 57,177 audiovisual videos, each independently evaluated by 3 of 11 trained participants. From this initial collection, we created a curated test set of 16 distinct action classes, with 60 videos each (960 videos). We also offer 2 sets of pre-computed audiovisual feature embeddings, using VGGish/YamNet for audio data and VGG16/EfficientNetB0 for visual data, thereby lowering the barrier to entry for audiovisual DNN research. We explored the advantages of AVMIT annotations and feature embeddings for improving performance on audiovisual event recognition. A series of 6 Recurrent Neural Networks (RNNs) were trained on either AVMIT-filtered audiovisual events or modality-agnostic events from MIT, and then tested on our audiovisual test set. In all RNNs, top-1 accuracy increased by 2.71-5.94% when training exclusively on audiovisual events, even outweighing a three-fold increase in training data. Additionally, we introduce the Supervised Audiovisual Correspondence (SAVC) task, whereby a classifier must discern whether audio and visual streams correspond to the same action label. We trained 6 RNNs on the SAVC task, with or without AVMIT-filtering, to explore whether AVMIT is helpful for cross-modal learning. In all RNNs, accuracy improved by 2.09-19.16% with AVMIT-filtered data. We anticipate that the newly annotated AVMIT dataset will serve as a valuable resource for research and comparative experiments involving computational models and human participants, specifically when addressing research questions where audiovisual correspondence is of critical importance.
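
    As a pointer for getting started with pre-computed embeddings of this kind, the Python sketch below trains a small GRU classifier on per-clip feature sequences. The file names, array shapes and hyperparameters are hypothetical assumptions, not the AVMIT training pipeline.

        # Hypothetical sketch: train a small RNN on pre-computed audiovisual embeddings.
        import numpy as np
        import tensorflow as tf

        # Assumed layout: one feature vector per timestep, audio and visual features concatenated.
        X = np.load("avmit_train_embeddings.npy")   # shape: (n_clips, n_timesteps, n_features)
        y = np.load("avmit_train_labels.npy")       # shape: (n_clips,), 16 action classes

        model = tf.keras.Sequential([
            tf.keras.layers.GRU(128, input_shape=X.shape[1:]),    # recurrent layer over timesteps
            tf.keras.layers.Dense(16, activation="softmax"),      # one unit per action class
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        model.fit(X, y, epochs=10, batch_size=32, validation_split=0.1)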

    Comparing TMS perturbations to occipital and parietal cortices in concurrent TMS-fMRI studies: methodological considerations

    Neglect and hemianopia are two neuropsychological syndromes that are associated with reduced awareness of visual signals in patients' contralesional hemifield. They offer the unique possibility to dissociate the contributions of retino-geniculate and retino-collicular circuitries in visual perception. Yet, insights from patient fMRI studies are limited by heterogeneity in lesion location and extent, long-term functional reorganization and behavioural compensation after stroke. Transcranial magnetic stimulation (TMS) has therefore been proposed as a complementary method to investigate the effect of transient perturbations on functional brain organization. This concurrent TMS-fMRI study applied TMS perturbations to occipital and parietal cortices with the aim of 'mimicking' neglect and hemianopia. Based on the challenges and interpretational limitations of our own study, we aim to provide tutorial guidance on how future studies should compare TMS perturbations of primary sensory and association areas, which are governed by distinct computational principles, neural dynamics and functional architecture.

    Distinct neural mechanisms and temporal constraints govern a cascade of audiotactile interactions

    Synchrony is a crucial cue indicating whether sensory signals are caused by a single source or by independent sources. In order to be integrated and produce multisensory behavioural benefits, signals must co-occur within a temporal integration window (TIW). Yet, the underlying neural determinants and mechanisms of integration across asynchronies remain unclear. This psychophysics and electroencephalography study investigated the temporal constraints of behavioural response facilitation and of neural interactions for evoked response potentials (ERPs), inter-trial coherence (ITC) and time-frequency (TF) power. Participants were presented with noise bursts, ‘taps to the face’, and their audiotactile (AT) combinations at seven asynchronies: 0, ±20, ±70 and ±500 ms. Behaviourally, we observed an inverted U-shaped function for AT response facilitation, which was maximal for synchronous AT stimulation and declined within a ≤70 ms TIW. For ERPs, we observed AT interactions at 110 ms for near-synchronous stimuli within a ≤20 ms TIW, and at 400 ms within a ≤70 ms TIW, consistent with behavioural response facilitation. By contrast, AT interactions for theta ITC and for ERPs at 200 ms post-stimulus were selective for ±70 ms asynchrony, potentially mediated via phase resetting. Finally, interactions for induced theta power and the alpha/beta power rebound emerged at 800-1100 ms across several asynchronies, including even the 500 ms auditory-leading asynchrony. In sum, we observed neural interactions that were confined to the behavioural TIW, extended beyond it, or were specific to the ±70 ms asynchrony. This diversity of temporal profiles and constraints demonstrates that multisensory integration unfolds in a cascade of interactions that are governed by distinct neural mechanisms.
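
    For reference, inter-trial coherence is commonly computed as the length of the mean resultant vector of single-trial phases at each time-frequency point. The NumPy sketch below assumes complex time-frequency coefficients (e.g. from a wavelet transform) of shape (trials, frequencies, times); it illustrates the measure itself, not the study's analysis pipeline.

        # Illustrative computation of inter-trial coherence (ITC).
        import numpy as np

        def inter_trial_coherence(tf_coeffs):
            """tf_coeffs: complex array of shape (n_trials, n_freqs, n_times)."""
            phases = tf_coeffs / np.abs(tf_coeffs)     # unit-length phase vectors per trial
            return np.abs(phases.mean(axis=0))         # 1 = perfect phase alignment, ~0 = random

        # With random phases, ITC stays small (about 1/sqrt(n_trials)):
        rng = np.random.default_rng(0)
        fake = rng.standard_normal((100, 20, 200)) + 1j * rng.standard_normal((100, 20, 200))
        print(inter_trial_coherence(fake).mean())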