
    Contributions of local speech encoding and functional connectivity to audio-visual speech perception

    Seeing a speaker’s face enhances speech intelligibility in adverse environments. We investigated the underlying network mechanisms by quantifying local speech representations and directed connectivity in MEG data obtained while human participants listened to speech of varying acoustic SNR and visual context. During high acoustic SNR, speech encoding by temporally entrained brain activity was strong in temporal and inferior frontal cortex, whereas during low SNR strong entrainment emerged in premotor and superior frontal cortex. These changes in local encoding were accompanied by changes in directed connectivity along the ventral stream and the auditory-premotor axis. Importantly, the behavioral benefit arising from seeing the speaker’s face was not predicted by changes in local encoding but rather by enhanced functional connectivity between temporal and inferior frontal cortex. Our results demonstrate a role of auditory-frontal interactions in visual speech representations and suggest that functional connectivity along the ventral pathway facilitates speech comprehension in multisensory environments.
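
    The entrainment measure referenced above can be approximated by spectral coherence between the speech envelope and a brain signal. Below is a minimal sketch of that idea on simulated data; the sampling rate, frequency band, and signal model are illustrative assumptions, not the authors' pipeline.

```python
# Minimal sketch: "entrainment" as spectral coherence between a speech
# envelope and a (simulated) MEG sensor time series. All parameters and
# signals are illustrative assumptions only.
import numpy as np
from scipy.signal import coherence

fs = 200.0                      # Hz, assumed sampling rate
t = np.arange(0, 60, 1 / fs)    # 60 s of data

rng = np.random.default_rng(0)
envelope = np.abs(rng.standard_normal(t.size))       # stand-in speech envelope
meg = 0.5 * envelope + rng.standard_normal(t.size)   # sensor tracking the envelope

# Magnitude-squared coherence in the delta/theta range (~1-8 Hz),
# where speech-tracking effects are typically reported.
f, coh = coherence(envelope, meg, fs=fs, nperseg=int(4 * fs))
band = (f >= 1) & (f <= 8)
print(f"mean 1-8 Hz speech-brain coherence: {coh[band].mean():.3f}")
```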

    Leading and following: Noise differently affects semantic and acoustic processing during naturalistic speech comprehension

    Despite the distortion of speech signals caused by unavoidable noise in daily life, our ability to comprehend speech in noisy environments is relatively stable. However, the neural mechanisms underlying reliable speech-in-noise comprehension remain to be elucidated. The present study investigated the neural tracking of acoustic and semantic speech information during noisy naturalistic speech comprehension. Participants listened to narrative audio recordings mixed with spectrally matched stationary noise at three signal-to-noise ratio (SNR) levels (no noise, 3 dB, -3 dB), and 60-channel electroencephalography (EEG) signals were recorded. A temporal response function (TRF) method was employed to derive event-related-like responses to the continuous speech stream at both the acoustic and the semantic levels. Whereas the amplitude envelope of the naturalistic speech was taken as the acoustic feature, word entropy and word surprisal were extracted using natural language processing methods as two semantic features. Theta-band frontocentral TRF responses to the acoustic feature were observed at around 400 ms following speech fluctuation onset at all three SNR levels, and the response latencies were more delayed with increasing noise. Delta-band frontal TRF responses to the semantic feature of word entropy were observed at around 200 to 600 ms preceding speech fluctuation onset at all three SNR levels. The response latencies shifted earlier (became more leading) with increasing noise and decreasing speech comprehension and intelligibility. While the following responses to speech acoustics were consistent with previous studies, our study revealed the robustness of leading responses to speech semantics, which suggests a possible predictive mechanism at the semantic level for maintaining reliable speech comprehension in noisy environments.
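
    As a rough illustration of the TRF approach, the sketch below estimates a response function by ridge regression of a simulated EEG channel onto time-lagged copies of a stimulus feature. The lag range, regularisation value, and synthetic data are assumptions for demonstration only.

```python
# Minimal sketch of a temporal response function (TRF): ridge regression of
# EEG onto time-lagged copies of a stimulus feature (e.g. the envelope).
import numpy as np

fs = 100                                  # Hz
n = 30 * fs                               # 30 s of data
rng = np.random.default_rng(1)
stim = rng.standard_normal(n)             # stimulus feature (e.g. envelope)

lags = np.arange(0, int(0.5 * fs))        # lags spanning roughly 0-500 ms
true_trf = np.exp(-lags / 10.0) * np.sin(lags / 5.0)
eeg = np.convolve(stim, true_trf)[:n] + rng.standard_normal(n)

# Lagged design matrix X[t, k] = stim[t - k].
X = np.zeros((n, lags.size))
for k in lags:
    X[k:, k] = stim[:n - k]

lam = 1e2                                 # ridge parameter (would be cross-validated)
trf = np.linalg.solve(X.T @ X + lam * np.eye(lags.size), X.T @ eeg)
print("recovered TRF peak lag (ms):", 1000 * np.argmax(np.abs(trf)) / fs)
```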

    Applauding with Closed Hands: Neural Signature of Action-Sentence Compatibility Effects

    BACKGROUND: Behavioral studies have provided evidence for an action-sentence compatibility effect (ACE) that suggests a coupling of motor mechanisms and action-sentence comprehension. When both processes are concurrent, the action sentence primes the actual movement, and simultaneously, the action affects comprehension. The aim of the present study was to investigate brain markers of the bidirectional impact of language comprehension and motor processes. METHODOLOGY/PRINCIPAL FINDINGS: Participants listened to sentences describing an action that involved an open hand, a closed hand, or no manual action. Each participant was asked to press a button to indicate his/her understanding of the sentence. Each participant was assigned a hand-shape, either closed or open, which had to be used to activate the button. There were two groups (depending on the assigned hand-shape) and three categories (compatible, incompatible and neutral) defined according to the compatibility between the response and the sentence. ACEs were found in both groups. Brain markers of semantic processing exhibited an N400-like component around the Cz electrode position. This component distinguished between compatible and incompatible conditions, with a greater negative deflection for the incompatible condition. Motor responses elicited a motor potential (MP) and a re-afferent potential (RAP), both of which were enhanced in the compatible condition. CONCLUSIONS/SIGNIFICANCE: The present findings provide the first ACE cortical measurements of semantic processing and the motor response. N400-like effects suggest that incompatibility with motor processes interferes with sentence comprehension in a semantic fashion. Modulation of motor potentials (MP and RAP) revealed a multimodal semantic facilitation of the motor response. Both results provide neural evidence of an action-sentence bidirectional relationship. Our results suggest that the ACE is not an epiphenomenal post-sentence comprehension process. In contrast, motor-language integration occurring at verb onset supports a genuine and ongoing brain motor-language interaction.
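
    For orientation, an N400-like effect of this kind is typically quantified as the difference between condition-averaged ERPs at a single electrode. The sketch below shows that computation on simulated epochs; the electrode, time window, and amplitudes are illustrative assumptions, not the authors' analysis code.

```python
# Minimal sketch: an N400-like effect as the difference between condition-
# averaged ERPs at one electrode (e.g. Cz), using simulated epochs.
import numpy as np

fs = 250                                   # Hz
t = np.arange(-0.2, 0.8, 1 / fs)           # epoch: -200 to 800 ms
rng = np.random.default_rng(2)

def simulate_epochs(n400_amp, n_trials=40):
    # Negative deflection around 400 ms plus trial-by-trial noise.
    n400 = n400_amp * np.exp(-((t - 0.4) ** 2) / (2 * 0.05 ** 2))
    return n400 + rng.standard_normal((n_trials, t.size)) * 2.0

compatible = simulate_epochs(n400_amp=-1.0)
incompatible = simulate_epochs(n400_amp=-3.0)   # larger negativity expected

# Difference wave: incompatible minus compatible, mean amplitude 300-500 ms.
diff = incompatible.mean(axis=0) - compatible.mean(axis=0)
win = (t >= 0.3) & (t <= 0.5)
print(f"N400 effect (mean 300-500 ms): {diff[win].mean():.2f}")
```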

    Predictive Top-Down Integration of Prior Knowledge during Speech Perception

    A striking feature of human perception is that our subjective experience depends not only on sensory information from the environment but also on our prior knowledge or expectations. The precise mechanisms by which sensory information and prior knowledge are integrated remain unclear, with longstanding disagreement concerning whether integration is strictly feedforward or whether higher-level knowledge influences sensory processing through feedback connections. Here we used concurrent EEG and MEG recordings to determine how sensory information and prior knowledge are integrated in the brain during speech perception. We manipulated listeners' prior knowledge of speech content by presenting matching, mismatching, or neutral written text before a degraded (noise-vocoded) spoken word. When speech conformed to prior knowledge, subjective perceptual clarity was enhanced. This enhancement in clarity was associated with a spatiotemporal profile of brain activity uniquely consistent with a feedback process: activity in the inferior frontal gyrus was modulated by prior knowledge before activity in lower-level sensory regions of the superior temporal gyrus. In parallel, we parametrically varied the level of speech degradation, and therefore the amount of sensory detail, so that changes in neural responses attributable to sensory information and prior knowledge could be directly compared. Although sensory detail and prior knowledge both enhanced speech clarity, they had an opposite influence on the evoked response in the superior temporal gyrus. We argue that these data are best explained within the framework of predictive coding, in which sensory activity is compared with top-down predictions and only unexplained activity is propagated through the cortical hierarchy.
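
    Noise vocoding, the degradation method used here, divides speech into frequency bands, extracts each band's amplitude envelope, and uses it to modulate band-limited noise; fewer bands yield more degraded speech. A minimal sketch follows, with band edges and filter settings as assumptions rather than the study's exact parameters.

```python
# Minimal sketch of noise vocoding: filter speech into bands, take each
# band's envelope, and apply it to band-limited noise.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, band_edges=(100, 500, 1200, 2500, 4000)):
    rng = np.random.default_rng(3)
    noise = rng.standard_normal(speech.size)
    out = np.zeros_like(speech)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)
        env = np.abs(hilbert(band))              # band amplitude envelope
        out += env * sosfiltfilt(sos, noise)     # envelope-modulated noise band
    return out / np.max(np.abs(out))             # crude normalisation

fs = 16000
speech = np.random.default_rng(4).standard_normal(fs)  # stand-in for real audio
vocoded = noise_vocode(speech, fs)
print(vocoded.shape)
```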

    Neural dynamics of selective attention to speech in noise

    This thesis investigates how the neural system instantiates selective attention to speech in challenging acoustic conditions, such as spectral degradation and the presence of background noise. Four studies using behavioural measures and magneto- and electroencephalography (M/EEG) recordings were conducted in younger (20–30 years) and older participants (60–80 years). The overall results can be summarized as follows. An EEG experiment demonstrated that slow negative potentials reflect participants’ enhanced allocation of attention when they are faced with more degraded acoustics. This basic mechanism of attention allocation was preserved at an older age. A follow-up experiment in younger listeners indicated that attention allocation can be further enhanced in a context of increased task-relevance through monetary incentives. A subsequent study focused on brain oscillatory dynamics in a demanding speech comprehension task. The power of neural alpha oscillations (~10 Hz) reflected a decrease in demands on attention with increasing acoustic detail and, critically, also with increasing predictiveness of the upcoming speech content. Older listeners’ behavioural responses and alpha power dynamics were more strongly affected by acoustic detail than younger listeners’, indicating that selective attention at an older age is particularly dependent on the sensory input signal. An additional analysis of listeners’ neural phase-locking to the temporal envelopes of attended speech and unattended background speech revealed that younger and older listeners show a similar segregation of attended and unattended speech on a neural level. A dichotic listening MEG experiment investigated how neural alpha oscillations support selective attention to speech. Lateralized alpha power modulations in parietal and auditory cortex regions predicted listeners’ focus of attention (i.e., left vs right). This suggests that alpha oscillations implement an attentional filter mechanism to enhance the signal and to suppress noise. A final behavioural study asked whether acoustic and semantic aspects of task-irrelevant speech determine how much it interferes with attention to task-relevant speech. Results demonstrated that younger and older adults were more distracted when acoustic detail of irrelevant speech was enhanced, whereas predictiveness of irrelevant speech had no effect. All findings of this thesis are integrated in an initial framework for the role of attention in speech comprehension under demanding acoustic conditions.
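
    The lateralised alpha power effect described above is commonly summarised with a lateralisation index over pooled left- and right-hemisphere sensors. The sketch below computes such an index on simulated signals; the band limits, index definition, and sensor pooling are conventional assumptions, not the thesis code.

```python
# Minimal sketch: an alpha lateralisation index from band-limited power over
# left vs right posterior sensors, using simulated data.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 250
rng = np.random.default_rng(5)
left = rng.standard_normal(10 * fs)    # pooled left-hemisphere sensor signal
right = rng.standard_normal(10 * fs)   # pooled right-hemisphere sensor signal

# 8-12 Hz band-pass, then power from the analytic signal's amplitude.
sos = butter(4, [8, 12], btype="bandpass", fs=fs, output="sos")
def alpha_power(x):
    return np.mean(np.abs(hilbert(sosfiltfilt(sos, x))) ** 2)

p_left, p_right = alpha_power(left), alpha_power(right)
ali = (p_right - p_left) / (p_right + p_left)   # >0: alpha higher on the right
print(f"alpha lateralisation index: {ali:+.3f}")
```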

    Rapid invisible frequency tagging reveals nonlinear integration of auditory and visual information

    During communication in real-life settings, the brain integrates information from auditory and visual modalities to form a unified percept of our environment. In the current magnetoencephalography (MEG) study, we used rapid invisible frequency tagging (RIFT) to generate steady-state evoked fields and investigated the integration of audiovisual information in a semantic context. We presented participants with videos of an actress uttering action verbs (auditory; tagged at 61 Hz) accompanied by a gesture (visual; tagged at 68 Hz, using a projector with a 1440 Hz refresh rate). Integration ease was manipulated by auditory factors (clear/degraded speech) and visual factors (congruent/incongruent gesture). We identified MEG spectral peaks at the individual (61/68 Hz) tagging frequencies. We furthermore observed a peak at the intermodulation frequency of the auditory and visually tagged signals (f_visual − f_auditory = 7 Hz), specifically when integration was easiest (i.e., when speech was clear and accompanied by a congruent gesture). This intermodulation peak is a signature of nonlinear audiovisual integration, and was strongest in left inferior frontal gyrus and left temporal regions, areas known to be involved in speech-gesture integration. The enhanced power at the intermodulation frequency thus reflects the ease of integration and demonstrates that speech-gesture information interacts in higher-order language areas. Furthermore, we provide a proof-of-principle of the use of RIFT to study the integration of audiovisual stimuli, in relation to, for instance, semantic context.
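
    The logic of the intermodulation analysis can be demonstrated in a few lines: summing two tagged signals produces no energy at the difference frequency, whereas a multiplicative (nonlinear) interaction does. The sketch below uses the abstract's tagging frequencies with an otherwise illustrative signal model.

```python
# Minimal sketch: nonlinear (multiplicative) mixing of two frequency-tagged
# signals creates power at the intermodulation frequency f_visual - f_auditory.
import numpy as np

fs = 1000.0
t = np.arange(0, 10, 1 / fs)
aud = np.sin(2 * np.pi * 61 * t)          # auditory tag (61 Hz)
vis = np.sin(2 * np.pi * 68 * t)          # visual tag (68 Hz)

linear = aud + vis                        # pure summation: no intermodulation
nonlinear = aud + vis + 0.5 * aud * vis   # multiplicative interaction term

freqs = np.fft.rfftfreq(t.size, 1 / fs)
def power_at(x, f):
    spec = np.abs(np.fft.rfft(x)) ** 2
    return spec[np.argmin(np.abs(freqs - f))]

for name, sig in [("linear", linear), ("nonlinear", nonlinear)]:
    print(name, "power at 7 Hz:", f"{power_at(sig, 7.0):.1f}")
```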

    Hearing in dementia: defining deficits and assessing impact

    The association between hearing impairment and dementia has emerged as a major public health challenge, with significant opportunities for earlier diagnosis, treatment and prevention. However, the nature of this association has not been defined. We hear with our brains, particularly within the complex soundscapes of everyday life: neurodegenerative pathologies target the auditory brain and are therefore predicted to damage hearing function early and profoundly. Here I present evidence for this proposition, based on structural and functional features of auditory brain organisation that confer vulnerability to neurodegeneration, the extensive, reciprocal interplay between ‘peripheral’ and ‘central’ hearing dysfunction, and recently characterised auditory signatures of canonical neurodegenerative dementias (Alzheimer’s disease and frontotemporal dementia). In chapter 3, I examine pure tone audiometric thresholds in AD and FTD syndromes and explore the functional interplay between the auditory brain and auditory periphery by assessing the contribution of auditory cognitive factors to pure tone detection. In chapter 4, I develop this further by examining the processing of degraded speech signals, leveraging the increased importance of top-down integrative and predictive mechanisms in resolving impoverished bottom-up sensory encoding. In chapter 5, I use a more discrete test of phonological processing to focus on a specific brain region that is an early target in logopenic aphasia, to explore the potential of auditory cognitive tests as disease-specific functional biomarkers. Finally, in chapter 6, I use auditory symptom questionnaires to capture real-world hearing in daily life amongst patients with dementia as well as their carers, and measure how this correlates with audiometric performance and degraded speech processing. I call for a clinical assessment of real-world hearing in these diseases that moves beyond pure tone perception to the development of novel auditory ‘cognitive stress tests’ and proximity markers for the early diagnosis of dementia, and management strategies that harness retained auditory plasticity.

    Brain activity underlying the recovery of meaning from degraded speech: a functional near-infrared spectroscopy (fNIRS) study

    The purpose of this study was to establish whether functional near-infrared spectroscopy (fNIRS), an emerging brain-imaging technique based on optical principles, is suitable for studying the brain activity that underlies effortful listening. In an event-related fNIRS experiment, normally hearing adults listened to sentences that were either clear or degraded (noise vocoded). These sentences were presented simultaneously with a non-speech distractor, and on each trial participants were instructed to attend either to the speech or to the distractor. The primary region of interest for the fNIRS measurements was the left inferior frontal gyrus (LIFG), a cortical region involved in higher-order language processing. The fNIRS results confirmed findings previously reported in the functional magnetic resonance imaging (fMRI) literature. Firstly, the LIFG exhibited an elevated response to degraded versus clear speech, but only when attention was directed towards the speech. This attention-dependent increase in frontal brain activation may be a neural marker for effortful listening. Secondly, during attentive listening to degraded speech, the haemodynamic response peaked significantly later in the LIFG than in superior temporal cortex, possibly reflecting the engagement of working memory to help reconstruct the meaning of degraded sentences. The homologous region in the right hemisphere may play an equivalent role to the LIFG in some left-handed individuals. In conclusion, fNIRS holds promise as a flexible tool to examine the neural signature of effortful listening.
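
    The latency comparison reported here amounts to locating the peak of each channel's trial-averaged haemodynamic response. A minimal sketch on simulated gamma-shaped responses follows; the sampling rate, response shapes, and channel labels are assumptions, not the study's analysis.

```python
# Minimal sketch: comparing haemodynamic response peak latencies between two
# channels by locating the maximum of each trial-averaged response.
import numpy as np
from scipy.stats import gamma

fs = 10.0                                  # Hz, typical fNIRS sampling rate
t = np.arange(0, 20, 1 / fs)               # 20 s post-stimulus window

def hrf(peak_s):
    # Gamma-shaped response; with scale=1 the mode of gamma(a) is a-1,
    # so shape peak_s + 1 puts the peak at peak_s seconds.
    h = gamma.pdf(t, a=peak_s + 1.0, scale=1.0)
    return h / h.max()

rng = np.random.default_rng(6)
stg = hrf(6.0) + rng.standard_normal(t.size) * 0.05    # superior temporal channel
lifg = hrf(8.0) + rng.standard_normal(t.size) * 0.05   # later-peaking LIFG channel

for name, resp in [("STG", stg), ("LIFG", lifg)]:
    print(f"{name} peak latency: {t[np.argmax(resp)]:.1f} s")
```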