
    Evaluation of auditory-visual speech perception in individuals diagnosed with dementia of the Alzheimer's type

    Auditory-visual speech perception testing was completed using word- and consonant-level stimuli in individuals with known degrees of dementia of the Alzheimer's type. Correlations between the cognitive measures and the speech perception measures (A-only, V-only, AV, VE, or AE) did not reveal significant relationships.
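
    The abstract refers to A-only, V-only, AV, VE, and AE measures without defining the enhancement scores. A common convention, assumed here rather than stated in the abstract, normalizes the audiovisual gain by the available room for improvement: VE = (AV - A) / (1 - A) and AE = (AV - V) / (1 - V), with proportion-correct scores. A minimal sketch under that assumption:

    # Minimal sketch of visual-enhancement (VE) and auditory-enhancement (AE)
    # scores, assuming the common normalized-gain definitions
    #   VE = (AV - A) / (1 - A),  AE = (AV - V) / (1 - V)
    # with proportion-correct inputs; the abstract does not spell these out.

    def visual_enhancement(a_only: float, av: float) -> float:
        """Gain from adding vision, normalized by the room for improvement over A-only."""
        return (av - a_only) / (1.0 - a_only) if a_only < 1.0 else 0.0

    def auditory_enhancement(v_only: float, av: float) -> float:
        """Gain from adding audition, normalized by the room for improvement over V-only."""
        return (av - v_only) / (1.0 - v_only) if v_only < 1.0 else 0.0

    # Example with hypothetical proportion-correct scores for one listener.
    print(visual_enhancement(a_only=0.55, av=0.80))    # ~0.56
    print(auditory_enhancement(v_only=0.20, av=0.80))  # 0.75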

    The Neurobiology of Audiovisual Integration: A Voxel-Based Lesion Symptom Mapping Study

    Audiovisual (AV) integration is a fundamental component of face-to-face communication. Visual cues generally aid auditory comprehension of communicative intent through our innate ability to “fuse” auditory and visual information. However, our ability for multisensory integration can be affected by damage to the brain. Previous neuroimaging studies have indicated the superior temporal sulcus (STS) as the center for AV integration, while others suggest inferior frontal and motor regions. However, few studies have analyzed the effect of stroke or other brain damage on multisensory integration in humans. The present study examines the effect of lesion location on auditory and AV speech perception through behavioral and structural imaging methodologies in 41 left-hemisphere participants with chronic focal cerebral damage. Participants completed two behavioral tasks of speech perception: an auditory speech perception task and a classic McGurk paradigm measuring congruent (auditory and visual stimuli match) and incongruent (auditory and visual stimuli do not match, creating a “fused” percept of a novel stimulus) AV speech perception. Overall, participants performed well above chance on both tasks. Voxel-based lesion symptom mapping (VLSM) across all 41 participants identified several regions as critical for speech perception depending on trial type. Heschl’s gyrus and the supramarginal gyrus were identified as critical for auditory speech perception, the basal ganglia were critical for speech perception in AV congruent trials, and the middle temporal gyrus/STS were critical in AV incongruent trials. VLSM analyses of the AV incongruent trials were used to further clarify the origin of “errors”, i.e., lack of fusion. Auditory capture (auditory stimulus) responses were attributed to visual processing deficits caused by lesions in the posterior temporal lobe, whereas visual capture (visual stimulus) responses were attributed to lesions in the anterior temporal cortex, including the temporal pole, which is widely considered to be an amodal semantic hub. The implication of anterior temporal regions in AV integration is novel and warrants further study. The behavioral and VLSM results are discussed in relation to previous neuroimaging and case-study evidence; broadly, our findings coincide with previous work indicating that multisensory superior temporal cortex, not frontal motor circuits, is critical for AV integration.
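
    Voxel-based lesion symptom mapping, as used above, compares behavior between patients with and without damage at each voxel. The sketch below shows only the general shape of such an analysis (a voxelwise Welch t-test over binary lesion masks); the variable names, minimum-lesion threshold, and random example data are illustrative assumptions rather than details from the thesis, and no correction for multiple comparisons is included.

    # Rough sketch of a VLSM-style analysis: at each voxel, compare the
    # behavioral scores of patients whose lesion covers that voxel with those
    # of patients whose lesion spares it.
    import numpy as np
    from scipy import stats

    def vlsm_tmap(lesion_masks: np.ndarray, scores: np.ndarray, min_lesions: int = 5) -> np.ndarray:
        """lesion_masks: (n_patients, n_voxels) binary; scores: (n_patients,) behavior."""
        n_patients, n_voxels = lesion_masks.shape
        tmap = np.full(n_voxels, np.nan)
        for v in range(n_voxels):
            lesioned = scores[lesion_masks[:, v] == 1]
            spared = scores[lesion_masks[:, v] == 0]
            if len(lesioned) >= min_lesions and len(spared) >= min_lesions:
                # Positive t: spared patients outperform lesioned patients at this voxel.
                tmap[v] = stats.ttest_ind(spared, lesioned, equal_var=False).statistic
        return tmap

    # Example on random data: 41 patients, a toy 1000-voxel grid.
    rng = np.random.default_rng(0)
    masks = (rng.random((41, 1000)) < 0.2).astype(int)
    scores = rng.random(41)
    print(np.nanmax(vlsm_tmap(masks, scores)))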

    The impact of automatic exaggeration of the visual articulatory features of a talker on the intelligibility of spectrally distorted speech

    Visual speech information plays a key role in supporting speech perception, especially when acoustic features are distorted or inaccessible. Recent research suggests that for spectrally distorted speech, the use of visual speech in auditory training improves not only subjects’ audiovisual speech recognition, but also their subsequent auditory-only speech recognition. Visual speech cues, however, can be affected by a number of facial visual signals that vary across talkers, such as lip emphasis and speaking style. In a previous study, we enhanced the visual speech videos used in perception training by automatically tracking and colouring a talker’s lips. This improved the subjects’ audiovisual and subsequent auditory speech recognition compared with those who were trained via unmodified videos or audio-only methods. In this paper, we report on two issues related to automatic exaggeration of the movement of the lips/mouth area. First, we investigate subjects’ ability to adapt to the conflict between the articulation energy in the visual signals and the vocal effort in the acoustic signals (since the acoustic signals remained unexaggerated). Second, we examine whether or not this visual exaggeration can improve subjects’ auditory and audiovisual speech recognition when used in perception training. To test this concept, we used spectrally distorted speech to train groups of listeners using four different training regimes: (1) audio only, (2) audiovisual, (3) audiovisual visually exaggerated, and (4) audiovisual visually exaggerated and lip-coloured. We used spectrally distorted speech (cochlear-implant-simulated speech) because the longer-term aim of our work is to employ these concepts in a training system for cochlear-implant (CI) users. The results suggest that after exposure to visually exaggerated speech, listeners were able to adapt to the conflicting audiovisual signals. In addition, subjects trained with enhanced visual cues (regimes 3 and 4) achieved better audiovisual recognition for a number of phoneme classes than those who were trained with unmodified visual speech (regime 2). There was, however, no evidence of an improvement in subsequent audio-only listening skills. The subjects’ adaptation to the conflicting audiovisual signals may have slowed auditory perceptual learning and impeded the ability of the visual speech to improve the training gains.
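
    Cochlear-implant-simulated speech of the kind described above is typically produced with a noise vocoder: the signal is split into a few frequency bands, each band’s amplitude envelope is extracted, and the envelopes modulate band-limited noise. The sketch below illustrates that general idea only; the band count, band edges, and filter settings are assumptions, since the paper’s actual vocoder parameters are not given here.

    # Illustrative noise vocoder (CI-simulation-style spectral distortion).
    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def noise_vocode(x: np.ndarray, fs: int, n_bands: int = 4,
                     f_lo: float = 100.0, f_hi: float = 5000.0) -> np.ndarray:
        edges = np.geomspace(f_lo, f_hi, n_bands + 1)               # log-spaced band edges
        env_lp = butter(4, 30.0, btype="low", fs=fs, output="sos")  # envelope smoother
        out = np.zeros_like(x, dtype=float)
        for lo, hi in zip(edges[:-1], edges[1:]):
            band = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
            speech_band = sosfiltfilt(band, x)
            envelope = sosfiltfilt(env_lp, np.abs(hilbert(speech_band)))
            noise_band = sosfiltfilt(band, np.random.randn(len(x)))
            out += np.clip(envelope, 0.0, None) * noise_band        # envelope-modulated noise
        return out / (np.max(np.abs(out)) + 1e-12)                  # normalize to avoid clipping

    # Example: vocode one second of a synthetic two-harmonic tone at 16 kHz.
    fs = 16000
    t = np.arange(fs) / fs
    x = np.sin(2 * np.pi * 150 * t) + 0.5 * np.sin(2 * np.pi * 300 * t)
    y = noise_vocode(x, fs)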

    Lexical and audiovisual bases of perceptual adaptation in speech


    Models of Speech Processing

    One of the fundamental questions about language is how listeners map the acoustic signal onto syllables, words, and sentences, resulting in understanding of speech. For normal listeners, this mapping is so effortless that one rarely stops to consider just how it takes place. However, studies of speech have shown that this acoustic signal contains a great deal of underlying complexity. A number of competing models seek to explain how these intricate processes work. Such models have often narrowed the problem to mapping the speech signal onto isolated words, setting aside the complexity of segmenting continuous speech. Continuous speech has presented a significant challenge for many models because of the high variability of the signal and the difficulties involved in resolving the signal into individual words. The importance of understanding speech becomes particularly apparent when neurological disease affects this seemingly basic ability. Lesion studies have explored impairments of speech sound processing to determine whether deficits occur in perceptual analysis of acoustic-phonetic information or in stored abstract phonological representations (e.g., Basso, Casati, & Vignolo, 1977; Blumstein, Cooper, Zurif, & Caramazza, 1977). Furthermore, researchers have attempted to determine in what ways underlying phonological/phonetic impairments may contribute to auditory comprehension deficits (Blumstein, Baker, & Goodglass, 1977). In this chapter, we discuss several psycholinguistic models of word recognition (the process of mapping the speech signal onto the lexicon) and outline how components of such models might correspond to the functional anatomy of the brain. We also relate evidence from brain lesion and brain activation studies to components of such models. We then present some approaches that deal with speech perception more generally, and touch on a few current topics of debate.
    Supported by the National Institutes of Health under grant NIH DC R01–3378 to the senior author (SLS).
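
    Since the chapter’s focus is how models map the speech signal onto isolated words, a toy example may help make the idea concrete. The sketch below shows the incremental narrowing that cohort-style word-recognition models describe: as each phoneme arrives, only lexical candidates consistent with the input so far remain active. The mini-lexicon and phoneme coding are invented for illustration and are not taken from the chapter.

    # Toy cohort-style word recognition: candidates are pruned phoneme by phoneme.
    LEXICON = {
        "captain": "k ae p t ih n".split(),
        "capital": "k ae p ih t ax l".split(),
        "cat":     "k ae t".split(),
        "dog":     "d ao g".split(),
    }

    def cohort(input_phonemes):
        """Yield the shrinking candidate set after each successive phoneme."""
        candidates = set(LEXICON)
        for i, ph in enumerate(input_phonemes):
            candidates = {w for w in candidates
                          if len(LEXICON[w]) > i and LEXICON[w][i] == ph}
            yield ph, sorted(candidates)

    for ph, active in cohort("k ae p t ih n".split()):
        print(ph, active)
    # k  -> ['capital', 'captain', 'cat']
    # ae -> ['capital', 'captain', 'cat']
    # p  -> ['capital', 'captain']
    # t  -> ['captain']   (only one candidate remains: the word's uniqueness point)
    # (the remaining phonemes leave the single candidate unchanged)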

    Where on the face do we look during phonemic restoration: An eye-tracking study

    Face-to-face communication typically involves audio and visual components to the speech signal. To examine the effect of task demands on gaze patterns in response to a speaking face, adults participated in two eye-tracking experiments with an audiovisual condition (articulatory information from the mouth was visible) and a pixelated condition (articulatory information was not visible). Further, task demands were manipulated by having listeners respond in a passive (no response) or an active (button-press response) context. The active experiment required participants to discriminate between speech stimuli and was designed to mimic environmental situations that require one to use visual information to disambiguate a speaker’s message, simulating different listening conditions in real-world settings. Stimuli included a clear exemplar of the syllable /ba/ and a second exemplar in which the formant information of the initial consonant was reduced, creating an /a/-like token. Consistent with our hypothesis, results revealed that fixations to the mouth were greatest in the audiovisual active experiment, and visual articulatory information led to a phonemic restoration effect for the /a/ speech token. In the pixelated condition, participants fixated on the eyes, and discrimination of the deviant token within the active experiment was significantly greater than in the audiovisual condition. These results suggest that when required to disambiguate changes in speech, adults may look to the mouth for additional cues to support processing when that information is available.
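
    The fixation findings above imply an area-of-interest (AOI) analysis over facial regions. As a rough sketch of what that involves, the code below classifies fixations into eyes and mouth rectangles and computes the proportion of total fixation time in each; the AOI coordinates and fixation format are illustrative assumptions, not the study’s actual parameters.

    # Sketch of an AOI analysis: proportion of fixation time on eyes vs. mouth.
    from typing import Dict, List, Tuple

    Fixation = Tuple[float, float, float]                # (x, y, duration_ms)
    AOIS: Dict[str, Tuple[float, float, float, float]] = {
        "eyes":  (200.0, 120.0, 440.0, 200.0),           # (x_min, y_min, x_max, y_max)
        "mouth": (260.0, 320.0, 380.0, 400.0),
    }

    def aoi_proportions(fixations: List[Fixation]) -> Dict[str, float]:
        totals = {name: 0.0 for name in AOIS}
        grand_total = sum(dur for _, _, dur in fixations) or 1.0
        for x, y, dur in fixations:
            for name, (x0, y0, x1, y1) in AOIS.items():
                if x0 <= x <= x1 and y0 <= y <= y1:
                    totals[name] += dur
        return {name: t / grand_total for name, t in totals.items()}

    # Example: two fixations on the mouth region, one on the eyes region.
    print(aoi_proportions([(300, 350, 250), (310, 360, 300), (320, 150, 450)]))
    # {'eyes': 0.45, 'mouth': 0.55}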