Evaluation of auditory-visual speech perception in individuals diagnosed with dementia of the Alzheimer's type
Auditory-visual speech perception testing was completed using word- and consonant-level stimuli in individuals with known degrees of dementia of the Alzheimer's type. The correlations between the cognitive measures and the speech perception measures (A-only, V-only, AV, VE or AE) did not reveal significant relationships.
The Neurobiology of Audiovisual Integration: A Voxel-Based Lesion Symptom Mapping Study
Abstract: Audiovisual (AV) integration is a fundamental component of face-to-face communication. Visual cues generally aid auditory comprehension of communicative intent through our innate ability to "fuse" auditory and visual information. However, our ability for multisensory integration can be affected by damage to the brain. Previous neuroimaging studies have indicated the superior temporal sulcus (STS) as the center for AV integration, while others suggest inferior frontal and motor regions. However, few studies have analyzed the effect of stroke or other brain damage on multisensory integration in humans. The present study examines the effect of lesion location on auditory and AV speech perception through behavioral and structural imaging methodologies in 41 left-hemisphere participants with chronic focal cerebral damage. Participants completed two behavioral tasks of speech perception: an auditory speech perception task and a classic McGurk paradigm measuring congruent (auditory and visual stimuli match) and incongruent (auditory and visual stimuli do not match, creating a "fused" percept of a novel stimulus) AV speech perception. Overall, participants performed well above chance on both tasks. Voxel-based lesion symptom mapping (VLSM) across all 41 participants identified several regions as critical for speech perception depending on trial type. Heschl's gyrus and the supramarginal gyrus were identified as critical for auditory speech perception, the basal ganglia were critical for speech perception in AV congruent trials, and the middle temporal gyrus/STS were critical in AV incongruent trials. VLSM analyses of the AV incongruent trials were used to further clarify the origin of "errors", i.e. lack of fusion. Auditory capture (auditory stimulus) responses were attributed to visual processing deficits caused by lesions in the posterior temporal lobe, whereas visual capture (visual stimulus) responses were attributed to lesions in the anterior temporal cortex, including the temporal pole, which is widely considered to be an amodal semantic hub. The implication of anterior temporal regions in AV integration is novel and warrants further study. The behavioral and VLSM results are discussed in relation to previous neuroimaging and case-study evidence; broadly, our findings coincide with previous work indicating that multisensory superior temporal cortex, not frontal motor circuits, is critical for AV integration.
Dissertation/Thesis. Masters Thesis, Communication Disorders, 201
The impact of automatic exaggeration of the visual articulatory features of a talker on the intelligibility of spectrally distorted speech
Visual speech information plays a key role in supporting speech perception, especially when acoustic features are distorted or inaccessible. Recent research suggests that for spectrally distorted speech, the use of visual speech in auditory training improves not only subjects' audiovisual speech recognition, but also their subsequent auditory-only speech recognition. Visual speech cues, however, can be affected by a number of facial visual signals that vary across talkers, such as lip emphasis and speaking style. In a previous study, we enhanced the visual speech videos used in perception training by automatically tracking and colouring a talker's lips. This improved the subjects' audiovisual and subsequent auditory speech recognition compared with those who were trained via unmodified videos or audio-only methods. In this paper, we report on two issues related to automatic exaggeration of the movement of the lips/mouth area. First, we investigate subjects' ability to adapt to the conflict between the articulation energy in the visual signals and the vocal effort in the acoustic signals (since the acoustic signals remained unexaggerated). Second, we examine whether or not this visual exaggeration can improve subjects' auditory and audiovisual speech recognition when used in perception training. To test this concept, we used spectrally distorted speech to train groups of listeners under four different training regimes: (1) audio only, (2) audiovisual, (3) audiovisual visually exaggerated, and (4) audiovisual visually exaggerated and lip-coloured. We used spectrally distorted speech (cochlear-implant-simulated speech) because the longer-term aim of our work is to employ these concepts in a training system for cochlear-implant (CI) users.
The results suggest that after exposure to visually exaggerated speech, listeners were able to adapt to the conflicting audiovisual signals. In addition, subjects trained with enhanced visual cues (regimes 3 and 4) achieved better audiovisual recognition for a number of phoneme classes than those who were trained with unmodified visual speech (regime 2). There was, however, no evidence of an improvement in subsequent audio-only listening skills. The subjects' adaptation to the conflicting audiovisual signals may have slowed auditory perceptual learning and impeded the ability of the visual speech to improve the training gains.
The influence of visual information on the perception of auditory speech in quiet and noise
Audio-visual (AV) integration involves the combining of auditory and visual information, which is often required for everyday face-to-face communication. Speech perception becomes difficult in situations where it is harder to hear the voice of the speaker. When the ability to identify speech in noise is reduced, people with normal hearing improve with the addition of visual information, that is, when they can see the talker's face (Sumby & Pollack, 1954). Exactly how visual information is used in background noise is not well understood. The goal of the thesis was to understand the influence of visual information on auditory speech perception using a well-known measure of AV integration, the McGurk effect. Four experiments are reported which aimed to a) explore the use of the McGurk effect as a measure of AV integration, b) understand the influence of visual information in quiet and noise, and how auditory and visual information interact when one or both of the modalities is degraded, and c) provide insight into theories of AV integration through behavioural measures. The main findings were that 1) instances of the McGurk effect are influenced by the type of task used, and vary according to different stimuli and participants, 2) the McGurk effect can still be perceived even when the visual stimulus is highly degraded, although the illusion decreases as visual blur increases, 3) fixating the mouth is not necessary for perceiving the McGurk effect, and 4) visual benefit increases as the clarity of the visual stimulus increases. Overall, the findings suggest that visual information is of most benefit when it is clear; looking at the mouth is not necessary for AV integration in quiet but increases the likelihood of successful integration when speech is presented in auditory noise.
Models of Speech Processing
One of the fundamental questions about language is how listeners map the acoustic signal onto
syllables, words, and sentences, resulting in understanding of speech. For normal listeners, this
mapping is so effortless that one rarely stops to consider just how it takes place. However, studies
of speech have shown that this acoustic signal contains a great deal of underlying complexity.
A number of competing models seek to explain how these intricate processes work. Such models
have often narrowed the problem to mapping the speech signal onto isolated words, setting aside
the complexity of segmenting continuous speech. Continuous speech has presented a significant
challenge for many models because of the high variability of the signal and the difficulties involved
in resolving the signal into individual words.
The importance of understanding speech becomes particularly apparent when neurological
disease affects this seemingly basic ability. Lesion studies have explored impairments of speech
sound processing to determine whether deficits occur in perceptual analysis of acoustic-phonetic
information or in stored abstract phonological representations (e.g., Basso, Casati, & Vignolo, 1977;
Blumstein, Cooper, Zurif, & Caramazza, 1977). Furthermore, researchers have attempted to determine
in what ways underlying phonological/phonetic impairments may contribute to auditory
comprehension deficits (Blumstein, Baker, & Goodglass, 1977).
In this chapter, we discuss several psycholinguistic models of word recognition (the process of
mapping the speech signal onto the lexicon), and outline how components of such models might
correspond to the functional anatomy of the brain. We will also relate evidence from brain lesion
and brain activation studies to components of such models. We then present some approaches that
deal with speech perception more generally, and touch on a few current topics of debate.
Supported by the National Institutes of Health under grant NIH DC R01-3378 to the senior author (SLS).
Where on the face do we look during phonemic restoration: An eye-tracking study
Face-to-face communication typically involves audio and visual components of the speech signal. To examine the effect of task demands on gaze patterns in response to a speaking face, adults participated in two eye-tracking experiments with an audiovisual condition (articulatory information from the mouth was visible) and a pixelated condition (articulatory information was not visible). Further, task demands were manipulated by having listeners respond in a passive (no response) or an active (button press response) context. The active experiment required participants to discriminate between speech stimuli and was designed to mimic environmental situations which require one to use visual information to disambiguate the speaker's message, simulating different listening conditions in real-world settings. Stimuli included a clear exemplar of the syllable /ba/ and a second exemplar in which the formants of the initial consonant were reduced, creating an /a/-like consonant. Consistent with our hypothesis, results revealed that the greatest fixations to the mouth occurred in the audiovisual active experiment, and visual articulatory information led to a phonemic restoration effect for the /a/ speech token. In the pixelated condition, participants fixated on the eyes, and discrimination of the deviant token within the active experiment was significantly greater than in the audiovisual condition. These results suggest that when required to disambiguate changes in speech, adults may look to the mouth for additional cues to support processing when it is available.
- …