Effects of Visual Speech on Early Auditory Evoked Fields - From the Viewpoint of Individual Variance
The effects of visual speech (the moving image of the speaker’s face uttering a speech sound) on early auditory evoked fields (AEFs) were examined using a helmet-shaped magnetoencephalography system in 12 healthy volunteers (9 males, mean age 35.5 years). AEFs (N100m) in response to the monosyllabic sound /be/ were recorded and analyzed under three different visual stimulus conditions: the moving image of the same speaker’s face uttering /be/ (congruent visual stimulus), the same face uttering /ge/ (incongruent visual stimulus), and visual noise (a still image of the speaker’s face processed with a strong Gaussian filter; control condition). On average, the latency of the N100m was significantly shortened in both hemispheres for both the congruent and incongruent auditory/visual (A/V) stimuli compared to the control A/V condition. However, the degree of N100m shortening did not differ significantly between the congruent and incongruent A/V conditions, despite significant differences in psychophysical responses between these two conditions. Moreover, analysis of the magnitudes of these visual effects on AEFs in individual subjects showed that the lip-reading effects on AEFs tended to be well correlated between the two audio-visual conditions (congruent vs. incongruent visual stimuli) in both hemispheres, but were not significantly correlated between the right and left hemispheres. On the other hand, no significant correlation was observed between the magnitudes of the visual speech effects and the psychophysical responses. These results may indicate that the auditory-visual interaction observed on the N100m is a fundamental process that does not depend on the congruency of the visual information.
Schematic drawings of the three A/V stimuli used in the present study.
aBe/vBe (audio /be/ and visual /be/): a monosyllabic sound /be/ spoken by a Japanese male speaker, presented with the moving image of the same speaker’s face uttering /be/ (congruent visual stimulus); aBe/vGe (audio /be/ and visual /ge/): the same /be/ sound with the moving image of the same speaker’s face uttering /ge/ (incongruent visual stimulus); aBe/vN (audio /be/ and visual noise): the same /be/ sound with visual noise created by applying a strong Gaussian filter (Adobe® Photoshop) to a still image of the speaker’s face during the utterance of /be/. The total duration of each video clip was 3 s. The audio stimulus of the /be/ sound (duration about 180 ms) was presented starting 1.4 s after the beginning of the visual stimulus. Audio stimuli were synchronized with the speaker’s mouth movement.
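A comparable blurred control image can be produced programmatically. The following is a minimal sketch in Python using Pillow, assuming a hypothetical still-frame file name and blur radius; the study itself used Adobe® Photoshop, not this script.

from PIL import Image, ImageFilter

# Load a still frame of the speaker's face captured during the /be/ utterance
# ("speaker_be_still.png" is an illustrative file name, not from the study).
frame = Image.open("speaker_be_still.png")

# Apply a strong Gaussian blur so that no articulatory detail remains visible;
# the radius of 40 px is an assumed value.
noise = frame.filter(ImageFilter.GaussianBlur(radius=40))
noise.save("visual_noise.png")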
Example of the effects of contralateral noise on the waveform of the 40-Hz ASSR (Subject 3).
A: left-ear stimulation with the AM signal; B: right-ear stimulation with the AM signal. ASSRs with contralateral noise (red) are superimposed on waveforms without contralateral noise (black), mapped onto a flattened projection of the sensor array. Asterisks in A and B indicate the channels with the maximum signals in each hemisphere. The responses of these channels are magnified and shown in the insets in the right column (C, D).
Effects of contralateral noise on the powers of ASSRs.
The ASSR powers in the channels with the maximum responses were measured over each hemisphere for all measurement conditions (A: 20-Hz ASSR, B: 40-Hz ASSR; see text for further details). LH: left hemisphere, RH: right hemisphere, RE: right ear, LE: left ear.
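As a rough illustration of this power measurement, the sketch below estimates ASSR power at the modulation frequency from an epoch-averaged sensor array and selects the maximum channel. The sampling rate, array shape, and variable names are assumptions, not details taken from the study.

import numpy as np

def assr_power(traces, fs, f_mod):
    """Power at the modulation frequency for each channel.

    traces: (n_channels, n_samples) epoch-averaged ASSR waveforms.
    """
    freqs = np.fft.rfftfreq(traces.shape[1], d=1.0 / fs)
    spectrum = np.fft.rfft(traces, axis=1)
    k = np.argmin(np.abs(freqs - f_mod))      # FFT bin nearest 20 or 40 Hz
    return np.abs(spectrum[:, k]) ** 2

fs = 1000.0                                   # assumed sampling rate (Hz)
left_hemi = np.random.randn(102, 1000)        # placeholder sensor data
power = assr_power(left_hemi, fs, f_mod=40.0)
max_channel = int(np.argmax(power))           # channel with the maximum response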
Relationship between the magnitudes of the visual speech effects and psychophysical responses.
Normalized N100m latencies (the ratio between the N100m latency under the aBe/vBe or aBe/vGe condition and that under the control [aBe/vN] condition) are plotted as a function of confusion responses in phoneme perception (psychophysical responses other than /be/). If the psychophysical responses reflected the visual speech effects on the N100m latencies, positive and negative correlations would be expected for the incongruent (upper panels) and congruent (lower panels) A/V conditions, respectively. However, no such correlations were observed. The thin line in each figure indicates the linear regression line.
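The normalization and regression described here reduce to a few lines; the sketch below uses placeholder per-subject latencies and confusion rates purely for illustration, not study data.

import numpy as np
from scipy import stats

# Per-subject N100m latencies (ms); placeholder values.
lat_vge = np.array([95.0, 102.0, 99.0, 110.0, 105.0, 98.0])   # incongruent condition
lat_vn  = np.array([100.0, 108.0, 101.0, 118.0, 112.0, 103.0])  # control condition
confusion = np.array([60.0, 45.0, 70.0, 30.0, 55.0, 65.0])    # % responses other than /be/

norm_vge = lat_vge / lat_vn              # normalized latency, incongruent condition
fit = stats.linregress(confusion, norm_vge)
print(fit.slope, fit.intercept, fit.rvalue, fit.pvalue)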
Typical examples of waveforms of the auditory evoked fields (AEFs) under the three A/V conditions (Subject 1).
Asterisks in A, C, and D indicate the N100m. A: superimposed waveforms recorded from all sensors located in the right hemisphere (black lines) and the root mean square (RMS) waveform (red line). B: iso-field map. C and D: RMS waveforms calculated from all sensors in the right (C) and left (D) hemispheres for the three A/V conditions. Black, red, and blue waveforms indicate the aBe/vN, aBe/vBe, and aBe/vGe conditions, respectively.
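The RMS waveform used in these panels is simply the root mean square across all hemisphere sensors at each time point; a minimal sketch with a placeholder data array:

import numpy as np

sensors = np.random.randn(102, 600)           # (n_channels, n_samples), placeholder
rms = np.sqrt(np.mean(sensors ** 2, axis=0))  # one RMS value per time sample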
Psychophysical responses to the three A/V stimuli during the MEG measurements.
Subjects were asked to judge “what the A/V stimulus was heard as” by pushing response buttons. The rate (percentage) at which the presented /be/ sound was perceived as different from /be/ is plotted for the three A/V conditions. Open circles indicate individual data; filled circles and bars represent averages and standard errors, respectively. Statistical significance was determined by one-way repeated-measures analysis of variance with Bonferroni post-hoc analysis. Asterisks indicate significant differences (p<0.001). As expected, the confusion response (the rate at which the /be/ sound was perceived as different from /be/) was significantly higher under the incongruent McGurk visual condition (visual /ge/) and significantly lower under the congruent visual condition (visual /be/).
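The statistics named in this caption can be reproduced in outline with statsmodels and SciPy; the sketch below fabricates a long-format data frame for 12 subjects purely for illustration.

import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
conditions = ["aBe/vBe", "aBe/vGe", "aBe/vN"]
df = pd.DataFrame({
    "subject": np.repeat(np.arange(12), 3),
    "condition": np.tile(conditions, 12),
    "confusion": rng.uniform(0.0, 100.0, 36),   # % perceived as other than /be/
})

# One-way repeated-measures ANOVA across the three A/V conditions.
print(AnovaRM(df, depvar="confusion", subject="subject", within=["condition"]).fit())

# Bonferroni-corrected paired t-tests between condition pairs.
pairs = [(a, b) for i, a in enumerate(conditions) for b in conditions[i + 1:]]
for a, b in pairs:
    res = stats.ttest_rel(df.loc[df.condition == a, "confusion"].to_numpy(),
                          df.loc[df.condition == b, "confusion"].to_numpy())
    print(a, b, "p(Bonferroni) =", min(res.pvalue * len(pairs), 1.0))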
Ratios of ASSR power under conditions with/without contralateral noise.
The average ratios of ASSR power are plotted with standard errors (error bars). Suppression of the 40-Hz ASSR was significantly greater than that of the 20-Hz ASSR (p<0.05, three-way ANOVA; see text for further details). LH: left hemisphere, RH: right hemisphere, RE: right ear, LE: left ear.
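One way to sketch this comparison is to treat the power ratio as the dependent variable in a three-way ANOVA over modulation rate, hemisphere, and stimulated ear. The factor layout and synthetic values below are assumptions; the study's exact model (e.g., its handling of repeated measures) may differ.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
rows = [{"rate": rate, "hemi": hemi, "ear": ear,
         "ratio": rng.uniform(0.5, 1.1)}         # power with noise / power without
        for rate in ("20Hz", "40Hz")
        for hemi in ("LH", "RH")
        for ear in ("LE", "RE")
        for _ in range(12)]                      # one value per subject
df = pd.DataFrame(rows)

model = smf.ols("ratio ~ C(rate) * C(hemi) * C(ear)", data=df).fit()
print(anova_lm(model, typ=2))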
Relationship between lip-reading effects on N100m latencies under congruent and incongruent A/V conditions.
The normalized N100m latency for each subject under the congruent (aBe/vBe) A/V condition (the ratio between N100m latencies under the aBe/vBe and aBe/vN conditions) is plotted against that under the incongruent (aBe/vGe) A/V condition (the ratio between N100m latencies under the aBe/vGe and aBe/vN conditions), separately for the left and right hemispheres. The thin line in each figure indicates the linear regression line.
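The relationship plotted here amounts to a per-subject Pearson correlation between the two normalized latency ratios; a minimal sketch with placeholder values, not study data:

import numpy as np
from scipy import stats

# Normalized N100m latencies per subject (placeholder values).
norm_vbe = np.array([0.93, 0.96, 0.91, 0.98, 0.95, 0.94])  # aBe/vBe / aBe/vN
norm_vge = np.array([0.94, 0.97, 0.90, 0.99, 0.96, 0.93])  # aBe/vGe / aBe/vN

r, p = stats.pearsonr(norm_vbe, norm_vge)
print(r, p)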
Typical example of the effects of visual speech (simultaneous [A/V offset = 0] condition) on the waveforms of AEFs observed in the left hemisphere.
a: superimposed waveforms recorded from all sensors located in the left hemisphere for the control condition (/ge/ sound with visual noise; black dotted lines) and for the A/V 0 condition (/ge/ sound presented with visual /be/ with no lag; red line). b: root mean square (RMS) waveforms calculated from all sensors in the left hemisphere for the same two conditions.