Short- and medium-term plasticity for speaker adaptation seem to be independent
The author wishes to thank James McQueen and Elizabeth Johnson for comments on an earlier draft of this paper.

In a classic paper, Ladefoged and Broadbent [1] showed that listeners adapt to speakers based on short-term exposure to a single phrase. Recently, Norris, McQueen, and Cutler [2] presented evidence for a lexically conditioned medium-term adaptation to a particular speaker, based on exposure to 40 critical words among 200 items. In two experiments, I investigated whether there is a connection between the two findings. To this end, a vowel-normalization paradigm (similar to [1]) was used with a carrier phrase that consisted of either words or nonwords. The range of the second formant was manipulated, and this affected the perception of a target vowel in a compensatory fashion: a low F2 range made it more likely that a target vowel was perceived as a front vowel, that is, as a vowel with an inherently high F2. Manipulation of the lexical status of the carrier phrase, however, did not affect vowel normalization. In contrast, the range of vowels in the carrier phrase did influence vowel normalization: if the carrier phrase consisted of high-front vowels only, vowel categories shifted only for high-front vowels. This may indicate that the short-term and medium-term adaptations are brought about by different mechanisms.
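For readers unfamiliar with this kind of stimulus manipulation, the sketch below illustrates in Python how a vowel-like sound with an explicitly controlled F2 can be generated with a simple source-filter synthesiser (an impulse-train source passed through second-order formant resonators). It is a generic illustration with assumed parameter values (f0, formant bandwidths, sample rate), not the stimulus-construction procedure used in the experiments.

# Minimal Python sketch: a source-filter vowel synthesiser in which F2 can be set
# explicitly. All parameter values (f0, bandwidths, sample rate) are illustrative
# assumptions, not the stimulus settings used in the experiments.
import numpy as np
from scipy.signal import lfilter

def resonator(freq_hz, bw_hz, fs):
    """Second-order digital resonator with its DC gain normalised to 1."""
    r = np.exp(-np.pi * bw_hz / fs)
    theta = 2 * np.pi * freq_hz / fs
    a = [1.0, -2.0 * r * np.cos(theta), r ** 2]
    b = [sum(a)]                               # 1 - 2r*cos(theta) + r^2
    return b, a

def synth_vowel(f1, f2, f0=120, dur=0.3, fs=16000):
    """Impulse-train source filtered through F1 and F2 resonators."""
    n = int(dur * fs)
    source = np.zeros(n)
    source[::int(fs / f0)] = 1.0               # simple glottal pulse train
    out = source
    for freq, bw in [(f1, 90.0), (f2, 110.0)]: # assumed formant bandwidths
        b, a = resonator(freq, bw, fs)
        out = lfilter(b, a, out)
    return out / np.max(np.abs(out))

# The same nominal vowel rendered with a low vs. a high second formant:
low_f2_vowel  = synth_vowel(f1=500, f2=1100)   # back-vowel-like quality
high_f2_vowel = synth_vowel(f1=500, f2=2200)   # front-vowel-like quality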
Lexically-driven perceptual adjustments of vowel categories
We investigated the plasticity of vowel categories in a perceptual learning paradigm in which listeners are encouraged to use lexical knowledge to adjust their interpretation of ambiguous speech sounds. We tested whether this kind of learning occurs for vowels, and whether it generalises to the perception of other vowels. In Experiments 1 and 2, Dutch listeners were exposed during a lexical decision task to ambiguous vowels, midway between [i] and [e], in lexical contexts biasing interpretation of those vowels either towards /i/ or towards /e/. The effect of this exposure was tested in a subsequent phonetic-categorisation task. Lexically-driven perceptual adjustments were observed: Listeners exposed to the ambiguous vowels in /i/-biased contexts identified more sounds on an [i]-[e] test continuum as /i/ than those who heard the ambiguous vowels in /e/-biased contexts. Generalisation to other contrasts was weak and occurred more strongly for a distant vowel contrast (/ɑ/ vs. /ɔ/) than for a near contrast (/ɪ/ vs. /ɛ/). In Experiment 3, spectral filters based on the difference between the exposure [i] and [e] sounds were applied to test stimuli from all three of the contrasts. Identification data for these filtered stimuli suggest that generalisation of learning across vowels does not depend on overall spectral similarity between exposure and test vowel contrasts.
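As a rough illustration of what a spectral filter "based on the difference between the exposure [i] and [e] sounds" might involve, the Python sketch below estimates the long-term spectra of an [i] token and an [e] token, converts their dB difference into a gain curve, and applies it to a test stimulus with an FIR filter. The file names, Welch settings, and the ±12 dB gain limit are placeholders and assumptions; the authors' actual filtering procedure may differ.

# Hedged Python sketch of one way to build and apply a spectral-difference filter.
import numpy as np
from scipy.io import wavfile
from scipy.signal import welch, firwin2, lfilter

fs, vowel_i = wavfile.read("exposure_i.wav")   # hypothetical mono recordings,
_,  vowel_e = wavfile.read("exposure_e.wav")   # assumed to share one sample rate
_,  test    = wavfile.read("test_stimulus.wav")

# Long-term average spectra of the two exposure vowels on a common frequency grid.
f, p_i = welch(vowel_i.astype(float), fs=fs, nperseg=1024)
_, p_e = welch(vowel_e.astype(float), fs=fs, nperseg=1024)

# [i]-minus-[e] difference in dB, limited and converted to linear filter gains.
diff_db = 10 * np.log10(p_i + 1e-12) - 10 * np.log10(p_e + 1e-12)
gain = 10 ** (np.clip(diff_db, -12, 12) / 20)

# FIR filter approximating that spectral-envelope difference, applied to a test item.
h = firwin2(513, f, gain, fs=fs)
filtered_test = lfilter(h, [1.0], test.astype(float))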
Conflict monitoring in speech processing: an fMRI study of error detection in speech production and perception
To minimize the number of errors in speech, and thereby facilitate communication, speech is monitored before articulation. It is, however, unclear at which level during speech production monitoring takes place, and what mechanisms are used to detect and correct errors. The present study investigated whether internal verbal monitoring takes place through the speech perception system, as proposed by perception-based theories of speech monitoring, or whether mechanisms independent of perception are applied, as proposed by production-based theories of speech monitoring. Using fMRI during a tongue-twister task, we observed that error detection in internal speech during noise-masked overt speech production and error detection in speech perception both recruit the same neural network, which includes the pre-supplementary motor area (pre-SMA), dorsal anterior cingulate cortex (dACC), anterior insula (AI), and inferior frontal gyrus (IFG). Although production and perception recruit similar areas, as proposed by perception-based accounts, we did not find activation in superior temporal areas (which are typically associated with speech perception) during internal speech monitoring in speech production, as hypothesized by these accounts. Instead, the results are highly compatible with a domain-general approach to speech monitoring, by which internal speech monitoring takes place through detection of conflict between response options, which is subsequently resolved by a domain-general executive center (e.g., the ACC).
Speech monitoring and phonologically-mediated eye gaze in language perception and production: a comparison using printed word eye-tracking
The Perceptual Loop Theory of speech monitoring assumes that speakers routinely inspect their inner speech. In contrast, Huettig and Hartsuiker (2010) observed that listening to one's own speech during language production drives eye movements to phonologically related printed words with a similar time course as listening to someone else's speech does in speech perception experiments. This suggests that speakers use their speech perception system to listen to their own overt speech, but not to their inner speech. However, a direct comparison between production and perception with the same stimuli and participants has so far been lacking. The current printed-word eye-tracking experiment therefore used a within-subjects design combining production and perception. Displays showed four words, of which one, the target, either had to be named or was presented auditorily. Accompanying words were phonologically related, semantically related, or unrelated to the target. There were small increases in looks to phonological competitors, with a similar time course in both production and perception. Phonological effects in perception, however, lasted longer and were much larger in magnitude. We conjecture that this difference is related to a difference in the predictability of one's own and someone else's speech, which in turn has consequences for lexical competition in other-perception and possibly for suppression of activation in self-perception.
Enhanced amplitude modulations contribute to the Lombard intelligibility benefit: Evidence from the Nijmegen Corpus of Lombard Speech
Speakers adjust their voice when talking in noise, which is known as Lombard speech. These acoustic adjustments facilitate speech comprehension in noise relative to plain speech (i.e., speech produced in quiet). However, exactly which characteristics of Lombard speech drive this intelligibility benefit in noise remains unclear. This study assessed the contribution of enhanced amplitude modulations to the Lombard speech intelligibility benefit by demonstrating that (1) native speakers of Dutch in the Nijmegen Corpus of Lombard Speech (NiCLS) produce more pronounced amplitude modulations in noise than in quiet; (2) more pronounced amplitude modulations correlate positively with intelligibility in a speech-in-noise perception experiment; and (3) transplanting the amplitude modulations from Lombard speech onto plain speech leads to an intelligibility improvement, suggesting that enhanced amplitude modulations in Lombard speech contribute to its intelligibility in noise. Results are discussed in light of recent neurobiological models of speech perception, with reference to neural oscillators phase-locking to the amplitude modulations in speech, thereby guiding the processing of speech.
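The amplitude-modulation transplantation mentioned in point (3) can be approximated as follows: extract the slow broadband amplitude envelope from a Lombard utterance and from a plain utterance, flatten the plain envelope, and impose the Lombard envelope. The Python sketch below shows this generic procedure; the file names, the 30 Hz envelope cutoff, and the assumption that the two recordings are already time-aligned and mono are mine, not taken from the corpus study.

# Illustrative Python sketch of envelope transplantation: flatten the plain-speech
# envelope and impose the Lombard envelope.
import numpy as np
from scipy.io import wavfile
from scipy.signal import hilbert, butter, filtfilt

def amplitude_envelope(x, fs, cutoff_hz=30):
    """Hilbert envelope, low-pass filtered to keep only slow amplitude modulations."""
    env = np.abs(hilbert(x))
    b, a = butter(4, cutoff_hz / (fs / 2), btype="low")
    return filtfilt(b, a, env)

fs, plain   = wavfile.read("plain.wav")        # hypothetical time-aligned recordings
_,  lombard = wavfile.read("lombard.wav")
plain, lombard = plain.astype(float), lombard.astype(float)
n = min(len(plain), len(lombard))
plain, lombard = plain[:n], lombard[:n]

env_plain   = amplitude_envelope(plain, fs)
env_lombard = amplitude_envelope(lombard, fs)

# Divide out the plain envelope, multiply in the Lombard envelope, then normalise.
hybrid = plain / (env_plain + 1e-9) * env_lombard
hybrid = hybrid / np.max(np.abs(hybrid))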
Engaging the articulators enhances perception of concordant visible speech movements
PURPOSE
This study aimed to test whether (and how) somatosensory feedback signals from the vocal tract affect concurrent unimodal visual speech perception.
METHOD
Participants discriminated pairs of silent visual utterances of vowels under 3 experimental conditions: (a) normal (baseline) and while holding either (b) a bite block or (c) a lip tube in their mouths. To test the specificity of somatosensory-visual interactions during perception, we assessed discrimination of vowel contrasts optically distinguished based on their mandibular (English /ɛ/-/æ/) or labial (English /u/-French /u/) postures. In addition, we assessed perception of each contrast using dynamically articulating videos and static (single-frame) images of each gesture (at vowel midpoint).
RESULTS
Engaging the jaw selectively facilitated perception of the dynamic gestures optically distinct in terms of jaw height, whereas engaging the lips selectively facilitated perception of the dynamic gestures optically distinct in terms of their degree of lip compression and protrusion. Thus, participants perceived visible speech movements in relation to the configuration and shape of their own vocal tract (and possibly their ability to produce covert vowel production-like movements). In contrast, engaging the articulators had no effect when the speaking faces did not move, suggesting that the somatosensory inputs affected perception of time-varying kinematic information rather than changes in target (movement end point) mouth shapes.
CONCLUSIONS
These findings suggest that orofacial somatosensory inputs associated with speech production prime premotor and somatosensory brain regions involved in the sensorimotor control of speech, thereby facilitating perception of concordant visible speech movements.
SUPPLEMENTAL MATERIAL
https://doi.org/10.23641/asha.9911846
Development of audiovisual comprehension skills in prelingually deaf children with cochlear implants
Objective: The present study investigated the development of audiovisual comprehension skills in prelingually deaf children who received cochlear implants.
Design: We analyzed results obtained with the Common Phrases (Robbins et al., 1995) test of sentence comprehension from 80 prelingually deaf children with cochlear implants who were enrolled in a longitudinal study, from pre-implantation to 5 years after implantation.
Results: The results revealed that prelingually deaf children with cochlear implants performed better under audiovisual (AV) presentation compared with auditory-alone (A-alone) or visual-alone (V-alone) conditions. AV sentence comprehension skills were found to be strongly correlated with several clinical outcome measures of speech perception, speech intelligibility, and language. Finally, pre-implantation V-alone performance on the Common Phrases test was strongly correlated with 3-year postimplantation performance on clinical outcome measures of speech perception, speech intelligibility, and language skills.
Conclusions: The results suggest that lipreading skills and AV speech perception reflect a common source of variance associated with the development of phonological processing skills that is shared among a wide range of speech and language outcome measures.
Speech perception abilities of adults with dyslexia: is there any evidence for a true deficit?
PURPOSE: This study investigated whether adults with dyslexia show evidence of a consistent speech perception deficit by testing phoneme categorization and word perception in noise. METHOD: Seventeen adults with dyslexia and 20 average readers underwent a test battery including standardized reading, language, and phonological awareness tests, and tests of speech perception. Categorization of a pea/bee voicing contrast was evaluated using adaptive identification and discrimination tasks, presented in quiet and in noise, and a fixed-step discrimination task. Two further tests of word perception in noise were presented. RESULTS: There were no significant group differences for categorization in quiet or in noise, for across- and within-category discrimination as measured adaptively, or for word perception, but average readers showed better across- and within-category discrimination in the fixed-step discrimination task. Individuals did not show consistently poor performance across related tasks. CONCLUSIONS: The small number of group differences, and the lack of consistently poor individual performance, provides only weak support for a speech perception deficit in dyslexia. It seems likely that at least some poor performances are attributable to nonsensory factors such as attention. It may also be that some individuals with dyslexia have speech perceptual acuity at the lower end of the normal range, exacerbated by nonsensory factors.
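For readers unfamiliar with adaptive procedures of the kind used here, the Python sketch below shows a generic 2-down/1-up staircase that converges on roughly 70.7% correct performance. It is a textbook procedure written for illustration; the study's actual tracking rules, step sizes, and stopping criteria may differ, and the function names and defaults are assumptions.

# A generic 2-down/1-up adaptive staircase, written to illustrate what an
# "adaptive" identification or discrimination task does (not the study's own code).
def adaptive_track(run_trial, start_level=10.0, step=1.0, n_reversals=8):
    """run_trial(level) -> True if the response was correct.
    Tracks the level yielding roughly 70.7% correct responses."""
    level, n_correct, direction, reversals = start_level, 0, None, []
    for _ in range(400):                       # safety cap on the number of trials
        if len(reversals) >= n_reversals:
            break
        if run_trial(level):
            n_correct += 1
            if n_correct == 2:                 # two correct in a row -> make it harder
                n_correct = 0
                if direction == "up":
                    reversals.append(level)    # direction change counts as a reversal
                direction = "down"
                level = max(level - step, 0.0)
        else:                                  # any error -> make it easier
            n_correct = 0
            if direction == "down":
                reversals.append(level)
            direction = "up"
            level += step
    # Threshold estimate: mean of the reversal levels collected so far.
    return sum(reversals) / max(len(reversals), 1)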
Does training with amplitude modulated tones affect tone-vocoded speech perception?
Temporal-envelope cues are essential for successful speech perception. We asked here whether training on stimuli containing temporal-envelope cues but no speech content can improve the perception of spectrally degraded (vocoded) speech, in which the temporal envelope (but not the temporal fine structure) is mainly preserved. Two groups of listeners were trained on different amplitude-modulation (AM) tasks, either AM detection or AM-rate discrimination (21 blocks of 60 trials over two days, 1260 trials; AM frequencies: 4 Hz, 8 Hz, and 16 Hz), while an additional control group did not undertake any training. Consonant identification in vocoded vowel-consonant-vowel stimuli was tested before and after training on the AM tasks (or at an equivalent time interval for the control group). Following training, only the trained groups showed a significant improvement in the perception of vocoded speech, but the improvement did not differ significantly from that observed for the controls. Thus, we do not find convincing evidence that this amount of training with temporal-envelope cues without speech content provides a significant benefit for vocoded speech intelligibility. Alternative training regimens using vocoded speech along the linguistic hierarchy should be explored.
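As background on what "vocoded" speech means in this context, the Python sketch below implements a generic tone vocoder: the signal is split into a few frequency bands, the temporal envelope of each band is extracted and used to modulate a tone at the band's centre frequency, and the modulated tones are summed. The channel count, band edges, and envelope cutoff are illustrative assumptions and presuppose a sample rate comfortably above 14 kHz; they are not the study's parameters.

# Python sketch of a generic tone vocoder (illustrative, not the study's stimuli).
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def tone_vocode(x, fs, n_channels=8, lo=100.0, hi=7000.0, env_cutoff=30.0):
    x = np.asarray(x, dtype=float)
    edges = np.geomspace(lo, hi, n_channels + 1)            # log-spaced band edges
    t = np.arange(len(x)) / fs
    out = np.zeros_like(x)
    b_env, a_env = butter(4, env_cutoff / (fs / 2), btype="low")
    for f_lo, f_hi in zip(edges[:-1], edges[1:]):
        b, a = butter(4, [f_lo / (fs / 2), f_hi / (fs / 2)], btype="band")
        band = filtfilt(b, a, x)                             # analysis band
        env = filtfilt(b_env, a_env, np.abs(hilbert(band)))  # temporal envelope
        carrier = np.sin(2 * np.pi * np.sqrt(f_lo * f_hi) * t)  # tone at band centre
        out += np.clip(env, 0.0, None) * carrier             # envelope-modulated tone
    return out / np.max(np.abs(out))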