46 research outputs found

    Automatic imitation of speech is enhanced for non-native sounds

    Simulation accounts of speech perception posit that speech is covertly imitated to support perception in a top-down manner. Behaviourally, covert imitation is measured through the stimulus-response compatibility (SRC) task. In each trial of a speech SRC task, participants produce a target speech sound whilst perceiving a speech distractor that either matches the target (compatible condition) or does not (incompatible condition). The degree to which the distractor is covertly imitated is captured by the automatic imitation effect, computed as the difference in response times (RTs) between compatible and incompatible trials. Simulation accounts disagree on whether covert imitation is enhanced when speech perception is challenging or instead when the speech signal is most familiar to the speaker. To test these accounts, we conducted three experiments in which participants completed SRC tasks with native and non-native sounds. Experiment 1 uncovered larger automatic imitation effects in an SRC task with non-native sounds than with native sounds. Experiment 2 replicated the finding online, demonstrating its robustness and the applicability of speech SRC tasks online. Experiment 3 intermixed native and non-native sounds within a single SRC task to disentangle effects of perceiving non-native sounds from confounding effects of producing non-native speech actions. This last experiment confirmed that automatic imitation is enhanced for non-native speech distractors, supporting a compensatory function of covert imitation in speech perception. The experiment also uncovered a separate effect of producing non-native speech actions on enhancing automatic imitation effects
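    The automatic imitation effect described in this abstract can be sketched as a difference of mean RTs. The snippet below is a minimal illustration, not the authors' analysis pipeline: the function name, the trial format, and the RT values are hypothetical, and the sign convention (incompatible minus compatible, so that a positive value indicates interference from a mismatching distractor) is an assumption.

```python
from statistics import mean


def automatic_imitation_effect(trials):
    """Return mean incompatible RT minus mean compatible RT, in ms.

    `trials` is an iterable of (condition, rt_ms) pairs; this layout is
    an assumption made for illustration only.
    """
    compatible = [rt for cond, rt in trials if cond == "compatible"]
    incompatible = [rt for cond, rt in trials if cond == "incompatible"]
    if not compatible or not incompatible:
        raise ValueError("need at least one trial per condition")
    return mean(incompatible) - mean(compatible)


# Hypothetical response times (ms), invented for this example.
trials = [
    ("compatible", 500),
    ("compatible", 520),
    ("incompatible", 540),
    ("incompatible", 560),
]
effect = automatic_imitation_effect(trials)
print(effect)
```

    Under this convention, a larger effect for non-native than for native distractors would correspond to the enhancement the abstract reports.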

    Percepción del contraste bilabial-labiodental en las consonantes aproximantes del castellano de Chile

    Until recently, the consensus was that labiodental realizations of Spanish /b/ did not exist, and that consequently this variation in place of articulation could be safely disregarded. However, new evidence has emerged showing that labiodental variants of /b/ do exist in relatively high numbers, at least in some dialects such as Chilean Spanish. This study set out to determine whether Chilean Spanish listeners are able to perceive the differences between bilabial and labiodental approximant variants of Spanish /b/ (i.e., [β̞] versus [ʋ]). To test this, natural and synthetic stimuli were presented to 31 native listeners in identification and discrimination tasks. Results showed that, while the identification task with natural stimuli provided mixed evidence of sensitivity to the contrast, the identification and discrimination tasks with synthetic stimuli provided no evidence of listeners perceiving the phonetic contrast categorically. In sum, listeners do not seem able to perceive the acoustic differences between the two segments, and thus it is unlikely that this phonetic contrast could be employed to encode sociolinguistic information.

    Resilience of English vowel perception across regional accent variation

    In two categorization experiments using phonotactically legal nonce words, we tested Australian English listeners’ perception of all vowels in their own accent as well as in four less familiar regional varieties of English which differ in how their vowel realizations diverge from Australian English: London, Yorkshire, Newcastle (UK), and New Zealand. Results of Experiment 1 indicated that amongst the vowel differences described in sociophonetic studies and attested in our stimulus materials, only a small subset caused greater perceptual difficulty for Australian listeners than for the corresponding Australian English vowels. We discuss this perceptual tolerance for vowel variation in terms of how perceptual assimilation of phonetic details into abstract vowel categories may contribute to recognizing words across variable pronunciations. Experiment 2 determined whether short-term multi-talker exposure would facilitate accent adaptation, particularly for those vowels that proved more difficult to categorize in Experiment 1. For each accent separately, participants listened to a pre-test passage in the nonce word accent but told by novel talkers before completing the same task as in Experiment 1. In contrast to previous studies showing rapid adaptation to talker-specific variation, our listeners’ subsequent vowel assimilations were largely unaffected by exposure to other talkers’ accent-specific variation

    Revealing perceptual structure through input variation: cross-accent categorization of vowels in five accents of English

    This paper characterizes the perceptual structure of vowel systems in five regional accents of English, from Australia (A), New Zealand (Z), London (L), Yorkshire (Y), and Newcastle upon Tyne (N), on the basis of “whole system” vowel categorization experiments. We established patterns of within-accent vowel confusions, and then explored cross-accent perception, assessing how listeners from one accent background categorize vowels from another. Our experimental task required mapping continuous phonetic dimensions to perceptual categories in the absence of phonotactic and lexical cues to vowel identity and socio-indexical information about the talker. Our results show that, without these sources of information, there is uncertainty in vowel categorization, even for native accent vowels, and that this degree of uncertainty increases for unfamiliar accents. The patterns of cross-accent perception largely reflect the accent-specific perceptual structure of the listener, as opposed to adaptations to the stimulus accents. This finding contrasts with the type of active talker adaptation found with tasks offering lexical information about vowel identity and indexical information about the talker

    Phonetic variation and change in the Cockney Diaspora: The role of place, gender, and identity

    Recent research has suggested that two linguistic processes are displacing Cockney: the emergence of Multicultural London English (MLE) in inner London and dialect levelling (e.g. Kerswill & Williams 2005). This study investigates firstly whether Cockney phonetic features have ‘moved East’ to Essex (Fox 2015), and secondly the features’ indexicality in relation to place and identity. Fifty-four participants from Debden, an outpost of the Cockney Diaspora, completed a sociolinguistic interview. Vowel measurements were made from a wordlist and passage, and quantitative attitudinal and qualitative data were extracted from a questionnaire and interviews. Overall, changes in identity as a result of social change exceeded linguistic changes, and linguistic labels were not interpreted uniformly across the community. Whilst Cockney variants were largely maintained in young speakers, they were transposed onto an ‘Essex’ accent. Furthermore, some young women but no young men considered themselves Cockney, likely due to the matrifocal nature of Cockney. (Cockney, phonetic variation and change, dialect levelling, identity, indexicality, gender)

    Vocal Tract Images Reveal Neural Representations of Sensorimotor Transformation During Speech Imitation

    Imitating speech necessitates the transformation from sensory targets to vocal tract motor output, yet little is known about the representational basis of this process in the human brain. Here, we address this question by using real-time MR imaging (rtMRI) of the vocal tract and functional MRI (fMRI) of the brain in a speech imitation paradigm. Participants trained on imitating a native vowel and a similar nonnative vowel that required lip rounding. Later, participants imitated these vowels and an untrained vowel pair during separate fMRI and rtMRI runs. Univariate fMRI analyses revealed that regions including left inferior frontal gyrus were more active during sensorimotor transformation (ST) and production of nonnative vowels, compared with native vowels; further, ST for nonnative vowels activated somatomotor cortex bilaterally, compared with ST of native vowels. Using test representational similarity analysis (RSA) models constructed from participants' vocal tract images and from stimulus formant distances, we found that RSA searchlight analyses of fMRI data showed either type of model could be represented in somatomotor, temporal, cerebellar, and hippocampal neural activation patterns during ST. We thus provide the first evidence of widespread and robust cortical and subcortical neural representation of vocal tract and/or formant parameters, during prearticulatory ST

    Functional brain outcomes of L2 speech learning emerge during sensorimotor transformation

    Sensorimotor transformation (ST) may be a critical process in mapping perceived speech input onto non-native (L2) phonemes, in support of subsequent speech production. Yet, little is known concerning the role of ST with respect to L2 speech, particularly where learned L2 phones (e.g., vowels) must be produced in more complex lexical contexts (e.g., multi-syllabic words). Here, we charted the behavioral and neural outcomes of producing trained L2 vowels at word level, using a speech imitation paradigm and functional MRI. We asked whether participants would be able to faithfully imitate trained L2 vowels when they occurred in non-words of varying complexity (one or three syllables). Moreover, we related individual differences in imitation success during training to BOLD activation during ST (i.e., pre-imitation listening), and during later imitation. We predicted that superior temporal and peri-Sylvian speech regions would show increased activation as a function of item complexity and non-nativeness of vowels, during ST. We further anticipated that pre-scan acoustic learning performance would predict BOLD activation for non-native (vs. native) speech during ST and imitation. We found individual differences in imitation success for training on the non-native vowel tokens in isolation; these were preserved in a subsequent task, during imitation of mono- and trisyllabic words containing those vowels. fMRI data revealed a widespread network involved in ST, modulated by both vowel nativeness and utterance complexity: superior temporal activation increased monotonically with complexity, showing greater activation for non-native than native vowels when presented in isolation and in trisyllables, but not in monosyllables. Individual differences analyses showed that learning versus lack of improvement on the non-native vowel during pre-scan training predicted increased ST activation for non-native compared with native items, at insular cortex, pre-SMA/SMA, and cerebellum. 
    Our results hold implications for the importance of ST as a process underlying successful imitation of non-native speech.

