591 research outputs found

    A criterial interlocutor tally for successful talker adaptation?

    Part of the remarkable efficiency of listening is accommodation to unfamiliar talkers’ specific pronunciations by retuning of phonemic intercategory boundaries. Such retuning occurs in second (L2) as well as first language (L1); however, recent research with emigrés revealed successful adaptation in the environmental L2 but, unprecedentedly, not in L1 despite continuing L1 use. A possible explanation involving relative exposure to novel talkers is here tested in heritage language users with Mandarin as family L1 and English as environmental language. In English, exposure to an ambiguous sound in disambiguating word contexts prompted the expected adjustment of phonemic boundaries in subsequent categorisation. However, no adjustment occurred in Mandarin, again despite regular use. Participants reported highly asymmetric interlocutor counts in the two languages. We conclude that successful retuning ability requires regular exposure to novel talkers in the language in question, a criterion not met for the emigrés’ or for these heritage users’ L1

    No L1 privilege in talker adaptation

    As a rule, listening is easier in first (L1) than second languages (L2); difficult L2 listening can challenge even highly proficient users. We here examine one particular listening function, adaptation to novel talkers, in such a high-proficiency population: Dutch emigrants to Australia, predominantly using English outside the family, but all also retaining L1 proficiency. Using lexically-guided perceptual learning (Norris, McQueen & Cutler, 2003), we investigated these listeners’ adaptation to an ambiguous speech sound, in parallel experiments in both their L1 and their L2. A control study established that perceptual learning outcomes were unaffected by the procedural measures required for this double comparison. The emigrants showed equivalent proficiency in tests in both languages, robust perceptual adaptation in their L2, English, but no adaptation in L1. We propose that adaptation to novel talkers is a language-specific skill requiring regular novel practice; a limited set of known (family) interlocutors cannot meet this requirement

    Auditory perceptual learning in autistic adults

    The automatic retuning of phoneme categories to better adapt to the speech of a novel talker has been extensively documented across various (neurotypical) populations, including both adults and children. However, no studies have examined auditory perceptual learning effects in populations atypical in perceptual, social, and language processing for communication, such as populations with autism. Employing a classic lexically-guided perceptual learning paradigm, the present study investigated perceptual learning effects in Australian English autistic and non-autistic adults. The findings revealed that automatic attunement to existing phoneme categories was not activated in the autistic group in the same manner as for non-autistic control subjects. Specifically, autistic adults were able to both successfully discern lexical items and to categorize speech sounds; however, they did not show effects of perceptual retuning to talkers. These findings may have implications for the application of current sensory theories (e.g., Bayesian decision theory) to speech and language processing by autistic individuals. Lay Summary: Lexically guided perceptual learning assists in the disambiguation of speech from a novel talker. The present study established that while Australian English autistic adult listeners were able to successfully discern lexical items and categorize speech sounds in their native language, perceptual flexibility in updating speaker-specific phonemic knowledge when exposed to a novel talker was not available. Implications for speech and language processing by autistic individuals as well as current sensory theories are discussed

    The Relationship Between Phonemic Category Boundary Changes and Perceptual Adjustments to Natural Accents

    Published Online First October 21, 2019. People often experience difficulties when they first hear a novel accent. Prior research has shown that relatively fast natural accent accommodation can occur. However, there has been little investigation of the underlying perceptual mechanism that drives the learning. The current study examines whether phonemic boundary changes play a central role in natural accent accommodation. Two well-established boundary shifting phenomena were used here—recalibration and selective adaptation—to index the flexibility of phonemic category boundaries. Natural accent accommodation was measured with a task in which listeners heard accented words and nonwords before and after listening to English sentences produced by one of two native Mandarin Chinese speakers with moderate accents. In two experiments, participants completed recalibration, selective adaptation, and natural accent accommodation tasks focusing on a consonant contrast that is difficult for native Chinese speakers to produce. We found that: (a) On the accent accommodation task, participants showed an increased endorsement of accented/mispronounced words after exposure to a speaker’s accented speech, indicating a potential relaxation of criteria in the word recognition process; (b) There was no strong link between recalibrating phonemic boundaries and natural accent accommodation; (c) There was no significant correlation between recalibration and selective adaptation. These results suggest that recalibration of phonemic boundaries does not play a central role in natural accent accommodation. Instead, there is some evidence suggesting that natural accent accommodation involves a relaxation of phonemic categorization criteria. Support was provided by Ministerio de Ciencia E Innovacion, Grant PSI2017-82563-P, Centro de Excelencia Severo Ochoa, Grant SEV-2015-0490, by the Basque Government through the BERC 2018–2021 program, and by the National Science Foundation under Grant IBSS-1519908

    Communicative focus on form and second language suprasegmental learning: teaching Cantonese learners to perceive Mandarin tones

    The current study examined how form-focused instruction (FFI) with and without corrective feedback (CF) as output enhancement can facilitate L2 perception of Mandarin tones at both the phonetic and phonological levels in 41 Cantonese learners of Mandarin. Two experimental groups, FFI-only and FFI-CF, received a 90-minute FFI treatment designed to encourage them to notice and practice the categorical distinctions of Mandarin tones through a range of communicative input and output activities. During these activities, the instructors provided CF only to students in the FFI-CF group by recasting and pushing them to repair their mispronunciations of the target features (i.e., output enhancement). The control group received comparable meaning-oriented instruction without any FFI. The effectiveness of FFI was assessed via a forced-choice identification task with both trained and untrained items for a variety of tonal contrasts in Mandarin (high level Tone 1 vs. mid-rising Tone 2 vs. high falling Tone 4). According to statistical comparisons, the FFI-only group attained significant improvement in all lexical and tonal contexts, and such effectiveness was evident particularly in the acquisition of Tone 1 and Tone 4—supposedly the most difficult instances due to their identical phonological status in the learners’ L1, Cantonese. The FFI-CF group, however, demonstrated marginally significant gains only under the trained lexical conditions. The results in turn suggest that FFI promotes learners’ attentional shift from vocabulary to sound learning (generalizable gains in trained and untrained items) and facilitates their access to new phonetic and phonological categories. Yet, the relative advantage of adding CF to FFI as output enhancement remains unclear, especially with respect to the less experienced L2 learners in the current study

    Modeling DNN as human learner

    In previous experiments, human listeners demonstrated that they had the ability to adapt to unheard, ambiguous phonemes after some initial, relatively short exposures. At the same time, previous work in the speech community has shown that pre-trained deep neural network-based (DNN) ASR systems, like humans, also have the ability to adapt to unseen, ambiguous phonemes after retuning their parameters on a relatively small set. In the first part of this thesis, the time-course of phoneme category adaptation in a DNN is investigated in more detail. By retuning the DNNs with more and more tokens with ambiguous sounds and comparing classification accuracy of the ambiguous phonemes in a held-out test across the time-course, we found out that DNNs, like human listeners, also demonstrated fast adaptation: the accuracy curves were step-like in almost all cases, showing very little adaptation after seeing only one (out of ten) training bins. However, unlike our experimental setup mentioned above, in a typical lexically guided perceptual learning experiment, listeners are trained with individual words instead of individual phones, and thus to truly model such a scenario, we would require a model that could take the context of a whole utterance into account. Traditional speech recognition systems accomplish this through the use of hidden Markov models (HMM) and WFST decoding. In recent years, bidirectional long short-term memory (Bi-LSTM) trained under connectionist temporal classification (CTC) criterion has also attracted much attention. In the second part of this thesis, previous experiments on ambiguous phoneme recognition were carried out again on a new Bi-LSTM model, and phonetic transcriptions of words ending with ambiguous phonemes were used as training targets, instead of individual sounds that consisted of a single phoneme. 
We found that, despite the vastly different architecture, the new model showed highly similar behavior in terms of classification rate over the time course of incremental retuning. This indicated that ambiguous phonemes in a continuous context could also be quickly adapted by neural network-based models. In the last part of this thesis, our pre-trained Dutch Bi-LSTM from the previous part was treated as a Dutch second language learner and was asked to transcribe English utterances in a self-adaptation scheme. In other words, we used the Dutch model to generate phonetic transcriptions directly and retuned the model on the transcriptions it generated, although ground truth transcriptions were used to choose a subset of all self-labeled transcriptions. Self-adaptation is of interest as a model of human second language learning, but also has great practical engineering value, e.g., it could be used to adapt speech recognition to a low-resource language. We investigated two ways to improve the adaptation scheme, with the first being multi-task learning with articulatory feature detection during training the model on Dutch and self-labeled adaptation, and the second being first letting the model adapt to isolated short words before feeding it with longer utterances.

    Phonological abstraction in processing lexical-tone variation: Evidence from a learning paradigm

    There is a growing consensus that the mental lexicon contains both abstract and word-specific acoustic information. To investigate their relative importance for word recognition, we tested to what extent perceptual learning is word specific or generalizable to other words. In an exposure phase, participants were divided into two groups; each group was semantically biased to interpret an ambiguous Mandarin tone contour as either tone1 or tone2. In a subsequent test phase, the perception of ambiguous contours was dependent on the exposure phase: Participants who heard ambiguous contours as tone1 during exposure were more likely to perceive ambiguous contours as tone1 than participants who heard ambiguous contours as tone2 during exposure. This learning effect was only slightly larger for previously encountered than for not previously encountered words. The results speak for an architecture with prelexical analysis of phonological categories to achieve both lexical access and episodic storage of exemplars