    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns as a response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output

    Language-independent talker-specificity in first-language and second-language speech production by bilingual talkers: L1 speaking rate predicts L2 speaking rate

    Second-language (L2) speech is consistently slower than first-language (L1) speech, and L1 speaking rate varies within- and across-talkers depending on many individual, situational, linguistic, and sociolinguistic factors. It is asked whether speaking rate is also determined by a language-independent talker-specific trait such that, across a group of bilinguals, L1 speaking rate significantly predicts L2 speaking rate. Two measurements of speaking rate were automatically extracted from recordings of read and spontaneous speech by English monolinguals (n = 27) and bilinguals from ten L1 backgrounds (n = 86): speech rate (syllables/second), and articulation rate (syllables/second excluding silent pauses). Replicating prior work, L2 speaking rates were significantly slower than L1 speaking rates both across-groups (monolinguals' L1 English vs bilinguals' L2 English), and across L1 and L2 within bilinguals. Critically, within the bilingual group, L1 speaking rate significantly predicted L2 speaking rate, suggesting that a significant portion of inter-talker variation in L2 speech is derived from inter-talker variation in L1 speech, and that individual variability in L2 spoken language production may be best understood within the context of individual variability in L1 spoken language production
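
    The two rate measures named in this abstract differ only in whether silent pauses count towards the duration. The sketch below illustrates that difference under simple assumptions: the function names and the example numbers are hypothetical and are not taken from the study, which extracted its measurements automatically from recordings.

```python
def speech_rate(n_syllables: int, total_duration_s: float) -> float:
    """Syllables per second over the whole utterance, silent pauses included."""
    if total_duration_s <= 0:
        raise ValueError("total duration must be positive")
    return n_syllables / total_duration_s


def articulation_rate(n_syllables: int, total_duration_s: float,
                      pause_duration_s: float) -> float:
    """Syllables per second over speaking time only, silent pauses excluded."""
    speaking_time_s = total_duration_s - pause_duration_s
    if speaking_time_s <= 0:
        raise ValueError("pause duration must be shorter than total duration")
    return n_syllables / speaking_time_s


# Hypothetical utterance: 40 syllables in 10 s, of which 2 s are silent pauses.
print(speech_rate(40, 10.0))             # 4.0 syllables/second
print(articulation_rate(40, 10.0, 2.0))  # 5.0 syllables/second
```

    For any utterance that contains pauses, articulation rate is higher than speech rate, so the two measures can separate talkers who pause more from talkers who articulate more slowly.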

    Generating segmental foreign accent

    For most of us, speaking in a non-native language involves deviating to some extent from native pronunciation norms. However, the detailed basis for foreign accent (FA) remains elusive, in part due to methodological challenges in isolating segmental from suprasegmental factors. The current study examines the role of segmental features in conveying FA through the use of a generative approach in which accent is localised to single consonantal segments. Three techniques are evaluated: the first requires a high-proficiency bilingual to produce words with isolated accented segments; the second uses cross-splicing of context-dependent consonants from the non-native language into native words; the third employs hidden Markov model synthesis to blend voice models for both languages. Using English and Spanish as the native/non-native languages respectively, listener cohorts from both languages identified words and rated their degree of FA. All techniques were capable of generating accented words, but to differing degrees. Naturally-produced speech led to the strongest FA ratings and synthetic speech the weakest, which we interpret as the outcome of over-smoothing. Nevertheless, the flexibility offered by synthesising localised accent encourages further development of the method

    Accent intelligibility across native and non-native accent pairings: investigating links with electrophysiological measures of word recognition

    The intelligibility of accented speech in noise depends on the interaction of the accents of the talker and the listener. However, it is not yet clear how this influence arises. Accent familiarity is commonly proposed to be a major contributor to accent intelligibility, but recent evidence suggests that the similarity between talker and listener accents may also be able to account for accent intelligibility across talker-listener pairings. In addition, differences in accent intelligibility are often found only in the presence of other adverse conditions, so it is not clear if the talker-listener pairing also influences speech processing in quiet conditions. This research had two main aims: to further investigate the relationship between accent similarity and intelligibility, and to use online EEG methods to explore whether the talker-listener pairing affects speech perception in quiet conditions. English and Spanish listeners listened to Standard Southern British English (SSBE), Glaswegian English (GE) and Spanish-accented English (SpE) in a speech-in-noise recognition task, and also completed an event-related potential (ERP) task to elicit the PMN and N400 responses. Accent similarity was measured using the ACCDIST metric. Results showed the same (or extremely similar) patterns in accent intelligibility and accent similarity for both listener groups, giving further support to the hypothesis that accent similarity can contribute to the level of intelligibility of an accent within a talker-listener pairing. ERP data also suggest that speech processing in quiet is influenced by the talker-listener pairing. The PMN, which relates to phonological processing, seems particularly dependent on a match between talker and listener accent, but the more semantic N400 showed some flexibility in the ability to process accented speech

    Acoustic-Phonetic Characteristics of Clear Speech in Bilinguals

    This study examined the language-dependency of clear speech modifications by comparing the clear speech strategies of late bilinguals in both their L1 (Finnish) and L2 (English). Results generally supported the hypothesis of language-independent enhancement of global clear speech modifications, but language-dependent segmental enhancement. The global clear speech strategies produced by Finnish-English bilinguals in their L2 (English) were similar in the extent of the modifications to those of native English speakers, indicating a surprising flexibility of the non-native speech production system

    Linguistic processing of accented speech across the lifespan

    In most of the world, people have regular exposure to multiple accents. Therefore, learning to quickly process accented speech is a prerequisite to successful communication. In this paper, we examine work on the perception of accented speech across the lifespan, from early infancy to late adulthood. Unfamiliar accents initially impair linguistic processing by infants, children, younger adults, and older adults, but listeners of all ages come to adapt to accented speech. Emergent research also goes beyond these perceptual abilities, by assessing links with production and the relative contributions of linguistic knowledge and general cognitive skills. We conclude by underlining points of convergence across ages, and the gaps left to face in future work

    Cognitive factors in perception and imitation of Thai tones by Mandarin versus Vietnamese speakers

    The thesis investigates how native language phonological and phonetic factors affect non-native lexical tone perception and imitation, and how cognitive factors, such as memory load and stimulus variability (talker and vowel context variability), bias listeners to a phonological versus phonetic mode of perception/imitation. Two perceptual experiments and one imitation experiment were conducted with Thai tones as the stimuli and with Mandarin and Vietnamese listeners, who had no experience with Thai (i.e., naive listeners/imitators). The results of the perceptual experiments (Chapters 5 and 6) showed phonological effects as reflected in assimilation types (Categorised vs. UnCategorised assimilation) and phonetic effects indicated by percent choice and goodness ratings in tone assimilation, largely in line with predictions based on the Perceptual Assimilation Model (PAM: Best, 1995). In addition, phonological assimilation types and phonological overlap of the contrasts affected their discrimination in line with predictions based on PAM. The thesis research has revealed the influence of cognitive factors on native language influences in perception and imitation of non-native lexical tones, which contribute differently to different tasks. The findings carry implications for current non-native speech perception theories. The fact that non-native tone imitation deviations can be traced back to native phonological and phonetic influences on perception supports and provides new insights about perception-production links in processing non-native tones. The findings uphold the extrapolation of PAM and ASP principles to non-native tone perception and imitation, indicating that both native language phonological and phonetic influences and their modulation by cognitive factors hold implications for non-native speech perception/learning theories, as well as for second language instruction

    The early phase of /ɹ/ production development in adult Japanese learners of English

    Although previous research indicates that Japanese speakers’ second-language (L2) perception and production of English /ɹ/ may improve with increased L2 experience, relatively little is known about the fine phonetic details of their /ɹ/ productions, especially during the early phase of L2 speech learning. This cross-sectional study examined acoustic properties of word-initial /ɹ/ from 60 Japanese learners with a length of residence (LOR) between one month and one year in Canada. Their performance was compared to that of 15 native speakers of English and 15 low-proficiency Japanese learners of English. Formant frequencies (F2 and F3) and F1 transition durations were evaluated under three task conditions—word reading, sentence reading, and timed picture description. Learners with as little as two to three months of residence demonstrated target-like F2 frequencies. In addition, increased LOR was predictive of more target-like transition durations. Although the learners showed some improvement in F3 as a function of LOR, they did so mainly at a controlled level of speech production. The findings suggest that during the early phase of L2 segmental development, production accuracy is task-dependent and is influenced by the availability of L1 phonetic cues for redeployment in L2

    The impact of regional accent variation on monolingual and bilingual infants’ lexical processing

    Phonetic variation is inherent in natural speech. It can be lexically relevant, differentiating words, or it can be lexically irrelevant indexical variation, which gives information about the talker or context, such as gender, mood, and regional or foreign accent. Efficient communication requires perceivers to discern how lexical versus indexical sources of variation affect the phonetic form of spoken words. While ample evidence is available on how children acquiring a single language handle variability in speech, less is known about how children simultaneously acquiring two languages deal with phonetic variation. This thesis investigates how the bilingual language environment affects children’s ability to accommodate accented speech. We consider three hypotheses. One is that bilingual infants may have an advantage relative to monolinguals due to their greater experience with phonetic variability across their two phonological systems. This is because the lexical representations in bilingual children, who have more experience with accent variation than monolingual children, might be more open to phonetic variation than monolinguals. Representations that are more open to variation might lead to higher flexibility in the word recognition of children with multi-accent input (bilinguals), resulting in accommodation benefits when processing an unfamiliar accent. An alternative hypothesis, however, is that bilingual children may have less stable lexical representations than monolinguals because their vocabulary size in each language is smaller. This could lead to processing costs in accent adaptation, resulting in accommodation disadvantages for bilinguals. The third and final hypothesis is that there would be no difference between bilinguals and their monolingual peers. This is because the effects of greater accent experience but less stable lexical representations in bilinguals may essentially neutralise each other, resulting in equivalent accent accommodation by bilinguals and monolinguals. To evaluate these hypotheses, three experiments were conducted with 17- and 25-month-old bilingual and monolingual children. Their ability to accommodate unfamiliar accented speech was analysed based on their language experience, pre-exposure to the unfamiliar accent, the type of phonetic variation (easy versus difficult phonetic change), and the cognitive demands of the experimental procedure. Taken together, the findings of Experiments 1-3 suggest that bilingual language input neither benefits nor hampers accent adaptation in bilingual children relative to monolingual children. The results carry implications for our current understanding of bilingualism and phonological development