345 research outputs found

    Lexical and Prosodic Pitch Modifications in Cantonese Infant-directed Speech

    Published online 03 February 2021. The functions of acoustic-phonetic modifications in infant-directed speech (IDS) remain a question: do they specifically serve to facilitate language learning via enhanced phonemic contrasts (the hyperarticulation hypothesis) or primarily to improve communication via prosodic exaggeration (the prosodic hypothesis)? The study of lexical tones provides a unique opportunity to shed light on this, as lexical tones are phonemically contrastive, yet their primary cue, pitch, is also a prosodic cue. This study investigated Cantonese IDS and found increased intra-talker variation of lexical tones, which more likely posed a challenge to phonetic learning than facilitated it. Although tonal space was expanded, which could facilitate phonetic learning, its expansion was a function of overall intonational modifications. Similar findings were observed in speech to pets, who should not benefit from larger phonemic distinctions. We conclude that lexical-tone adjustments in IDS mainly serve to broadly enhance communication rather than specifically increase phonemic contrast for learners. This work was supported by the University Grants Committee (HKSAR) (RGC34000118), the Innovation and Technology Fund (HKSAR) (ITS/067/18), Dr. Stanley Ho Medical Development Foundation, and the Global Parent Child Resource Centre Limited. The second author’s work is supported by the Basque Government through the BERC 2018-2021 program and by the Spanish Ministry of Science and Innovation through the Ramon y Cajal Research Fellowship, PID2019-105528GA-I00.

    Children's Sensitivity to Pitch Variation in Language

    Children acquire consonant and vowel categories by 12 months, but take much longer to learn to interpret perceptible variation. This dissertation considers children’s interpretation of pitch variation. Pitch operates, often simultaneously, at different levels of linguistic structure. English-learning children must disregard pitch at the lexical level—since English is not a tone language—while still attending to pitch for its other functions. Chapters 1 and 5 outline the learning problem and suggest ways children might solve it. Chapter 2 demonstrates that 2.5-year-olds know pitch cannot differentiate words in English. Chapter 3 finds that not until age 4–5 do children correctly interpret pitch cues to emotions. Chapter 4 demonstrates some sensitivity between 2.5 and 5 years to the pitch cue to lexical stress, but continuing difficulties at the older ages. These findings suggest a late trajectory for interpretation of prosodic variation; throughout, I propose explanations for this protracted time-course

    From communicative functions to prosodic forms

    This is a proposal in favour of proceeding from communicative function to linguistic form, rather than the reverse, for an insightful account of how humans communicate by speech in languages. A functional framework is developed that encompasses argumentation structures, declarative and interrogative functions, and expressive intensification. Such a function orientation can become a powerful tool in comparative prosodic research across the world's languages. The potential of this approach is shown by comparing the prosodic form of Mandarin Chinese data collected in functionally contextualized scenarios with corresponding data from English and German

    How tone, intonation and emotion shape the development of infants' fundamental frequency perception

    Fundamental frequency (ƒ0), perceived as pitch, is the first and arguably most salient auditory component humans are exposed to from the beginning of life. It carries multiple linguistic (e.g., word meaning) and paralinguistic (e.g., speakers’ emotion) functions in speech and communication. The mappings between these functions and ƒ0 features vary within a language and differ cross-linguistically. For instance, a rising pitch can be perceived as a question in English but as a lexical tone in Mandarin. Such variations mean that infants must learn the specific mappings based on their respective linguistic and social environments. To date, canonical theoretical frameworks and most empirical studies do not consider the multi-functionality of ƒ0, but typically focus on individual functions. More importantly, despite the eventual mastery of ƒ0 in communication, it is unclear how infants learn to decompose and recognize these overlapping functions carried by ƒ0. In this paper, we review the symbioses and synergies of the lexical, intonational, and emotional functions that can be carried by ƒ0 and are being acquired throughout infancy. On the basis of our review, we put forward the Learnability Hypothesis that infants decompose and acquire multiple ƒ0 functions through native/environmental experiences. Under this hypothesis, we propose representative cases such as the synergy scenario, where infants use visual cues to disambiguate and decompose the different ƒ0 functions. Further, viable ways to test the scenarios derived from this hypothesis are suggested across auditory and visual modalities. Discovering how infants learn to master the diverse functions carried by ƒ0 can increase our understanding of linguistic systems, auditory processing and communication functions

    Investigating the tonal system of Plastic Mandarin: a cross-varietal comparison

    The city of Changsha, Hunan Province, China has seen an increase in the use of Mandarin in the past decade, overshadowing the local non-Mandarin variety, Changsha. A new variety, “Plastic Mandarin”, mostly spoken by millennials and younger generations, has emerged. It is defined in this thesis as a non-standard Mandarin accent that features the speech of young urban residents in Changsha and that has crystallised over the past few decades. This thesis presents a detailed phonetic investigation of the tonal system of Plastic Mandarin through a cross-varietal comparative approach, mainly divided into two streams: citation tones and neutral tones in contexts. The defining characteristic of the citation tone system for Plastic Mandarin is established first: a mid-level tone, a low to mid rising tone, a low falling tone, and a high rising tone. By comparing the citation tones of the three varieties that coexist in the city of Changsha, the thesis provides acoustic evidence that Plastic Mandarin may arise when Mandarin tones adapt the pitch pattern of some corresponding Changsha tones. In addition to citation tones, this thesis disentangles the sources of variability in the syllable duration and f0 contour of speech sequences containing neutral tone syllables, i.e. those that do not have any of the four canonical lexical tones and are often overlooked in prior studies of tones. The data show that f0 contours converge at the end of two consecutive neutral tone syllables at a low pitch in both Mandarin varieties. This suggests that a neutral tone or a sequence of consecutive neutral tones tends to be associated with a low pitch target, despite the varying f0 shapes largely predicted by the preceding lexical tone. The thesis proposes a probabilistic target-approaching model for Mandarin tones in connected speech, in which pitch targets may be fewer than the number of syllables. While the phonetic realisation of the four lexical tones in Plastic Mandarin is consistently different from that in Standard Mandarin, the pitch target of neutral tone syllables tends to remain constant in this process of Mandarin variation and change, which may be attributed to the stable transfer of prosodic structure

    Universal and language-specific processing : the case of prosody

    A key question in the science of language is how speech processing can be influenced by both language-universal and language-specific mechanisms (Cutler, Klein, & Levinson, 2005). My graduate research aimed to address this question by adopting a cross-language approach to compare languages with different phonological systems. Of all components of linguistic structure, prosody is often considered to be one of the most language-specific dimensions of speech. This can have significant implications for our understanding of language use, because much of speech processing is specifically tailored to the structure and requirements of the native language. However, it is still unclear whether prosody may also play a universal role across languages, and very few comparative attempts have been made to explore this possibility. In this thesis, I examined both the production and perception of prosodic cues to prominence and phrasing in native speakers of English and Mandarin Chinese. In focus production, our research revealed that English and Mandarin speakers were alike in how they used prosody to encode prominence, but there were also systematic language-specific differences in the exact degree to which they enhanced the different prosodic cues (Chapter 2). This, however, was not the case in focus perception, where English and Mandarin listeners were alike in the degree to which they used prosody to predict upcoming prominence, even though the precise cues in the preceding prosody could differ (Chapter 3). Further experiments examining prosodic focus prediction in the speech of different talkers have demonstrated functional cue equivalence in prosodic focus detection (Chapter 4). Likewise, our experiments have also revealed both cross-language similarities and differences in the production and perception of juncture cues (Chapter 5). Overall, prosodic processing is the result of a complex but subtle interplay of universal and language-specific structure

    Cracking the social code of speech prosody using reverse correlation

    Human listeners excel at forming high-level social representations about each other, even from the briefest of utterances. In particular, pitch is widely recognized as the auditory dimension that conveys most of the information about a speaker's traits, emotional states, and attitudes. While past research has primarily looked at the influence of mean pitch, almost nothing is known about how intonation patterns, i.e., finely tuned pitch trajectories around the mean, may determine social judgments in speech. Here, we introduce an experimental paradigm that combines state-of-the-art voice transformation algorithms with psychophysical reverse correlation and show that two of the most important dimensions of social judgments, a speaker's perceived dominance and trustworthiness, are driven by robust and distinguishing pitch trajectories in short utterances like the word "Hello," which remained remarkably stable whether male or female listeners judged male or female speakers. These findings reveal a unique communicative adaptation that enables listeners to infer social traits regardless of speakers' physical characteristics, such as sex and mean pitch. By characterizing how any given individual's mental representations may differ from this generic code, the method introduced here opens avenues to explore dysprosody and social-cognitive deficits in disorders like autism spectrum and schizophrenia. In addition, once derived experimentally, these prototypes can be applied to novel utterances, thus providing a principled way to modulate personality impressions in arbitrary speech signals