269 research outputs found

    Prosodic challenges faced by English speakers reading Mandarin

    This study compares the prosodic characteristics of L2-Mandarin as spoken by L1-English speakers using L1-Mandarin utterances. The acoustic correlates examined include individual tonal realizations, interactions of tones in sequence, durational features and intensity envelopes. L2-Mandarin users realize the contour tones RISE and FALL with both rising and falling pitch, and produce the second tone of disyllabic words with more varied pitch. L2-users employ larger vowel durations, syllable durations and larger variation over vowel intervals in sequential pairs than L1-Mandarin users. Both user groups show similar intensity envelopes. Implications of this study include tailoring language training programs that counterbalance L1 influences.

    The Acoustic Correlates of Stress-Shifting Suffixes in Native and Nonnative English

    Although laboratory phonology techniques have been widely employed to discover the interplay between the acoustic correlates of English Lexical Stress (ELS), namely fundamental frequency, duration, and intensity, studies on ELS in polysyllabic words are rare, and cross-linguistic acoustic studies in this area are even rarer. Consequently, the effects of language experience on L2 lexical stress acquisition are not clear. This investigation of adult Arabic (Saudi Arabian) and Mandarin (Mainland Chinese) speakers analyzes their ELS production in tokens with seven different stress-shifting suffixes; i.e., Level 1 [+cyclic] derivations to phonologists. Stress productions are then systematically analyzed and compared with those of speakers of Midwest American English using the acoustic phonetic software, Praat. In total, one hundred subjects participated in the study, spread evenly across the three language groups, and 2,125 vowels in 800 spectrograms were analyzed (excluding stress placement and pronunciation errors). Nonnative speakers completed a sociometric survey prior to recording so that statistical sampling techniques could be used to evaluate acquisition of accurate ELS production. The speech samples of native speakers were analyzed to provide norm values for cross-reference and to provide insights into the proposed Salience Hierarchy of the Acoustic Correlates of Stress (SHACS). The results support the notion that a SHACS does exist in the L1 sound system, and that native-like command of this system through accurate ELS production can be acquired by proficient L2 learners via increased L2 input. Other findings raise questions as to the accuracy of standard American English dictionary pronunciations as well as the generalizability of claims made about the acoustic properties of tonic accent shift.

    Phonetic complexity affects children’s Mandarin tone production accuracy in disyllabic words: A perceptual study

    This is the first study to examine the effect of phonetic contexts on children’s lexical tone production. Mandarin tones in disyllabic words produced by forty-four 2- to 6-year-old children and twelve mothers were low-pass filtered to eliminate lexical information. Native Mandarin-speaking adults categorized the tones based on the pitch information in the filtered stimuli. All mothers’ tones were categorized with ceiling accuracy. Counter to the findings in most previous studies on children’s tone acquisition and the prevailing assumption in models of speech development that children acquire suprasegmental features much earlier than segmental features, this study found that children as old as six years of age have not mastered the production of Mandarin tones. Children’s tones were judged with significantly lower accuracy than mothers’ productions. Tone accuracy improved, while cross subject variability in tone accuracy decreased, with age. Children’s tone accuracy was affected by the articulatory complexity of phonetic contexts. Children made more errors in tone combinations with more complex fundamental frequency (F0) contours than tone sequences with simpler F0 changes. When producing disyllabic tone sequences with complex F0 contours, children tended to shift the F0 contour of the first tone to reduce the F0 change, resulting in more tone errors in the first syllable than in the second syllable and showing substantially more anticipatory coarticulation than adults. The results provide further evidence that acquisition of lexical tones is a protracted process in children. Tones produced accurately by children in one phonetic context may not be produced correctly in another phonetic context. 
    Children demonstrate more anticipatory coarticulation in their disyllabic productions than adults, which may be attributed to children’s immature speech motor control in tone production, and is presumably a by-product of their inability to accomplish complex F0 changes within the syllable time-frame.

    The invalidity of rhythm class hypothesis

    Languages are said to be stress-timed, syllable-timed or mora-timed. In a stress-timed language, inter-stress intervals are or tend to be constant, hence, isochronous, while in a syllable-timed or mora-timed language, successive syllables or morae are or tend to be equal in duration. Empirical research has failed to find evidence of isochrony in any language, yet the hypothesis is now sustained by perception accounts or phonetic metrics that do not measure isochrony. We have re-examined the rhythm class hypothesis by looking for evidence of at least a tendency toward isochrony, through a comparison of English, an alleged stress-timed language, and Mandarin, an alleged syllable-timed language. The results show that in English, segments are not compressible to allow equal syllable duration, and syllables are incompressible to enable equal inter-stress interval duration and phrase duration. In contrast, Mandarin shows a small tendency toward both equal syllable duration and equal phrase duration. These findings are exactly the opposite of what would be predicted by the rhythm class hypothesis. We therefore argue that the hypothesis is not just flawed, but simply untenable, and the so-called rhythm classes should no longer be held as a basic fact of human language.
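
    The abstract above does not say how the tendency toward isochrony was quantified. As a hedged illustration only, the Python sketch below computes one generic measure of how equal a series of intervals is: the coefficient of variation of the interval durations, where 0 means perfectly equal intervals. The function names and the onset times are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def interval_durations(onset_times):
    """Return the durations between consecutive onsets (same unit as the input)."""
    return [later - earlier for earlier, later in zip(onset_times, onset_times[1:])]


def coefficient_of_variation(durations):
    """Population standard deviation divided by the mean; 0.0 means isochronous."""
    if len(durations) < 2:
        raise ValueError("need at least two intervals")
    return pstdev(durations) / mean(durations)


# Hypothetical onset times in seconds (illustrative values, not measured data).
even_onsets = [0.0, 0.5, 1.0, 1.5]
uneven_onsets = [0.0, 0.2, 1.0]

even_cv = coefficient_of_variation(interval_durations(even_onsets))
uneven_cv = coefficient_of_variation(interval_durations(uneven_onsets))
print(even_cv, uneven_cv)
```

    Applied to inter-stress intervals or to syllable durations, a lower value would indicate a stronger tendency toward equal timing under this assumed measure.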

    Effects of part of speech: Primitive or derived from word frequency?

    Part of speech (POS hereafter) is known to affect both duration and F0, such that nouns are longer and higher in F0 than verbs. In this study we tested the hypothesis that the POS effects are actually a word frequency effect, and that this effect is predictable from information theory. We tested this hypothesis by comparing 44 phonologically matched noun-verb pairs in Mandarin. Results show that there were clear effects of word frequency on duration, but no effects on F0. In contrast, no effects of POS were found on either duration or F0. We conclude that there are no primitive POS effects on duration or F0, but the frequency effect on duration may lead to a weak POS effect given sufficient corpus size.

    Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady State Vowel Categorization

    Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. The transformation from speaker-dependent to speaker-independent language representations enables speech to be learned and understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models. National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)

    Pre-Low Raising in Japanese Pitch Accent

    Japanese has been observed to have 2 versions of the H tone, the higher of which is associated with an accented mora. However, the distinction of these 2 versions only surfaces in context but not in isolation, leading to a long-standing debate over whether there is 1 H tone or 2. This article reports evidence that the higher version may result from a pre-low raising mechanism rather than being inherently higher. The evidence is based on an analysis of F0 of words that varied in length, accent condition and syllable structure, produced by native speakers of Japanese at 2 speech rates. The data indicate a clear separation between effects that are due to mora-level preplanning and those that are mechanical. These results are discussed in terms of mechanisms of laryngeal control during tone production, and highlight the importance of articulation as a link between phonology and surface acoustics.

    Analyzing Prosody with Legendre Polynomial Coefficients

    This investigation demonstrates the effectiveness of Legendre polynomial coefficients representing prosodic contours within the context of two different tasks: nativeness classification and sarcasm detection. By making use of accurate representations of prosodic contours to answer fundamental linguistic questions, we contribute significantly to the body of research focused on analyzing prosody in linguistics as well as modeling prosody for machine learning tasks. Using Legendre polynomial coefficient representations of prosodic contours, we answer prosodic questions about differences in prosody between native English speakers and non-native English speakers whose first language is Mandarin. We also learn more about prosodic qualities of sarcastic speech. We additionally perform machine learning classification for both tasks, achieving an accuracy of 72.3% for nativeness classification and 81.57% for sarcasm detection. We recommend that linguists looking to analyze prosodic contours make use of Legendre polynomial coefficient modeling; the accuracy and quality of the resulting prosodic contour representations make them highly interpretable for linguistic analysis.
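
    As a minimal sketch of the kind of representation this abstract describes, the Python fragment below projects an evenly sampled contour onto the first few Legendre polynomials. The function names, the default degree of 3, and the sample F0 values are illustrative assumptions, not details from the study, and the trapezoidal projection is only one simple way to estimate the coefficients.

```python
def legendre_basis(x, degree):
    """Return [P_0(x), ..., P_degree(x)] using Bonnet's recurrence."""
    values = [1.0, x]
    for n in range(1, degree):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: degree + 1]


def legendre_coefficients(contour, degree=3):
    """Approximate Legendre coefficients of an evenly sampled contour.

    Samples are mapped onto [-1, 1]. Each coefficient is the projection
    c_n = (2n + 1) / 2 * integral of f(x) * P_n(x) dx over [-1, 1],
    estimated here with the trapezoidal rule.
    """
    count = len(contour)
    if count < 2:
        raise ValueError("need at least two samples")
    step = 2.0 / (count - 1)
    coefficients = [0.0] * (degree + 1)
    for i, value in enumerate(contour):
        x = -1.0 + i * step
        # Trapezoidal rule: the two end points get half weight.
        weight = step / 2 if i in (0, count - 1) else step
        for n, p in enumerate(legendre_basis(x, degree)):
            coefficients[n] += (2 * n + 1) / 2 * weight * value * p
    return coefficients


# Hypothetical F0 contour in Hz: a linear rise from 170 Hz to 230 Hz.
contour = [170.0 + 60.0 * i / 200 for i in range(201)]
print([round(c, 1) for c in legendre_coefficients(contour)])
```

    In this sketch the first coefficient tracks the mean level of the contour, the second its overall slope, and the higher ones its curvature. Where NumPy is available, `numpy.polynomial.legendre.legfit` offers a least-squares alternative to the projection used here.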

    Strategies for analyzing tone languages

    This paper outlines a method of auditory and acoustic analysis for determining the tonemes of a language starting from scratch, drawing on the author’s experience of recording and analyzing tone languages of north-east India. The methodology is applied to a preliminary analysis of tone in the Thang dialect of Khiamniungan, a virtually undocumented language of extreme eastern Nagaland and adjacent areas of the Sagaing Division, Myanmar (Burma). Following a discussion of strategies for ensuring that data appropriate for tonal analysis will be recorded, the practical demonstration begins with a description of how tone categories can be established according to their syllable type in the preliminary auditory analysis. The paper then uses this data to describe a method of acoustic analysis that ultimately permits the representation of pitch shapes as a function of absolute mean duration. The analysis of grammatical tones, floating tones and tone sandhi is exemplified with Mongsen Ao data, and a description of a perception test demonstrates how this can be used to corroborate the auditory and acoustic analysis of a tone system. *This paper is in the series How to Study a Tone Language, edited by Steven Bird and Larry Hyman. National Foreign Language Resource Center