    Training Non-tonal Speakers in the Perception and Production of Mandarin Tones in Disyllabic Words

    This thesis addresses theoretical, methodological, and pedagogical issues of tonal acquisition in a second language (L2). The study investigated the effects of three training approaches on the perceptual and production learning of the four Mandarin lexical tones by groups of non-tonal beginning learners. The experiment employed a pretest-posttest paradigm. Fifteen non-tonal learners of Mandarin Chinese in Taiwan received two weeks of training as extracurricular activities. Based on learners’ choices, one group (the A Group, n=5) received perceptual training with auditory-only feedback, using four-way forced-choice identification tasks with immediate feedback. A second group (the AM Group, n=5) received perceptual training with auditory and meaning-bearing feedback (i.e., corresponding pictures and English equivalents of the stimuli), using the same identification tasks. A third group (the AV Group, n=5) received perceptual and production training with auditory and visual feedback showing pitch contours against which trainees could compare their own productions. The same training stimuli were used in all three approaches. A posttest and a generalization test were administered immediately after training. Pretest, posttest, and generalization test data in perception and production were collected from the three groups and compared to assess the effectiveness of the three training procedures. Percent correct scores, perceptual sensitivity and production accuracy for each tone, and tonal confusions were also analyzed. The posttest results showed that the three training groups improved significantly in perceptual accuracy of Mandarin tones compared with a control group (the C Group, n=6), and that perceptual learning generalized to new stimuli produced by a new speaker. The three training groups’ production accuracy of Mandarin tones also improved significantly at posttest.
More importantly, trainees who received auditory-only feedback (i.e., the A and AM groups) showed greater improvement in both perception and production of Mandarin lexical tones than those trained with audio-visual feedback (i.e., the AV group). The results indicate that the three training approaches are effective and that laboratory-based training techniques can be implemented as extracurricular activities. These findings imply that the A and AM training approaches employed in the study facilitate the learning of Mandarin tones and promote modification of listeners’ representations of L2 tonal properties. They also suggest that perception-only training with auditory-only feedback is sufficient to improve both perception and production of Mandarin tones.
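
The percent correct scores and tonal confusions mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis's own analysis procedure: the function name `score_identification`, the trial format, and the example data are all hypothetical, and perceptual sensitivity measures are left out.

```python
from collections import Counter

# The four Mandarin lexical tones used as response options
# in a four-way forced-choice identification task.
TONES = (1, 2, 3, 4)

def score_identification(trials):
    """Score identification trials.

    trials: iterable of (target_tone, response_tone) pairs (hypothetical format).
    Returns (percent_correct, confusion_matrix), where confusion_matrix[i][j]
    counts how often TONES[i] was identified as TONES[j].
    """
    trials = list(trials)
    if not trials:
        raise ValueError("no trials to score")
    confusions = Counter(trials)
    correct = sum(n for (target, response), n in confusions.items()
                  if target == response)
    percent_correct = 100.0 * correct / len(trials)
    matrix = [[confusions[(t, r)] for r in TONES] for t in TONES]
    return percent_correct, matrix

# Made-up example: Tone 2 is once misidentified as Tone 3.
example_trials = [(1, 1), (2, 3), (3, 3), (4, 4)]
percent, matrix = score_identification(example_trials)
```

Off-diagonal cells of the matrix show which tones are confused with which, and those patterns are what a tonal confusion analysis inspects.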

    L2 Speech Learning: perception, production & training

    Adult L2 learners have difficulty perceiving and producing L2 speech sounds. To analyze these learning problems, this study presents research data from a series of studies on L2 speech perception, production, and training. Section 1 investigates how the L1 sound system influences L2 speech perception. A recent study shows that phonetic differences and distances between English and Mandarin consonants predicted the difficulties that native English learners of Chinese had in perceiving Mandarin consonants. Section 2 explores the relationship between L2 speech perception and production and reports a subsequent study on Mandarin consonants showing that English learners of Chinese performed better in perception than in production on Mandarin retroflex sounds, but better in production than in perception on palatal sounds. This lack of alignment suggests that the relationship between L2 speech perception and production is not straightforward. In Section 3, two training experiments are reported and compared to explore the effects of phonetic training on the learning of English vowel and Mandarin tone contrasts.

    How tone, intonation and emotion shape the development of infants' fundamental frequency perception

    Fundamental frequency (ƒ0), perceived as pitch, is the first and arguably most salient auditory component humans are exposed to from the beginning of life. It carries multiple linguistic (e.g., word meaning) and paralinguistic (e.g., speakers’ emotion) functions in speech and communication. The mappings between these functions and ƒ0 features vary within a language and differ cross-linguistically. For instance, a rising pitch can be perceived as a question in English but as a lexical tone in Mandarin. Such variation means that infants must learn the specific mappings from their respective linguistic and social environments. To date, canonical theoretical frameworks and most empirical studies have not considered the multi-functionality of ƒ0 and have typically focused on individual functions. More importantly, although infants eventually master ƒ0 in communication, it is unclear how they learn to decompose and recognize the overlapping functions it carries. In this paper, we review the symbioses and synergies of the lexical, intonational, and emotional functions that ƒ0 can carry and that are acquired throughout infancy. On the basis of our review, we put forward the Learnability Hypothesis: infants decompose and acquire multiple ƒ0 functions through native/environmental experience. Under this hypothesis, we propose representative cases such as the synergy scenario, in which infants use visual cues to disambiguate and decompose the different ƒ0 functions. We also suggest viable ways to test the scenarios derived from this hypothesis across auditory and visual modalities. Discovering how infants learn to master the diverse functions carried by ƒ0 can increase our understanding of linguistic systems, auditory processing, and communication functions.

    The Pitch Range of Italians and Americans. A Comparative Study

    Linguistic experiments have investigated the nature of F0 span and level in cross-linguistic comparisons. However, only a few studies have focused on developing a generally agreed methodology that could provide a unifying approach to the analysis of pitch range (Ladd, 1996; Patterson and Ladd, 1999; Daly and Warren, 2001; Bishop and Keating, 2010; Mennen et al. 2012). Pitch variation is used in different languages to convey different linguistic and paralinguistic meanings, ranging from the expression of sentence modality to the marking of emotional and attitudinal nuances (Grice and Baumann, 2007). A number of factors have to be taken into consideration when determining whether measurable and reliable differences in pitch values exist. Daly and Warren (2001) demonstrated the importance of independent variables such as language, age, body size, speaker sex (female vs. male), socio-cultural background, regional accent, speech task (read sentences vs. spontaneous dialogues), sentence type (questions vs. statements) and measurement scale (Hertz, semitones, ERB etc.). Consistent with the model proposed by Mennen et al. (2012), my analysis of pitch range is based on LTD (long-term distributional) and linguistic measures. LTD measures describe the F0 distribution within a speaker’s contour (e.g. F0 minimum, F0 maximum, F0 mean, F0 median, standard deviation, F0 span), while linguistic measures are linked to specific targets within the contour, such as peaks and valleys (e.g. high and low landmarks), and preserve the temporal sequence of pitch contours. This investigation analyzed the characteristics of pitch range production and perception in English sentences uttered by Americans and Italians.
Four experiments were conducted to examine different phenomena: i) the contrast between measures of F0 level and span in utterances produced by Americans and Italians (experiments 1-2); ii) the contrast between the pitch range produced by males and females in L1 and L2 (experiment 1); iii) the F0 patterns in different sentence types, that is, yes-no questions, wh-questions, and exclamations (experiment 2); iv) listeners’ evaluations of pitch span in terms of ±interesting, ±excited, ±credible, ±friendly ratings of different sentence types (experiments 3-4); v) the correlation between the pitch span of the sentences and the evaluations given by American and Italian listeners (experiment 3); vi) the listeners’ evaluations of pitch span values in manipulated stimuli, whose F0 span was re-synthesized under three conditions: narrow span, original span, and wide span (experiment 4); vii) the different evaluations given to the sentences by male and female listeners. The results of this investigation supported the following generalizations. First, pitch span, more than pitch level, was found to be a cue for non-nativeness, because L2 speakers of English used a narrower span than the native norm. Moreover, the experimental data in the production studies indicated that sentence modality was better captured by F0 span than by F0 level. Second, the Italian learners of English were influenced by their L1 and transferred L1 pitch range variation into their L2. The English sentences produced by the Italians had overall higher pitch levels and narrower pitch span than those produced by the Americans. In addition, the Italians used overall higher pitch levels when speaking Italian and lower levels when speaking English. Conversely, their pitch span was generally wider in English and narrower in Italian.
When comparing productions in English, the Italian females used higher F0 levels than the American females, whereas the Italian males showed slightly lower F0 levels than the American males. Third, there was a systematic relation between pitch span values and the listeners’ evaluations of the sentences. The two groups of listeners (the Americans and the Italians) rated the stimuli with a larger pitch span as more interesting, exciting and credible than the stimuli with a narrower pitch span. Thus, the listeners relied on the perceived pitch span to differentiate among the stimuli. Fourth, both the American and the Italian speakers were considered more friendly when the pitch span of their sentences was widened (wide span manipulation) and less friendly when the pitch span was narrowed (narrow span manipulation). This held for all the stimuli regardless of the native language of the speakers (American vs. Italian).
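
The long-term distributional (LTD) measures listed in the abstract above (F0 minimum, maximum, mean, median, standard deviation, and span) can be sketched as follows. This is an illustrative sketch, not the author's procedure: the function names, the convention that 0 marks an unvoiced frame, the 100 Hz semitone reference, and the sample values are all assumptions.

```python
import math
import statistics

def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert a frequency in Hz to semitones relative to ref_hz.

    The 100 Hz reference is an assumption made for this sketch.
    """
    return 12.0 * math.log2(f0_hz / ref_hz)

def ltd_measures(f0_track_hz):
    """Compute LTD measures from a sequence of F0 samples in Hz.

    Samples equal to 0 are treated as unvoiced frames and skipped
    (an assumed convention, common in pitch-tracker output).
    """
    voiced = [f for f in f0_track_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced F0 samples")
    f0_min, f0_max = min(voiced), max(voiced)
    return {
        "min_hz": f0_min,
        "max_hz": f0_max,
        "mean_hz": statistics.mean(voiced),
        "median_hz": statistics.median(voiced),
        "sd_hz": statistics.stdev(voiced),  # sample standard deviation
        "span_hz": f0_max - f0_min,
        # Span in semitones is independent of the reference frequency.
        "span_st": 12.0 * math.log2(f0_max / f0_min),
    }

# Made-up F0 track with unvoiced frames at both ends.
measures = ltd_measures([0, 100.0, 200.0, 150.0, 0])
```

Reporting span in semitones as well as Hertz matters for comparisons across speakers, because a fixed Hertz difference corresponds to a smaller musical interval at higher pitch levels.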

    A study on form and function of prosody based on acoustics, interpretation, and modelling - with evidence from the analysis by synthesis of Mandarin speech prosody

    This dissertation presents an analysis-by-synthesis study of Mandarin speech prosody. The features of Mandarin speech prosody are discussed with a focus on two salient aspects: the function of prosody and the form of prosody. The study attempts to find a plausible way of mapping the two aspects onto each other through functional analysis of prosody and multi-level formal representation. The form of Mandarin speech prosody is a complex F0 picture because pitch contours are used simultaneously by lexical tones and sentential intonation. Tone sandhi in connected speech raises further puzzles for researchers confronted with the acoustic form of Mandarin prosody. Prosody in Mandarin speech functions at two levels: at the lexical level, for word identity (Tone1, Tone2, Tone3, Tone4, and Tone0); and at the sentential level, for prominence marking (sentence accents) and the indication of prosodic boundaries (intonation boundary tones). In the present study, the analysis of prosodic function at these two levels provides a basic framework for coding the surface melodic form of Mandarin prosody, which consists of pitch contours in tonal units and boundary tones at the beginning and end of the intonation unit. For the formal representation of Mandarin speech prosody, the surface F0 contour of each utterance is coded into a sequence of INTSINT symbols and passed to the Prozed tool for speech synthesis. The synthesized stimuli derived from the symbolic coding are shown to follow the melodic features closely and to express the prosodic function of the original Mandarin utterances correctly. The study employs acoustic data, symbolic coding, and speech synthesis to derive the mapping between prosodic function and form, with the aim of interpreting this complex prosodic phenomenon and providing insight for the annotation and analysis of Mandarin speech prosody.

    A Sound Approach to Language Matters: In Honor of Ocke-Schwen Bohn

    The contributions in this Festschrift were written by Ocke’s current and former PhD students, colleagues and research collaborators. The Festschrift is divided into six sections, moving from the smallest building blocks of language, through gradually expanding objects of linguistic inquiry, to the highest levels of description, all of which have formed part of Ocke’s career in connection with his teaching and/or his academic work: “Segments”, “Perception of Accent”, “Between Sounds and Graphemes”, “Prosody”, “Morphology and Syntax” and “Second Language Acquisition”. Each one of these illustrates a sound approach to language matters.

    Cross-Linguistic Perception and Learning of Mandarin Chinese Sounds by Japanese Adult Learners

    This dissertation presents a cross-linguistic investigation of how nonnative sounds are perceived by second language (L2) learners in terms of their first language (L1) categories for an understudied language pair: Japanese and Mandarin Chinese. A category mapping experiment empirically measured the perceived phonetic distances between Chinese sounds and the Japanese categories they most resemble, which generated testable predictions about the discriminability of Chinese sound contrasts according to the Perceptual Assimilation Model (PAM). A category discrimination experiment obtained data on L2 learners' actual performance in discriminating Chinese sounds. The discrepancy between PAM's predictions and actual performance revealed that PAM cannot be applied to L2 perceptual learning. It was suggested that the discriminability of L2 sound contrasts was determined not only by perceived phonetic distances but probably also by other factors, such as the distinctiveness of certain phonetic features, e.g. aspiration and retroflexion. The training experiment assessed the improvement in L2 learners' identification of Chinese sound contrasts after exposure to high-variability stimuli and feedback. The results not only demonstrated the effectiveness of training in shaping L2 learners' perception but also showed that the training effects generalized to new tokens spoken by unfamiliar talkers. In addition to perception, the production of Chinese sounds by Japanese learners was examined from a phonetic perspective in terms of perceived foreign accentedness. Regression of L2 learners' and native speakers' foreign accentedness ratings against acoustic measurements of their speech production revealed that although both segmental and suprasegmental variables contributed to the perception of foreign accent, suprasegmental variables such as tonal and intonation patterns were the most influential factors in predicting perceived foreign accent.
To conclude, PAM failed to accurately predict the learning difficulties of nonnative sounds faced by L2 learners solely on the basis of perceived phonetic distances. As the Speech Learning Model (SLM) hypothesizes, production was found to be driven by perception: equivalence classification of L2 sounds to L1 categories prevented the establishment of a new phonological category, which in turn resulted in divergence in L2 production. Although production was hypothesized to eventually resemble perception, asynchrony between production and perception was observed, owing to the different mechanisms involved.