    Prosodic analysis of foreign-accented English

    This study compares utterances by Vietnamese learners of Australian English with those of native speakers. In a previous study the utterances had been rated for foreign accent and intelligibility; here we aim to find measurable prosodic differences that account for those perceptual results. Our outcomes indicate, inter alia, that unaccented syllables are relatively longer, compared with accented ones, in the Vietnamese corpus than in the Australian English corpus. Furthermore, the correlations between the syllabic durations in utterances of one and the same sentence are much higher for Australian English subjects than for Vietnamese learners of English. Vietnamese speakers also use a larger f0 range and produce more pitch accents than Australian speakers.
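    The duration-correlation measure described above can be sketched in a few lines. All syllable durations below are invented for illustration, and `pearson` is a plain implementation of the Pearson coefficient, not the study's actual tooling:

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

# Hypothetical syllable durations (seconds) for one sentence:
native_a = [0.12, 0.20, 0.09, 0.25, 0.11]  # native speaker A
native_b = [0.13, 0.19, 0.10, 0.24, 0.12]  # native speaker B: similar pattern
learner  = [0.18, 0.17, 0.19, 0.18, 0.20]  # learner: flattened duration contrast

print(pearson(native_a, native_b))  # close to 1: shared duration pattern
print(pearson(native_a, learner))   # much lower: pattern not reproduced
```

    A high coefficient between two renditions means the speakers lengthen and shorten the same syllables; the flattened learner pattern yields a much lower value, mirroring the native/learner contrast reported above.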

    Visual cues in Mandarin tone perception

    This paper presents results concerning the exploitation of visual cues in the perception of Mandarin tones. The lower part of a female speaker's face was recorded on digital video as she uttered 25 sets of syllabic tokens covering the four tones of Mandarin. In a subsequent perception study, the audio track alone, as well as an audio-plus-video condition, was presented to native Mandarin speakers, who were required to decide which tone they perceived. Audio was presented in various conditions: clear, babble-noise masked at different SNR levels, and devoiced and amplitude-modulated noise conditions created with LPC resynthesis. In the devoiced and clear audio conditions, adding video yields little improvement over audio alone. The addition of visual information did, however, significantly improve perception in the babble-noise masked condition, and this effect increased with decreasing SNR. This outcome suggests that the improvement in noise-masked conditions is not due to additional information in the video per se, but rather to an early integration of acoustic and visual cues that facilitates auditory-visual speech perception.
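    Masking speech at a controlled SNR, as in the listening conditions above, amounts to scaling the noise so that the speech-to-noise power ratio hits the target before mixing. A minimal sketch of that standard procedure (the function name and the stand-in signals are mine, not from the paper):

```python
import math

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) equals `snr_db`,
    then mix it into `speech` sample by sample."""
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]

# Example: a 1 kHz tone as "speech", a second tone as a stand-in for babble
fs = 8000
speech = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(fs)]
noise = [math.sin(2 * math.pi * 333 * t / fs + 1.0) for t in range(fs)]
masked_0db = mix_at_snr(speech, noise, 0.0)    # noise as strong as speech
masked_m6db = mix_at_snr(speech, noise, -6.0)  # noise 6 dB stronger
```

    Decreasing `snr_db` raises the noise gain, which is how the progressively harder listening conditions in the study would be generated.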

    Map task dialogs in noise: a paradigm for examining Lombard speech

    This paper presents a paradigm for comparing auditory-visual map task dialogs produced in silence and in noise, the latter eliciting Lombard speech. A previously developed temporal filtering algorithm, which removes ambient noise from recordings of Lombard speech by locating and subtracting a recording of the noise made in the same environment, was modified to accommodate longer recordings. The filtering algorithm yields overall noise attenuation between 15 and 35 dB without distorting the speech signal as spectral filtering approaches do. On a small production dataset with two levels of vehicle and babble noise we examined the effect on fundamental frequency and intensity contours. We found that the Lombard characteristics of speech, that is, an increase in mean F0 as well as in intensity, are stronger for babble than for vehicle noise. There are indications that talkers become habituated to the noisy environment when they are exposed to it for the duration of a dialog. We did not find any consistent effect on the speed of completing the map task, although participants appeared to solve the task more leisurely in silence than in noise. Eye-tracking of one of the talkers showed that the frequency of gaze was more than twice as high in babble noise as in silence.
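    The subtract-after-aligning idea behind the temporal filtering can be illustrated compactly: find the lag at which a separate recording of the noise best matches the noisy recording (here by maximizing cross-correlation), then subtract the aligned copy. This is a toy sketch of the principle only, not the authors' implementation, which additionally has to cope with long recordings and real room acoustics:

```python
import math
import random

def best_lag(recording, noise_ref, max_lag):
    """Lag (in samples) at which noise_ref best aligns with recording,
    found by maximizing cross-correlation over lags 0..max_lag."""
    n = len(noise_ref) - max_lag  # window that fits inside recording at every lag
    scores = {lag: sum(recording[lag + i] * noise_ref[i] for i in range(n))
              for lag in range(max_lag + 1)}
    return max(scores, key=scores.get)

def subtract_noise(recording, noise_ref, lag):
    """Subtract the aligned noise reference from the recording (time domain)."""
    cleaned = list(recording)
    for i, sample in enumerate(noise_ref):
        if 0 <= lag + i < len(cleaned):
            cleaned[lag + i] -= sample
    return cleaned

# Toy demonstration: bury a quiet tone under loud noise at a 3-sample offset.
random.seed(1)
noise = [random.uniform(-1.0, 1.0) for _ in range(200)]
clean = [0.01 * math.sin(0.2 * t) for t in range(205)]
noisy = list(clean)
for i, sample in enumerate(noise):
    noisy[3 + i] += sample

lag = best_lag(noisy, noise, max_lag=5)
recovered = subtract_noise(noisy, noise, lag)
```

    Because the subtraction happens in the time domain with the exact noise waveform, the speech samples themselves are untouched, which is the property the abstract contrasts with spectral filtering.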

    Are there facial correlates of Thai syllabic tones?

    This paper deals with the influence of tones on syllabic articulation in Thai. The motion of 24 points on the face of a female speaker was captured with an Optotrak system as she uttered 24 sets of syllabic tokens covering the five tones of Thai, 12 times each. After rigid and non-rigid movements had been separated, a PCA was conducted on the non-rigid data. To determine the influence of the tones on facial movement, the first PC, which reflects jaw opening, was analyzed by aligning the derivatives of the first-PC trajectories at the point of maximum velocity and averaging over all tokens of a syllable/tone combination. The analysis showed great similarities in the shapes of the resulting mean velocity contours. In some syllable sets, however, certain tones exhibited specific temporal alignments that were strongly correlated with the underlying syllable duration. This outcome suggests that certain syllable/tone combinations require a specific temporal alignment of articulatory and tonal gestures, though a consistent physiological explanation has yet to be found.
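    The peak-velocity alignment and averaging step can be sketched as follows. The trajectories are invented, and the code mirrors only the averaging idea, not the actual Optotrak processing pipeline:

```python
def velocity(traj):
    """First difference of a trajectory (sample-to-sample velocity)."""
    return [b - a for a, b in zip(traj, traj[1:])]

def align_and_average(trajectories):
    """Align each velocity contour at its maximum-velocity sample and
    average sample-wise over the region covered by every token."""
    vels = [velocity(t) for t in trajectories]
    peaks = [v.index(max(v)) for v in vels]
    left = min(peaks)                                     # samples before every peak
    right = min(len(v) - p for v, p in zip(vels, peaks))  # samples from peak onward
    aligned = [v[p - left:p + right] for v, p in zip(vels, peaks)]
    return [sum(col) / len(aligned) for col in zip(*aligned)]

# Two hypothetical first-PC trajectories of the same syllable, offset in time:
token_a = [0.0, 0.0, 1.0, 3.0, 6.0, 7.0, 7.0]
token_b = [0.0, 1.0, 3.0, 6.0, 7.0, 7.0, 7.0]
mean_velocity = align_and_average([token_a, token_b])
```

    Anchoring every token at its own velocity peak removes the token-to-token timing offsets, so the mean contour reflects the shared shape of the gesture rather than when it happened.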

    Directions for the future of technology in pronunciation research and teaching

    This paper reports on the role of technology in state-of-the-art pronunciation research and instruction, and makes concrete suggestions for future developments. The point of departure for this contribution is that the goal of second language (L2) pronunciation research and teaching should be enhanced comprehensibility and intelligibility as opposed to native-likeness. Three main areas are covered here. We begin with a presentation of advanced uses of pronunciation technology in research with a special focus on the expertise required to carry out even small-scale investigations. Next, we discuss the nature of data in pronunciation research, pointing to ways in which future work can build on advances in corpus research and crowdsourcing. Finally, we consider how these insights pave the way for researchers and developers working to create research-informed, computer-assisted pronunciation teaching resources. We conclude with predictions for future developments. This article is published as O'Brien, M.G., Derwing, T.M., Cucchiarini, C., Hardison, D.M., Mixdorff, H., Thomson, R.I., Strik, H., Levis, J.M., Munro, M.J., Foote, J.A., & Levis, G.M., Directions for the future of technology in pronunciation research and teaching. Journal of Second Language Pronunciation 4:2 (2018), pp. 182–207. doi: 10.1075/jslp.17001.obr