185 research outputs found

Automatic prosodic analysis for computer aided pronunciation teaching

Author: Bagshaw Paul Christopher
Publication venue: The University of Edinburgh
Publication date: 01/01/1994

Correct pronunciation of spoken language requires the appropriate modulation of acoustic characteristics of speech to convey linguistic information at a suprasegmental level. Such prosodic modulation is a key aspect of spoken language and is an important component of foreign language learning, for purposes of both comprehension and intelligibility. Computer aided pronunciation teaching involves automatic analysis of the speech of a non-native talker in order to provide a diagnosis of the learner's performance in comparison with the speech of a native talker. This thesis describes research undertaken to automatically analyse the prosodic aspects of speech for computer aided pronunciation teaching. It is necessary to describe the suprasegmental composition of a learner's speech in order to characterise significant deviations from a native-like prosody, and to offer some kind of corrective diagnosis. Phonological theories of prosody aim to describe the suprasegmental composition of speech.

CiteSeerX

Edinburgh Research Archive

On the Use of Wavelets and Cepstrum Excitation for Pitch Determination in Real-Time

Author: Bahja Fadoua
Di Martino Joseph
Ibn Elhaj El Hassan
Publication venue: HAL CCSD
Publication date: 10/05/2012

In the current paper, we propose a new pitch tracking technique based on a wavelet transform in the temporal domain. Our algorithm is designed to determine the pitch frequency of the speech signal using a simple voicing decision algorithm. The pitch period is extracted from the cepstrum excitation signal processed by a wavelet transform; then the pitch contour is refined by thresholding and correction algorithms without any post-processing. The results obtained show that the proposed algorithm provides very good pitch contours compared to those furnished by the Bagshaw database.

INRIA a CCSD electronic archive server

Hal-Diderot

Using Deep Neural Networks for Smoothing Pitch Profiles in Connected Speech

Author: Ferro Michele
Tamburini Fabio
Publication venue: OpenEdition
Publication date: 15/12/2020

This paper presents a new pitch tracking smoother based on deep neural networks (DNN). It leverages Long Short-Term Memories, a particular kind of recurrent neural network, for correcting pitch detection errors produced by state-of-the-art Pitch Detection Algorithms. The proposed system has been extensively tested using two reference benchmarks for English and exhibited very good performance in correcting the outputs of pitch detection algorithms when compared with the gold standard obtained with laryngographs.

OpenEdition

A corroborative study on improving pitch determination by time–frequency cepstrum decomposition using wavelets

Author
Publication venue: Springer
Publication date

Springer - Publisher Connector

Proceedings of the Sixteenth Australasian International Conference on Speech Science and Technology

Author
Publication venue: ASSTA
Publication date: 31/12/2016

UCL Discovery

Exploring Speech Technologies for Language Learning

Author: DELMONTE R.
Publication venue: IntechOpen
Publication date: 01/01/2011

The teaching of the pronunciation of any foreign language must encompass both segmental and suprasegmental aspects of speech. In computational terms, the two levels of language learning activities can be decomposed at least into phonemic aspects, which include the correct pronunciation of single phonemes and the co-articulation of phonemes into higher phonological units, and prosodic aspects, which include: the correct position of stress at word level; the alternation of stressed and unstressed syllables in terms of compensation and vowel reduction; the correct position of sentence accent; the generation of adequate rhythm from the interleaving of stress, accent, and phonological rules; and the generation of an adequate intonational pattern for each utterance related to communicative functions. As appears from the above, for a student to communicate intelligibly and as close as possible to a native speaker's pronunciation, prosody is very important [3]. We also assume that incorrect prosody may hamper communication, and this may be regarded as a strong motivation for having the teaching of prosody as an integral part of any language course. From our point of view, it is much more important to stress the achievement of successful communication as the main objective of a second language learner rather than the overcoming of what has been termed “foreign accent”, which can be deemed a secondary goal. In any case, the two goals are certainly not coincident, even though they may overlap in some cases. We will discuss these matters in the following sections. All prosodic questions related to “rhythm” will be discussed in the first section of this chapter. In [4] the author argues in favour of prosodic aids, in particular because a strong placement of word stress may impair understanding from the listener’s point of view of the word being pronounced. He also argues in favour of acquiring correct timing of phonological units to overcome the impression of “foreign accent” which may ensue from an incorrect distribution of stressed vs. unstressed stretches of linguistic units such as syllables or metric feet. Timing is not to be confused with speaking rate, which need not be increased forcefully to give the impression of good fluency: trying to increase speaking rate may result in lower intelligibility. The question of “foreign accent” is also discussed at length in (Jilka M., 1999). This work is particularly relevant as far as the intonational features of a learner of a second language are concerned, which we will address in the second section of this chapter. Correcting the Intonational Foreign Accent (hence IFA) is an important component of a Prosodic Module for self-learning activities, as categorical aspects of the intonation of the two languages in contact, L1 and L2, are far apart and thus neatly distinguishable. Choice of the two languages in contact is determined mainly by the fact that the distance in prosodic terms between English and Italian is maximal, according to (Ramus, F. and J. Mehler, 1999; Ramus F., et al., 1999).

Archivio Ricerca Ca'Foscari

IntechOpen

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Prosodic tools for language learning

Author: A. Batliner
D. Klatt
D. M. Chun
D. Pisoni
D. S. Hurley
E. Shriberg
F. Ramus
I. Lehiste
J. Bernstein
J. D. Bowen
L. Loveday
M. Eskénazi
M. J. Luthy
N. Umeda
O. R. Kelm
P. M. Bertinetto
R. Delmonte
Rodolfo Delmonte
S. Hiller
W. Campbell
Y. Medan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

In this paper we will be concerned with the role played by prosody in language learning and with the speech technology already available, as commercial products or as prototypes, that is capable of helping language learners improve their knowledge of a second language from the prosodic point of view. The paper has been divided into two separate sections: Section One, dealing with Rhythm and all related topics; Section Two, dealing with Intonation. In the Introduction we will argue that the use of ASR (Automatic Speech Recognition) as Teaching Aid should be under-utilized and should be targeted to narrowly focussed spoken exercises, disallowing open-ended dialogues, in order to ensure consistency of evaluation. Eventually, we will support the conjoined use of ASR technology and prosodic tools to produce GOP usable for linguistically consistent and adequate feedback to the student. This will be illustrated by presenting the state of the art for both sections, with systems well documented in the scientific literature of the respective field. In order to discuss the scientific foundations of prosodic analysis we will present data related to English and Italian and make comparisons to clarify the issues at hand. In this context, we will also present the Prosodic Module of a courseware for computer-assisted foreign language learning called SLIM, an acronym for Multimedia Interactive Linguistic Software, developed at the University of Venice (Delmonte et al. in Convegno GFS-AIA, pp. 47–58, 1996a; Ed-Media 96, AACE, pp. 326–333, 1996b). The Prosodic Module has been created in order to deal with the problem of improving a student’s performance both in the perception and production of prosodic aspects of spoken language activities. It is composed of two different sets of Learning Activities, the first one dealing with phonetic and prosodic problems at word level and at syllable level; the second one dealing with prosodic aspects at phonological phrase and utterance suprasegmental level. The main goal of Prosodic Activities is to ensure consistent and pedagogically sound feedback to the student intending to improve his/her pronunciation in a foreign language.

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Directions for the future of technology in pronunciation research and teaching

Author: Cucchiarini Catia
Derwing Tracey M.
Foote Jennifer A.
Hardison Debra M.
Levis Greta M.
Levis John M.
Mixdorff Hansjorg
Munro Murray J.
O'Brien Mary G.
Strik Helmer
Thomson Ron I.
Publication venue: Iowa State University Digital Repository
Publication date: 01/02/2019

This paper reports on the role of technology in state-of-the-art pronunciation research and instruction, and makes concrete suggestions for future developments. The point of departure for this contribution is that the goal of second language (L2) pronunciation research and teaching should be enhanced comprehensibility and intelligibility as opposed to native-likeness. Three main areas are covered here. We begin with a presentation of advanced uses of pronunciation technology in research with a special focus on the expertise required to carry out even small-scale investigations. Next, we discuss the nature of data in pronunciation research, pointing to ways in which future work can build on advances in corpus research and crowdsourcing. Finally, we consider how these insights pave the way for researchers and developers working to create research-informed, computer-assisted pronunciation teaching resources. We conclude with predictions for future developments.

Digital Repository @ Iowa State University (ISU)

Speech fundamental frequency tracking with efficient voicing detection (original title: Beszéd alapfrekvencia követés hatékony zöngésség detektálással)

Author: Bárdi Tamás
Publication venue
Publication date: 01/01/2004

Algorithms that determine the fundamental frequency of the speech signal, also known as pitch detectors, can only work correctly if the automatic voiced/unvoiced distinction is also reliable. Below we describe our pitch detector, in which voicing detection operates with a lower error percentage than in competing methods. Our algorithm is based on the well-known autocorrelation method. We examined the voicing detection power of our algorithm on a database in which a laryngograph signal was recorded in synchrony with the speech.

University of Szeged
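
Several of the abstracts listed above rely on pitch detection with a voiced/unvoiced decision, and the last entry names the autocorrelation method as its basis. The sketch below illustrates that general textbook technique only; it is not the algorithm of any listed paper. The function name `estimate_f0`, the 60–400 Hz search range and the 0.5 voicing threshold are assumptions chosen for illustration.

```python
import math
import random
from typing import Optional, Sequence


def estimate_f0(
    frame: Sequence[float],
    sample_rate: int,
    fmin: float = 60.0,              # assumed lower F0 bound in Hz
    fmax: float = 400.0,             # assumed upper F0 bound in Hz
    voicing_threshold: float = 0.5,  # assumed threshold, not from any paper
) -> Optional[float]:
    """Return an F0 estimate in Hz, or None if the frame is judged unvoiced."""
    n = len(frame)
    energy = sum(x * x for x in frame)  # autocorrelation at lag 0
    if energy == 0.0:
        return None  # silent frame: nothing to track

    # Candidate lags correspond to periods between 1/fmax and 1/fmin seconds.
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(n - 1, int(sample_rate / fmin))

    best_lag, best_value = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        normalised = r / energy  # 1.0 would mean perfect periodicity
        if normalised > best_value:
            best_lag, best_value = lag, normalised

    # Voicing decision: a weak autocorrelation peak means no clear periodicity.
    if best_lag == 0 or best_value < voicing_threshold:
        return None
    return sample_rate / best_lag


# Usage on synthetic frames (50 ms at 8 kHz): a 200 Hz tone and uniform noise.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(400)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(400)]
print(estimate_f0(tone, sample_rate))
print(estimate_f0(noise, sample_rate))
```

On the synthetic tone the estimate should come out close to 200 Hz, while the noise frame should be rejected as unvoiced. Real speech usually needs more than this, for example pre-filtering, windowing and tracking across frames, and evaluation against a laryngograph reference of the kind the abstracts mention.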