283 research outputs found

Articulatory-WaveNet: Deep Autoregressive Model for Acoustic-to-Articulatory Inversion

Author: Agha Seyed Mirza Bozorg Narjes Alsadat
Publication venue: UKnowledge
Publication date: 01/01/2020

Acoustic-to-Articulatory Inversion, the estimation of articulatory kinematics from speech, is an important problem which has received significant attention in recent years. Estimated articulatory movements from such models can be used for many applications, including speech synthesis, automatic speech recognition, and facial kinematics for talking-head animation devices. Knowledge about the position of the articulators can also be extremely useful in speech therapy systems and in Computer-Aided Language Learning (CALL) and Computer-Aided Pronunciation Training (CAPT) systems for second language learners. Acoustic-to-Articulatory Inversion is a challenging problem due to the complexity of articulation patterns and significant inter-speaker differences. It is even more challenging when applied to non-native speakers without any kinematic training data. This dissertation attempts to address these problems through the development of upgraded architectures for Articulatory Inversion. The proposed Articulatory-WaveNet architecture is based on a dilated causal convolutional layer structure that improves Acoustic-to-Articulatory Inversion estimates for both speaker-dependent and speaker-independent scenarios. The system has been evaluated on the ElectroMagnetic Articulography corpus of Mandarin Accented English (EMA-MAE), consisting of 39 speakers including both native English speakers and Mandarin-accented English speakers. Results show that Articulatory-WaveNet significantly improves the performance of the speaker-dependent and speaker-independent Acoustic-to-Articulatory Inversion systems compared to the previously reported results.
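
The abstract names a dilated causal convolutional layer structure without giving details. As a rough illustration of that general idea only, and not of the Articulatory-WaveNet architecture itself, the following pure-Python sketch applies a stack of dilated causal 1-D convolutions to a toy signal. The function names, the kernel size of 2, the dilation schedule (1, 2, 4) and the all-ones weights are assumptions chosen for readability.

```python
def dilated_causal_conv1d(x, weights, dilation):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before the start of the signal are treated as zero, so each
    output depends only on the current and earlier inputs (causality).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output of the stack."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical toy input: a unit impulse at t = 0.
signal = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# Illustrative stack: kernel size 2, dilations doubling per layer.
out = signal
for d in (1, 2, 4):
    out = dilated_causal_conv1d(out, [1.0, 1.0], d)

print(out)                            # the impulse spreads over 8 samples
print(receptive_field(2, [1, 2, 4]))  # 8
```

Doubling the dilation at each layer lets the receptive field grow exponentially with depth while the number of weights grows only linearly, which is the usual motivation for this layer structure.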

University of Kentucky

Development of Kinematic Templates for Automatic Pronunciation Assessment Using Acoustic-to-Articulatory Inversion

Author: Jones Deriq K.
Publication venue: e-Publications@Marquette
Publication date: 01/07/2017

Computer-aided pronunciation training (CAPT) is a subcategory of computer-aided language learning (CALL) that deals with the correction of mispronunciation during language learning. For a CAPT system to be effective, it must provide useful and informative feedback that is comprehensive, qualitative, quantitative, and corrective. While the majority of modern systems address the first three aspects of feedback, most of these systems do not provide corrective feedback. As part of the National Science Foundation (NSF) funded study “RI: Small: Speaker Independent Acoustic-Articulator Inversion for Pronunciation Assessment”, the Marquette Speech and Swallowing Lab and Marquette Speech and Signal Processing Lab are conducting a pilot study on the feasibility of using acoustic-to-articulatory inversion for CAPT. Evaluating the results of a speaker’s acoustic-to-articulatory inversion to determine pronunciation accuracy requires kinematic templates. The templates would represent the vowels, consonant clusters, and stress characteristics of a typical American English (AE) speaker in the midsagittal plane. The Marquette University electromagnetic articulography Mandarin-accented English (EMA-MAE) database, which contains acoustic and kinematic speech data for 40 speakers (20 of whom are native AE speakers), provides the data used to form the kinematic templates. The objective of this work is the development and implementation of these templates. The data provided in the EMA-MAE database are analyzed in detail, and the information obtained from the analysis is used to develop the kinematic templates. The vowel templates are designed as sets of concentric confidence ellipses, which specify (in the midsagittal plane) the ranges of tongue and lip positions corresponding to correct pronunciation. These ranges were defined using the typical articulator positioning of all English speakers of the EMA-MAE database.
The data from these English speakers were also used to model the magnitude, speed history, movement pattern, and duration (MSTD) features of each consonant cluster in the EMA-MAE corpus. Cluster templates were designed as sets of average MSTD parameters across English speakers for each cluster. Finally, English stress characteristics were similarly modeled as a set of average magnitude, speed, and duration parameters across English speakers. The kinematic templates developed in this work, while still in early stages, form the groundwork for assessment of features returned by the acoustic-to-articulatory inversion system. This in turn allows for assessment of articulatory inversion as a pronunciation training tool.
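
The abstract describes vowel templates as sets of concentric confidence ellipses but does not say how they were computed. One common construction, shown here only as a sketch under the assumption that positions are roughly bivariate Gaussian, fits a mean and covariance to 2-D sensor positions and tests which ellipse a new position falls inside using the squared Mahalanobis distance. The function names, the confidence levels and the sample coordinates are all hypothetical.

```python
import math


def fit_gaussian_2d(points):
    """Return the sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance of a point for a 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def innermost_level(point, mean, cov, levels=(0.5, 0.9, 0.99)):
    """Return the smallest confidence level whose ellipse contains the point.

    For a bivariate Gaussian, the ellipse at level p is the set of points
    with squared Mahalanobis distance <= -2 * ln(1 - p), the chi-square
    quantile with two degrees of freedom. Returns None if the point lies
    outside every ellipse.
    """
    d2 = mahalanobis_sq(point, mean, cov)
    for p in sorted(levels):
        if d2 <= -2.0 * math.log(1.0 - p):
            return p
    return None


# Hypothetical midsagittal positions (arbitrary units) for one vowel.
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
mean, cov = fit_gaussian_2d(reference)

print(innermost_level((1.0, 1.0), mean, cov))  # at the centre: 0.5
print(innermost_level((3.0, 1.0), mean, cov))  # further out: 0.9
print(innermost_level((5.0, 1.0), mean, cov))  # outside all ellipses: None
```

Reporting the innermost enclosing level, rather than a single inside/outside flag, is one way a set of concentric ellipses could grade how far a production lies from typical positioning.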

epublications@Marquette

Modeling the development of pronunciation in infant speech acquisition.

Author: Howard IS
Messum P
Publication venue: 'United States Sports Academy'
Publication date: 01/01/2011

Pronunciation is an important part of speech acquisition, but little attention has been given to the mechanism or mechanisms by which it develops. Speech sound qualities, for example, have just been assumed to develop by simple imitation. In most accounts this is then assumed to be by acoustic matching, with the infant comparing his output to that of his caregiver. There are theoretical and empirical problems with both of these assumptions, and we present a computational model, Elija, that does not learn to pronounce speech sounds this way. Elija starts by exploring the sound-making capabilities of his vocal apparatus. Then he uses the natural responses he gets from a caregiver to learn equivalence relations between his vocal actions and his caregiver's speech. We show that Elija progresses from a babbling stage to learning the names of objects. This demonstrates the viability of a non-imitative mechanism in learning to pronounce.

Plymouth Electronic Archive and Research Library

Development and Clinical Application of Instruments to Measure Orofacial Structures

Author: Amanda Freitas Valentim
Andréa Rodrigues Motta
Cláudio Gomes da Costa
Estevam Barbosa de Las Casas
Iracema Maria Utsch Braga
Monalise Costa Batista Berbert
Márcio Falcão Santos Barroso
Renata Maria Moreira Moraes Furlan
Tatiana Vargas de Castro Perilo
Publication venue: 'IntechOpen'
Publication date: 23/03/2012

IntechOpen

Phonological problems in teaching French to American high school students

Author: McLaughlin B. Marjorie.
Publication date: 01/01/1968

Call number: LD2668 .R4 1968 M325

K-State Research Exchange

Visible movements of the orofacial area: evidence for gestural or multimodal theories of language evolution?

Author: Orzechowski Sylwester
Wacewicz Sławomir
Żywiczyński Przemysław
Publication venue: 'John Benjamins Publishing Company'
Publication date: 01/01/2016

The age-old debate between the proponents of the gesture-first and speech-first positions has returned to occupy a central place in current language evolution theorizing. The gestural scenarios, suffering from the problem known as “modality transition” (why a gestural system would have changed into a predominantly spoken system), frequently appeal to the gestures of the orofacial area as a platform for this putative transition. Here, we review currently available evidence on the significance of the orofacial area in language evolution. While our review offers some support for orofacial movements as an evolutionary “bridge” between manual gesture and speech, we see the evidence as far more consistent with a multimodal approach. We also suggest that, more generally, the “gestural versus spoken” formulation is limiting and would be better expressed in terms of the relative input and interplay of the visual and vocal-auditory sensory modalities.

Repository of Nicolaus Copernicus University

An introduction to English phonetics for the Estonian learner

Author: Mutt Oleg
Publication venue: Tartu Ülikool
Publication date: 01/01/1971

http://www.ester.ee/record=b1355649*es

DSpace at Tartu University Library

The Effectiveness of Articulatory Approach in Improving First Semester Students' Pronunciation Competence of English Education Department at UIN Alauddin Makassar

Author: Syahrir Musayyadah
Publication date: 01/01/2016

The data were analyzed using descriptive statistics (frequency, mean score, and standard deviation) and inferential statistics (independent-samples t-test). The research concluded that the students’ pronunciation competence improved through applying the articulatory approach, as shown by the increase in the mean score of the experimental class from 48.81 in the pretest to 68.42 in the posttest. The result of the t-test also showed that the articulatory approach is effective in improving students’ pronunciation competence, because the t-test value, 2.296, is higher than the t-table value, 2.000 (2.296 > 2.000).
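
The decision rule quoted in this abstract compares a computed t statistic with a tabulated critical value. The sketch below illustrates that rule with a pooled-variance independent-samples t statistic; the abstract does not state which variant of the test was used, so the pooled form is an assumption, and the function names and the two small score lists are hypothetical. Only the values 2.296 and 2.000 come from the abstract.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Independent-samples t statistic assuming equal group variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    std_err = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(a) - mean(b)) / std_err


def exceeds_critical(t_stat, t_critical):
    """Two-sided decision: the difference is significant if |t| > critical."""
    return abs(t_stat) > t_critical


# Values quoted in the abstract: t = 2.296 against a t-table value of 2.000.
print(exceeds_critical(2.296, 2.000))  # True

# Hypothetical scores, only to show how a t statistic is computed.
experimental = [60, 70, 80]
control = [50, 60, 70]
print(round(pooled_t_statistic(experimental, control), 4))  # 1.2247
```

The critical value depends on the significance level and the degrees of freedom (group sizes minus two for the pooled test), neither of which is reported in the abstract.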

Repositori UIN Alauddin Makassar

The phonetics of colloquial Tamil

Author: Balasubramanian T.
Publication venue: The University of Edinburgh
Publication date: 01/01/1972

Edinburgh Research Archive

Articulation in brass playing : the tongue - friend or foe?

Author: Ayers Angela Gillian
Publication venue: College of Music
Publication date: 01/01/2004

Bibliography: leaves 97-99. This dissertation attempts to demonstrate the role the tongue plays in articulation in brass playing. It briefly examines oral anatomy, physiology and theories on motor learning, and describes the tongue's position in producing English speech sounds. It shows how these positions are used to teach different articulation techniques on the various brass instruments. Articulation styles and tonguing exercises that could help improve tongue articulation are highlighted. It is hoped that these highlights will add insight for both present and future brass teachers.

Cape Town University OpenUCT