
    The Electromagnetic Articulography Mandarin Accented English (EMA-MAE) Corpus of Acoustic and 3D Articulatory Kinematic Data

    There is a significant need for more comprehensive electromagnetic articulography (EMA) datasets that provide matched acoustic and articulatory kinematic data with good spatial and temporal resolution. The Marquette University Electromagnetic Articulography Mandarin Accented English (EMA-MAE) corpus provides kinematic and acoustic data from 40 gender- and dialect-balanced speakers: 20 Midwestern standard American English L1 speakers and 20 Mandarin Accented English (MAE) L2 speakers, half from the Beijing dialect region and half from the Shanghai dialect region. Three-dimensional EMA data were collected at a 400 Hz sampling rate using the NDI Wave system, with articulatory sensors on the midsagittal lips, lower incisors, and tongue blade and dorsum, plus the lateral lip corner and tongue body. Sensors provide three-dimensional position data as well as two-dimensional orientation data representing the orientation of the sensor plane. Data have been corrected for head movement relative to a fixed reference sensor and also adjusted using a biteplate calibration system to place the data in an articulatory working space relative to each subject's individual midsagittal and maxillary occlusal planes. Speech materials include isolated words chosen to focus on specific contrasts between the English and Mandarin languages, as well as sentences and paragraphs for continuous speech, totaling approximately 45 minutes of data per subject. A beta version of the EMA-MAE corpus is now available, and the full corpus is in preparation for public release to help advance research in areas such as pronunciation modeling, acoustic-articulatory inversion, L1-L2 comparisons, pronunciation error detection, and accent modification training.
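    The biteplate-referenced normalization described above amounts to re-expressing head-corrected sensor positions in a subject-specific coordinate frame defined by the maxillary occlusal and midsagittal planes. A minimal sketch of such a change of basis (the function names and calibration points below are illustrative, not the corpus's actual processing pipeline):

```python
import numpy as np

def occlusal_frame(origin, occlusal_pt, midsagittal_pt):
    """Build an orthonormal articulatory frame (hypothetical helper).

    origin: reference point on the biteplate (e.g. at the upper incisors).
    occlusal_pt: a second point lying on the maxillary occlusal plane.
    midsagittal_pt: a point in the midsagittal plane above the occlusal plane.
    Returns a 3x3 rotation matrix whose rows are the frame axes.
    """
    x = occlusal_pt - origin            # anterior-posterior axis in the occlusal plane
    x = x / np.linalg.norm(x)
    v = midsagittal_pt - origin
    z = v - np.dot(v, x) * x            # superior-inferior axis, orthogonalized against x
    z = z / np.linalg.norm(z)
    y = np.cross(z, x)                  # lateral axis completes a right-handed frame
    return np.stack([x, y, z])

def to_articulatory_space(points, origin, R):
    """Express head-corrected sensor positions (N, 3) in the subject frame."""
    return (points - origin) @ R.T
```

In this frame, sensor coordinates become directly comparable across subjects regardless of how each subject sat in the field generator's workspace.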

    Coupling between the laryngeal and supralaryngeal systems

    Thesis (B.Sc.)--University of Hong Kong, 2010. A dissertation submitted in partial fulfillment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2010. Includes bibliographical references (p. 27-30). The present study investigated the coupling between the laryngeal and supralaryngeal systems in speech production. The interrelationship between the two systems was examined by studying the possible interaction between tone production (the laryngeal system) and articulation (the supralaryngeal system). Sixty (30 male and 30 female) native Cantonese speakers participated in the study. The first and second formant frequencies (F1 and F2) of the four vowels /i, u, ?, ?/ produced at the six Cantonese lexical tones (high-level, high-rising, mid-level, low-falling, low-rising, and low-level) were obtained. Results revealed that, regardless of vowel, significant articulatory changes occurred when vowels were produced at different tones. However, the pattern of differences was not systematic across vowels. A gender difference was also noted: male and female speakers showed different patterns of articulatory change. These findings reveal a coupling effect between the laryngeal and supralaryngeal systems.
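    F1 and F2 measurements of this kind are typically obtained by linear predictive coding (LPC). A minimal sketch of LPC-based formant estimation for a single analysis frame (a generic textbook method, not the study's actual analysis settings):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def lpc_formants(frame, sr, order=12):
    """Estimate formant frequencies (Hz) from one speech frame via LPC.

    Fits predictor coefficients from the autocorrelation sequence, then
    converts the complex roots of the prediction polynomial to frequencies.
    """
    frame = frame * np.hamming(len(frame))
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    a = solve_toeplitz(r[:order], r[1:order + 1])    # forward predictor coefficients
    roots = np.roots(np.concatenate(([1.0], -a)))    # A(z) = 1 - sum a_k z^-k
    roots = roots[np.imag(roots) > 0]                # one of each conjugate pair
    freqs = np.sort(np.angle(roots) * sr / (2 * np.pi))
    return freqs[freqs > 90]                         # discard near-DC roots
```

The first two returned frequencies approximate F1 and F2; tracking them across tones is what reveals the articulatory differences reported above.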

    Quantification of vocal tract configuration of laryngectomees by acoustic reflection technology (ART)

    This study compared the vocal tract configuration, including length and volume, of alaryngeal and laryngeal speakers. Thirty alaryngeal speakers and 30 laryngeal speakers were recruited. Pharyngometry, an acoustic reflection technology (ART), was used to measure the vocal tract parameters of the participants. Results showed no significant difference in vocal tract length or volume between the alaryngeal and laryngeal speakers. This finding suggests that differences in formant frequencies during vowel production by alaryngeal and laryngeal speakers may be due to factors other than vocal tract configuration, and that the independence of the source and the filter (Fant, 1960; Pickett, 1999) may not hold for alaryngeal speakers.
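    The group comparison reported here is a standard two-sample test on per-speaker vocal tract measures. An illustrative sketch with simulated numbers (the values below are invented for demonstration only; the study's actual measurements are not reproduced here):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical per-speaker vocal tract lengths (cm), 30 per group.
alaryngeal_vtl = rng.normal(17.0, 1.2, 30)
laryngeal_vtl = rng.normal(17.1, 1.2, 30)

# Welch's t-test: compare group means without assuming equal variances.
t, p = stats.ttest_ind(alaryngeal_vtl, laryngeal_vtl, equal_var=False)
print(f"Welch t = {t:.2f}, p = {p:.3f}")
# A p-value above the chosen alpha (e.g. 0.05) is consistent with the
# study's finding of no significant length difference between groups.
```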

    Speaker Independent Acoustic-to-Articulatory Inversion

    Acoustic-to-articulatory inversion, the determination of articulatory parameters from acoustic signals, is a difficult but important problem for many speech processing applications, such as automatic speech recognition (ASR) and computer-aided pronunciation training (CAPT). In recent years, several approaches have been successfully implemented for speaker-dependent models with parallel acoustic and kinematic training data. In many practical applications, however, inversion is needed for new speakers for whom no articulatory data are available. To address this problem, this dissertation introduces a novel speaker adaptation approach called Parallel Reference Speaker Weighting (PRSW), based on parallel acoustic and articulatory Hidden Markov Models (HMMs). The approach combines a robust normalized articulatory space and palate-referenced articulatory features with speaker-weighted adaptation to form an inversion mapping for new speakers that can accurately estimate articulatory trajectories. The proposed PRSW method is evaluated on the newly collected Marquette Electromagnetic Articulography Mandarin Accented English (EMA-MAE) corpus using 20 native English speakers. Cross-speaker inversion results show that, given a good selection of reference speakers with consistent acoustic and articulatory patterns, PRSW achieves good speaker-independent inversion performance even without kinematic training data.
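    At its core, reference speaker weighting builds a new speaker's model as a weighted combination of reference speakers' models, with the weights estimated from the new speaker's acoustics alone. A heavily simplified sketch of that idea over mean feature vectors (the real PRSW operates on HMM distribution parameters; the unconstrained least-squares step with clipping here is a crude stand-in for the actual constrained estimation):

```python
import numpy as np

def prsw_weights(target_acoustic, ref_acoustic):
    """Estimate per-reference-speaker weights (simplified sketch).

    target_acoustic: (d,) mean acoustic feature vector of the new speaker.
    ref_acoustic: (K, d) mean acoustic feature vectors of K reference speakers.
    Solves a least-squares combination, then clips and renormalizes so the
    weights are nonnegative and sum to one.
    """
    w, *_ = np.linalg.lstsq(ref_acoustic.T, target_acoustic, rcond=None)
    w = np.clip(w, 0.0, None)
    return w / w.sum()

def predict_articulatory(weights, ref_articulatory):
    """Apply the acoustically estimated weights to the references'
    articulatory means, yielding an articulatory model for the new
    speaker without any kinematic data from that speaker."""
    return weights @ ref_articulatory   # (K,) @ (K, m) -> (m,)
```

The key property, mirrored in the abstract's conclusion, is that only acoustic data from the new speaker enter the weight estimation; the articulatory side comes entirely from the references.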

    The articulatory and acoustic characteristics of Polish sibilants and their consequences for diachronic change

    The study is concerned with the relative synchronic stability of the three contrastive sibilant fricatives /s ʂ ɕ/ in Polish. Tongue movement data were collected from nine first-language Polish speakers producing symmetrical real and non-word CVCV sequences in three vowel contexts. A Gaussian model was used to classify the sibilants from spectral information in the noise and from formant frequencies at vowel onset. The physiological analysis showed an almost complete separation among /s ʂ ɕ/ on tongue-tip parameters. The acoustic analysis showed that greater energy at higher frequencies in the fricative noise distinguished /s/ from the other two sibilant categories. The most salient information at vowel onset was for /ɕ/, which also had a strong palatalizing effect on the following vowel. Whereas either the noise or the vowel onset was largely sufficient for the identification of /s/ and /ɕ/ respectively, both sets of cues were necessary to separate /ʂ/ from /s ɕ/. The greater synchronic instability of /ʂ/ may derive from its high articulatory complexity coupled with its comparatively low acoustic salience. The data also suggest that the relatively late acquisition of /ʂ/ by children may come about because of the weak acoustic information in the vowel for its distinction from /s/.
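    A Gaussian classifier of the kind described fits one Gaussian per sibilant category to acoustic features and assigns each token to the category with the highest log-likelihood. A minimal sketch with synthetic two-dimensional features (the class labels and cluster positions are illustrative only, not the study's measurements):

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_gaussians(X, y):
    """Fit one full-covariance Gaussian (mean, cov) per class label."""
    return {c: (X[y == c].mean(axis=0), np.cov(X[y == c].T))
            for c in np.unique(y)}

def classify(models, x):
    """Assign x to the class with the highest Gaussian log-likelihood."""
    return max(models, key=lambda c: multivariate_normal.logpdf(x, *models[c]))

# Synthetic feature clusters standing in for, e.g., spectral centroid
# and an onset-formant measure for three sibilant categories.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.3, (50, 2)) for m in ([0, 0], [3, 0], [0, 3])])
y = np.repeat(np.array(["s", "sz", "si"]), 50)
models = fit_gaussians(X, y)
```

Classification accuracy from such a model, computed separately on noise features and on vowel-onset features, is what quantifies the acoustic salience of each contrast.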

    A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images

    Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is, however, limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. Imaging the rapidly moving articulators and dynamic airway shaping during speech demands high spatio-temporal resolution and robust reconstruction methods. Further, while reconstructed images have been published, to date there is no open dataset providing raw multi-coil RT-MRI data from an optimized speech production experimental setup. Such datasets could enable new and improved methods for dynamic image reconstruction, artifact correction, feature extraction, and direct extraction of linguistically relevant biomarkers. The present dataset offers a unique corpus of 2D sagittal-view RT-MRI videos with synchronized audio for 75 subjects performing linguistically motivated speech tasks, alongside the corresponding first-ever public-domain raw RT-MRI data. The dataset also includes 3D volumetric vocal tract MRI during sustained speech sounds and high-resolution static anatomical T2-weighted upper airway MRI for each subject. Comment: 27 pages, 6 figures, 5 tables; submitted to Nature Scientific Data.