282 research outputs found

    Intraspeaker Comparisons of Acoustic and Articulatory Variability in American English /r/ Productions

    Full text link
    The purpose of this report is to test the hypothesis that speakers utilize an acoustic, rather than articulatory, planning space for speech production. It has been well-documented that many speakers of American English use different tongue configurations to produce /r/ in different phonetic contexts. The acoustic planning hypothesis suggests that although the /r/ configuration varies widely in different contexts, the primary acoustic cue for /r/, a dip in the F3 trajectory, will be less variable due to tradeoffs in articulatory variability, or trading relations, that help maintain a relatively constant F3 trajectory across phonetic contexts. Acoustic data and EMMA articulatory data from seven speakers producing /r/ in different phonetic contexts were analyzed. Visual inspection of the EMMA data at the point of F3 minimum revealed that each speaker appeared to use at least two of three trading relation strategies that would be expected to reduce F3 variability. Articulatory covariance measures confirmed that all seven speakers utilized a trading relation between tongue back height and tongue back horizontal position, six speakers utilized a trading relation between tongue tip height and tongue back height, and the speaker who did not use this latter strategy instead utilized a trading relation between tongue tip height and tongue back horizontal position. Estimates of F3 variability with and without the articulatory covariances indicated that F3 variability would be much higher for all speakers if the articulatory covariances were not utilized. These conclusions were further supported by a comparison of measured F3 variability to F3 variabilities estimated from the pellet data with and without articulatory covariances. In all subjects, the actual F3 variance was significantly lower than the F3 variance estimated without articulatory covariances, further supporting the conclusion that the articulatory trading relations were being used to reduce F3 variability.
Together, these results strongly suggest that the neural control mechanisms underlying speech production make elegant use of trading relations between articulators to maintain a relatively invariant acoustic trace for /r/ across phonetic contexts.
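The variance comparison described above can be illustrated numerically: when an acoustic cue depends on several articulators, negatively covarying articulators (trading relations) reduce the cue's variance relative to an estimate that zeroes out the covariances. The following is a minimal sketch with made-up articulator covariances and made-up F3 sensitivity weights, not values from the study:

```python
import numpy as np

# Hypothetical articulator dimensions (tongue tip height, tongue back
# height, tongue back horizontal position) with a built-in trading
# relation: the last two covary negatively. All numbers are illustrative.
rng = np.random.default_rng(0)
cov_true = np.array([[1.0,  0.1,  0.1],
                     [0.1,  1.0, -0.8],
                     [0.1, -0.8,  1.0]])
X = rng.multivariate_normal(np.zeros(3), cov_true, size=500)

# Assumed local sensitivities of F3 to each articulator (Hz per unit).
w = np.array([40.0, 60.0, 55.0])

C = np.cov(X, rowvar=False)
var_with_cov = w @ C @ w                        # full covariance propagation
var_without_cov = w @ np.diag(np.diag(C)) @ w   # covariances zeroed out

# The negative covariance term makes the full estimate much smaller,
# mirroring the paper's with/without-covariance comparison.
print(var_with_cov, var_without_cov)
```

Under a first-order (linear) approximation, dropping the off-diagonal covariance terms is exactly the "without articulatory covariances" estimate the abstract describes.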

    Analyzing liquids

    Get PDF

    Reading aloud begins when the computation of phonology is complete.

    Get PDF

    Back from the future: Nonlinear anticipation in adults' and children's speech

    Get PDF
    Purpose: This study examines the temporal organization of vocalic anticipation in German children from 3 to 7 years of age and in adults. The main objective was to test for non-linear processes in vocalic anticipation, which may result from the interaction between lingual gestural goals for individual vowels and those for their neighbors over time. Method: Ultrasound imaging was employed to record tongue movement at five time points throughout short utterances of the form V1#CV2. Vocalic anticipation was examined with Generalized Additive Modeling, an analytical approach allowing for the estimation of both linear and non-linear influences on anticipatory processes. Results: Both adults and children exhibit non-linear patterns of vocalic anticipation over time, with the degree and extent of vocalic anticipation varying as a function of the individual consonants and vowels assembled. However, noticeable developmental discrepancies were found, with vocalic anticipation being present earlier in children's utterances at 3-5 years of age in comparison to adults and, to some extent, 7-year-old children. Conclusions: A narrowing of speech production organization, from large chunks in kindergarten to more contextually specified organizations, seems to occur from kindergarten to primary school to adulthood, although variation in the temporal overlap of lingual gestures for consecutive segments is already present in the youngest cohorts. In adults, non-linear anticipatory patterns over time suggest a strong differentiation between the gestural goals for consecutive segments. In children, this differentiation is not yet mature: vowels show greater prominence over time and seem activated more in phase with those of previous segments relative to adults.

    Vowel Production in Mandarin Accented English and American English: Kinematic and Acoustic Data from the Marquette University Mandarin Accented English Corpus

    Get PDF
    Few electromagnetic articulography (EMA) datasets are publicly available, and none have focused systematically on non-native accented speech. We introduce a kinematic-acoustic database of speech from 40 (gender and dialect balanced) participants producing upper-Midwestern American English (AE) L1 or Mandarin Accented English (MAE) L2 (Beijing or Shanghai dialect base). The Marquette University EMA-MAE corpus will be released publicly to help advance research in areas such as pronunciation modeling, acoustic-articulatory inversion, L1-L2 comparisons, pronunciation error detection, and accent modification training. EMA data were collected at a 400 Hz sampling rate with synchronous audio using the NDI Wave System. Articulatory sensors were placed on the midsagittal lips, lower incisors, and tongue blade and dorsum, as well as on the lip corner and lateral tongue body. Sensors provide five degree-of-freedom measurements including three-dimensional sensor position and two-dimensional orientation (pitch and roll). In the current work we analyze kinematic and acoustic variability between L1 and L2 vowels. We address the hypothesis that MAE is characterized by larger differences in the articulation of back vowels than front vowels and smaller vowel spaces compared to AE. The current results provide a seminal comparison of the kinematics and acoustics of vowel production between MAE and AE speakers
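One standard way to quantify the "smaller vowel spaces" hypothesis mentioned above is to compute the area of the convex hull of each speaker group's mean (F1, F2) vowel positions. This is a minimal sketch with invented formant values, not measurements from the EMA-MAE corpus:

```python
import numpy as np
from scipy.spatial import ConvexHull

# Hypothetical mean (F1, F2) values in Hz for four corner vowels of two
# illustrative speaker groups -- made-up numbers for demonstration only.
ae_vowels = np.array([[300, 2300],   # /i/
                      [650, 1800],   # /ae/
                      [700, 1100],   # /a/
                      [350,  900]])  # /u/
mae_vowels = np.array([[350, 2100],
                       [600, 1700],
                       [650, 1200],
                       [400, 1000]])

def vowel_space_area(formants):
    """Area of the convex hull of (F1, F2) points, a common vowel-space metric."""
    # For 2-D input, ConvexHull.volume is the polygon area (.area is the perimeter).
    return ConvexHull(formants).volume

print(vowel_space_area(ae_vowels), vowel_space_area(mae_vowels))
```

Comparing hull areas between L1 and L2 groups gives a single scalar per speaker that can then be tested statistically.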

    Vowel nasalization in German

    Get PDF

    Articulatory Tradeoffs Reduce Acoustic Variability During American English /r/ Production

    Full text link
    Acoustic and articulatory recordings reveal that speakers utilize systematic articulatory tradeoffs to maintain acoustic stability when producing the phoneme /r/. Distinct articulator configurations used to produce /r/ in various phonetic contexts show systematic tradeoffs between the cross-sectional areas of different vocal tract sections. Analysis of acoustic and articulatory variabilities reveals that these tradeoffs act to reduce acoustic variability, thus allowing large contextual variations in vocal tract shape; these contextual variations in turn apparently reduce the amount of articulatory movement required. These findings contrast with the widely held view that speaking involves a canonical vocal tract shape target for each phoneme.
    National Institute on Deafness and Other Communication Disorders (1R29-DC02852-02, 5R01-DC01925-04, 1R03-C2576-0l); National Science Foundation (IRI-9310518)

    ARTICULATORY INFORMATION FOR ROBUST SPEECH RECOGNITION

    Get PDF
    Current Automatic Speech Recognition (ASR) systems fail to perform nearly as well as humans do, due to their lack of robustness against speech variability and noise contamination. The goal of this dissertation is to investigate these critical robustness issues, put forth different ways to address them, and finally present an ASR architecture based upon these robustness criteria. Acoustic variations adversely affect the performance of current phone-based ASR systems, in which speech is modeled as `beads-on-a-string', where the beads are the individual phone units. While phone units are distinctive in the cognitive domain, they vary in the physical domain, and their variation occurs due to a combination of factors including speech style and speaking rate; a phenomenon commonly known as `coarticulation'. Traditional ASR systems address such coarticulatory variations by using contextualized phone units such as triphones. Articulatory phonology accounts for coarticulatory variations by modeling speech as a constellation of constricting actions known as articulatory gestures. In such a framework, speech variations such as coarticulation and lenition are accounted for by gestural overlap in time and gestural reduction in space. To realize a gesture-based ASR system, articulatory gestures have to be inferred from the acoustic signal. At the initial stage of this research, a study was performed using synthetically generated speech to obtain a proof-of-concept that articulatory gestures can indeed be recognized from the speech signal. It was observed that having vocal tract constriction trajectories (TVs) as an intermediate representation facilitated the gesture recognition task from the speech signal. Presently no natural speech database contains articulatory gesture annotation; hence an automated iterative time-warping architecture is proposed that can annotate any natural speech database with articulatory gestures and TVs.
Two natural speech databases, X-ray microbeam and Aurora-2, were annotated, where the former was used to train a TV estimator and the latter was used to train a Dynamic Bayesian Network (DBN) based ASR architecture. The DBN architecture used two sets of observations: (a) acoustic features in the form of mel-frequency cepstral coefficients (MFCCs) and (b) TVs (estimated from the acoustic speech signal). In this setup the articulatory gestures were modeled as hidden random variables, hence eliminating the necessity for explicit gesture recognition. Word recognition results using the DBN architecture indicate that articulatory representations not only can help to account for coarticulatory variations but can also significantly improve the noise robustness of the ASR system.
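The core idea of treating articulatory state as hidden while observing synchronized acoustic and articulatory feature streams can be sketched in a much-simplified form: concatenate the per-frame MFCC and TV features into one observation vector and decode the hidden state sequence with Viterbi. This is a toy HMM stand-in with invented numbers, not the dissertation's DBN:

```python
import numpy as np

# Toy setup: 2 hidden "gestural" states, each emitting a 3-dimensional
# observation vector [mfcc1, mfcc2, tv] from a spherical Gaussian.
# All parameters are made up for illustration.
rng = np.random.default_rng(1)
n_states = 2
means = np.array([[0.0, 0.0, -1.0],   # state 0
                  [3.0, 3.0,  1.0]])  # state 1
var = 0.25

# Simulate a frame sequence: state 0 for 10 frames, then state 1 for 10.
true_states = np.array([0] * 10 + [1] * 10)
obs = means[true_states] + rng.normal(0, np.sqrt(var), size=(20, 3))

log_trans = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
log_init = np.log(np.array([0.5, 0.5]))

def log_emission(x):
    # Spherical Gaussian log-likelihood per state (additive constants dropped).
    return -np.sum((x - means) ** 2, axis=1) / (2 * var)

# Viterbi decoding over the concatenated acoustic+articulatory observations.
T = len(obs)
delta = np.zeros((T, n_states))
back = np.zeros((T, n_states), dtype=int)
delta[0] = log_init + log_emission(obs[0])
for t in range(1, T):
    scores = delta[t - 1][:, None] + log_trans   # scores[i, j]: from i to j
    back[t] = np.argmax(scores, axis=0)
    delta[t] = scores[back[t], np.arange(n_states)] + log_emission(obs[t])

path = np.zeros(T, dtype=int)
path[-1] = np.argmax(delta[-1])
for t in range(T - 2, -1, -1):
    path[t] = back[t + 1, path[t + 1]]

print((path == true_states).mean())  # fraction of frames decoded correctly
```

In the actual architecture the hidden layer is a richer DBN over gesture variables rather than a single HMM state, but the principle, inferring hidden articulatory structure from joint acoustic and TV observations, is the same.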

    The interaction between articulation and tones in Cantonese

    Get PDF
    "A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2009."Thesis (B.Sc)--University of Hong Kong, 2009.Includes bibliographical references (p. 27-30).published_or_final_versionSpeech and Hearing SciencesBachelorBachelor of Science in Speech and Hearing Science