948 research outputs found

    A computational model for studying L1’s effect on L2 speech learning

    Much evidence has shown that the first language (L1) plays an important role in the formation of the L2 phonological system during second language (L2) learning. Combined with the fact that different L1s have distinct phonological patterns, this points to diverse L2 speech learning outcomes for speakers from different L1 backgrounds. This dissertation hypothesizes that phonological distances between accented speech and speakers' L1 speech are also correlated with perceived accentedness, and that the correlations are negative for some phonological properties. Moreover, contrastive phonological distinctions between the L1s and the L2 will manifest themselves in the accented speech produced by speakers from these L1s. To test these hypotheses, this study develops a computational model to analyze accented speech properties in both segmental (short-term speech measurements at the short-segment or phoneme level) and suprasegmental (long-term speech measurements at the word, long-segment, or sentence level) feature spaces. The benefit of using a computational model is that it enables quantitative analysis of the L1's effect on accent in terms of different phonological properties. The core parts of this model are feature extraction schemes that derive pronunciation and prosody representations of accented speech using existing speech processing techniques. Correlation analysis in both the segmental and suprasegmental feature spaces examines the relationship between L1-related acoustic measurements and perceived accentedness across several L1s. Multiple regression analysis is employed to investigate how the L1's effect impacts the perception of foreign accent, and how accented speech produced by speakers from different L1s behaves distinctly in the segmental and suprasegmental feature spaces. 
Results reveal the potential of this methodology to provide quantitative analysis of accented speech and to extend current studies in L2 speech learning theory to a large scale. Practically, the study further shows that the proposed computational model can benefit automatic accentedness evaluation systems by adding features related to speakers' L1s. Dissertation/Thesis; Doctoral Dissertation; Speech and Hearing Science; 201
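The correlation analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes one distance-like acoustic measurement per speaker and one mean accentedness rating per speaker; the function name `pearson_r`, the variable names, and all numbers are hypothetical toy values, not data or code from the dissertation, and the dissertation may use a different correlation measure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Synthetic toy values (NOT results from the dissertation):
# a hypothetical per-speaker distance measurement and a hypothetical
# per-speaker accentedness rating.
l1_distance = [0.9, 0.7, 0.6, 0.4, 0.2]
accentedness = [2.0, 3.0, 3.5, 5.0, 6.5]

r = pearson_r(l1_distance, accentedness)
print(f"Pearson r = {r:.3f}")  # negative for these toy values
```

A negative `r` here only reflects how the toy values were chosen; it says nothing about the dissertation's actual findings.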

    Development of Kinematic Templates for Automatic Pronunciation Assessment Using Acoustic-to-Articulatory Inversion

    Computer-aided pronunciation training (CAPT) is a subcategory of computer-aided language learning (CALL) that deals with the correction of mispronunciation during language learning. For a CAPT system to be effective, it must provide useful and informative feedback that is comprehensive, qualitative, quantitative, and corrective. While the majority of modern systems address the first three aspects of feedback, most do not provide corrective feedback. As part of the National Science Foundation (NSF) funded study “RI: Small: Speaker Independent Acoustic-Articulator Inversion for Pronunciation Assessment”, the Marquette Speech and Swallowing Lab and the Marquette Speech and Signal Processing Lab are conducting a pilot study on the feasibility of using acoustic-to-articulatory inversion for CAPT. Evaluating the results of a speaker’s acoustic-to-articulatory inversion to determine pronunciation accuracy requires kinematic templates. The templates would represent the vowels, consonant clusters, and stress characteristics of a typical American English (AE) speaker in the midsagittal plane. The Marquette University electromagnetic articulography Mandarin-accented English (EMA-MAE) database, which contains acoustic and kinematic speech data for 40 speakers (20 of whom are native AE speakers), provides the data used to form the kinematic templates. The objective of this work is the development and implementation of these templates. The data provided in the EMA-MAE database are analyzed in detail, and the information obtained from the analysis is used to develop the kinematic templates. The vowel templates are designed as sets of concentric confidence ellipses, which specify (in the midsagittal plane) the ranges of tongue and lip positions corresponding to correct pronunciation. These ranges were defined using the typical articulator positioning of all English speakers in the EMA-MAE database. 
The data from these English speakers were also used to model the magnitude, speed history, movement pattern, and duration (MSTD) features of each consonant cluster in the EMA-MAE corpus. Cluster templates were designed as sets of average MSTD parameters across English speakers for each cluster. Finally, English stress characteristics were similarly modeled as a set of average magnitude, speed, and duration parameters across English speakers. The kinematic templates developed in this work, while still in early stages, form the groundwork for assessing the features returned by the acoustic-to-articulatory inversion system. This in turn allows for assessment of articulatory inversion as a pronunciation training tool.
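The idea of concentric confidence ellipses can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: it assumes sensor positions in the midsagittal plane are roughly bivariate Gaussian, so that a confidence ellipse is a contour of constant squared Mahalanobis distance, with the threshold taken from the chi-square distribution with two degrees of freedom (closed form: -2 ln(1 - p)). The function names, the confidence levels, and the coordinates are all hypothetical.

```python
import math


def fit_ellipse(points):
    """Return the mean and 2x2 sample covariance (sxx, sxy, syy) of (x, y) points."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least 3 points")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance of a 2D point from the fitted distribution."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is not positive definite")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # Uses the closed-form inverse of a symmetric 2x2 matrix.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def innermost_level(point, mean, cov, levels=(0.5, 0.9, 0.99)):
    """Return the smallest confidence level whose ellipse contains the point, or None."""
    d2 = mahalanobis_sq(point, mean, cov)
    for p in sorted(levels):
        # Chi-square quantile for 2 degrees of freedom: -2 * ln(1 - p).
        if d2 <= -2.0 * math.log(1.0 - p):
            return p
    return None


# Hypothetical sensor positions (arbitrary units), not EMA-MAE data.
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
mean, cov = fit_ellipse(reference)

print(innermost_level((1.0, 1.0), mean, cov))
print(innermost_level((2.0, 2.0), mean, cov))
print(innermost_level((10.0, 10.0), mean, cov))
```

In this sketch a learner's articulator position that falls inside an inner ellipse would count as closer to typical positioning than one that only falls inside an outer ellipse; how the thesis maps ellipses to feedback is not specified in the abstract.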

    From communicative functions to prosodic forms

    This is a proposal in favour of proceeding from communicative function to linguistic form, rather than the reverse, for an insightful account of how humans communicate by speech in languages. A functional framework is developed that encompasses argumentation structures, declarative and interrogative functions, and expressive intensification. Such a function orientation can become a powerful tool in comparative prosodic research across the world's languages. The potential of this approach is shown by comparing the prosodic form of Mandarin Chinese data collected in functionally contextualized scenarios with corresponding data from English and German.

    The Electromagnetic Articulography Mandarin Accented English (EMA-MAE) Corpus of Acoustic and 3D Articulatory Kinematic Data

    There is a significant need for more comprehensive electromagnetic articulography (EMA) datasets that can provide matched acoustic and articulatory kinematic data with good spatial and temporal resolution. The Marquette University Electromagnetic Articulography Mandarin Accented English (EMA-MAE) corpus provides kinematic and acoustic data from 40 gender- and dialect-balanced speakers: 20 Midwestern standard American English L1 speakers and 20 Mandarin Accented English (MAE) L2 speakers, half of whom speak a Beijing region dialect and half a Shanghai region dialect. Three-dimensional EMA data were collected at a 400 Hz sampling rate using the NDI Wave system, with articulatory sensors on the midsagittal lips, lower incisors, tongue blade and dorsum, plus lateral lip corner and tongue body. Sensors provide three-dimensional position data as well as two-dimensional orientation data representing the orientation of the sensor plane. Data have been corrected for head movement relative to a fixed reference sensor and also adjusted using a biteplate calibration system to place the data in an articulatory working space relative to each subject's individual midsagittal and maxillary occlusal planes. Speech materials include isolated words chosen to focus on specific contrasts between the English and Mandarin languages, as well as sentences and paragraphs for continuous speech, totaling approximately 45 minutes of data per subject. A beta version of the EMA-MAE corpus is now available, and the full corpus is in preparation for public release to help advance research in areas such as pronunciation modeling, acoustic-articulatory inversion, L1-L2 comparisons, pronunciation error detection, and accent modification training.

    Topics in the Mandarin Lian...dou construction: its syntax and acquisition

    This dissertation investigates the Mandarin lian…dou construction (roughly the equivalent of the English even-construction) from two perspectives: its syntax and acquisition. The research questions pursued in this dissertation are: 1) What drives the syntactic movement in the lian…dou construction? 2) How does the even-like interpretation arise in the lian…dou construction? and 3) What constitutes children’s knowledge of the implicatures of the lian…dou construction? With respect to the first question, following Fanselow & Lenertová (2011), it is proposed that the movement in the lian…dou construction is driven by an unselective edge feature of CP (Chomsky, 2008) and is subject to the locality constraint of accentuation, which bans the movement of a phrase with structural accent across another phrase with the same type of accent. The advantage of this proposal is that it can explain the partial focus movement seen in unusual lian…dou sentences where the VP or the whole clause is the focus. Furthermore, I propose that dou is an adverb above TP. Lian selects an EdgeP. The Edge head has the unselective edge feature that picks and moves a constituent to SpecEdgeP. With respect to the second question, under Y. Xiang’s (2019) analysis, the alternatives for dou can be construed in two ways, either in terms of likelihood or in terms of innocent excludability (Fox, 2007). In the former case, dou becomes a mirative marker (or at least it expresses relative surprise; the prejacent is said to be less likely than some alternative). But Y. Xiang does not provide a method for choosing between these options. The mirativity corresponding to the unselective edge feature, I argue, is what disambiguates dou in favor of its mirative instantiation, forcing the alternatives to be construed in terms of likelihood. 
As for children’s knowledge of the two meaning components of the lian…dou construction, namely the existential implication (that alternatives exist) and the scalar implication (that the mentioned alternative is the least likely) (Karttunen & Peters, 1979), the results of an experimental study show that even 6-year-old children were generally not able to compute either of them. It is proposed that children’s failure with the meaning components of lian…dou was due to their limited cognitive resources and the excessive task demands.

    Spectral and temporal features of tense-lax vowel contrast produced by Cantonese speakers of English: a comparative study

    A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2007. Also available in print. Thesis (B.Sc)--University of Hong Kong, 2007.

    Prosodic Focus Within and Across Languages

    The fact that purely prosodic marking of focus may be weaker in some languages than in others, and that it varies in certain circumstances even within a single language, has not been commonly recognized. Therefore, this dissertation investigated whether and how purely prosodic marking of focus varies within and across languages. We conducted production and perception experiments using a paradigm of 10-digit phone-number strings in which the same material and discourse contexts were used in different languages. The results demonstrated that prosodic marking of focus varied across languages. Speakers of American English, Mandarin Chinese, and Standard French clearly modulated duration, pitch, and intensity to indicate the position of corrective focus. Listeners of these languages recognized the focus position with high accuracy. Conversely, speakers of Seoul Korean, South Kyungsang Korean, Tokyo Japanese, and Suzhou Wu produced a weak and ambiguous modulation by focus, resulting in poor identification performance. This dissertation also revealed that prosodic marking of focus varied even within a single language. In Mandarin Chinese, a focused low/dipping tone (tone 3) received a relatively poor identification rate compared to other focused tones (about 77% vs. 91%). This lower identification performance was due to the smaller capacity of tone 3 for pitch range expansion and local dissimilatory effects around tone 3 focus. In Seoul Korean, prosodic marking of focus differed based on the tonal contrast (post-lexical low vs. high tones). The identification rate of high tones was about twice that of low tones (about 51% vs. 24%), the reason being that low tones had a smaller capacity for pitch range expansion than high tones. All things considered, this dissertation demonstrates that prosodic focus is not always expressed by concomitant increased duration, pitch, and intensity. 
Accordingly, purely prosodic marking of focus is neither completely universal nor automatic, but rather is expressed through the prosodic structure of each language. Since the striking difference in focus-marking success does not seem to be determined by any previously described typological feature, this must be regarded as an indicator of a new typological dimension, or as a function of a new typological space.