404 research outputs found

    The relation between acoustic and articulatory variation in vowels : data from American and Australian English

    Get PDF
    In studies of dialect variation, the articulatory nature of vowels is sometimes inferred from formant values using the following heuristic: F1 is inversely correlated with tongue height and F2 is inversely correlated with tongue backness. This study compared vowel formants and corresponding lingual articulation in two dialects of English, standard North American English and Australian English. Five speakers of North American English and four speakers of Australian English were recorded producing multiple repetitions of ten monophthongs embedded in the /sVd/ context. Simultaneous articulatory data were collected using electromagnetic articulography. Results show that there are significant correlations between tongue position and formants in the direction predicted by the heuristic but also that the relations implied by the heuristic break down under specific conditions. Articulatory vowel spaces, based on tongue dorsum (TD) position, and acoustic vowel spaces, based on formants, show systematic misalignment due in part to the influence of other articulatory factors, including lip rounding and tongue curvature on formant values. Incorporating these dimensions into our dialect comparison yields a richer description and a more robust understanding of how vowel formant patterns are reproduced within and across dialects

    Categoriality and continuity in prosodic prominence

    Get PDF
    Prosody has been characterised as a "half-tamed savage" being shaped by both discrete, categorical aspects as well as gradient, continuous phenomena. This book is concerned with the relation of the "wild" and the "tamed" sides of prosodic prominence. It reviews problems that arise from a strict separation of categorical and continuous representations in models of phonetics and phonology, and it explores the potential role of descriptions aimed at reconciling the two domains. In doing so, the book offers an introduction to dynamical systems, a framework that has been studied extensively in the last decades to model speech production and perception. The reported acoustic and articulatory data presented in this book show that categorical and continuous modulations used to enhance prosodic prominence are deeply intertwined and even exhibit a kind of symbiosis. A multi-dimensional dynamical model of prosodic prominence is sketched, based on the empirical data, combining tonal and articulatory aspects of prosodic focus marking. The model demonstrates how categorical and continuous aspects can be inte- grated in a joint theoretical treatment that overcomes a strict separation of phonetics and phonology

    Introduction to Theoretical Phonetics of English

    Get PDF
    The study guide contains educational material on the main topics of the course of theoretical phonetics of English: the sound structure of the language and the ways of its description and analysis; features of the modern pronunciation norm of English as a polyethnic formation and its national and regional variants; sounds of English as articulatory and functional units; syllable as a phonetic and phonological unit, word emphasis; prosodic arrangement of English language. Questions and practical tasks for each unit provide an opportunity for self-study of educational material. Meant for students, graduate students, teachers, and all interested in learning English

    Vowel acquisition in a multidialectal environment: A five-year longitudinal case study

    Get PDF
    What happens when a child is exposed to multiple phonological systems while they are acquiring language? How do they resolve contradictory patterns in the accents around them in their own developing speech production? Do they acquire the accent of the local community, their parents’ accent, or something in between? This thesis examines the acquisition of a subset of vowels in a child growing up in a multidialectal environment. The child’s realisations of vowels in the lexical sets STRUT, FOOT, START, PALM and BATH are analysed between the ages of 2;01 and 6;11. Previous research has shown that while a child’s accent is usually heavily influenced by their peers, having parents from outside the local area can prevent complete acquisition of an accent. Local cultural values, whether or not a parent’s accent has more prestigious elements than the local one, a child’s personality, and the complexity of the relationship between the home and local phonological systems have all been implicated in whether or not a child fully acquires a local accent. In the child studied here, a shift from the vowels used at home to local variants always happened at the level of articulatory feature, rather than at phonemic level, in the first instance, and vowels belonging to different lexical sets were acquired at different rates. This thesis demonstrates that acquisition of these vowels takes many years, as combinations of articulatory features stabilise. Moreover, even once a local variant has apparently been acquired, the variety of language spoken at home can leave a phonetic legacy in a child’s accent. Naturalistic data collection combined with impressionistic and acoustic analysis in conjunction with a long and sustained data collection period reveals patterns in this child’s phonological acquisition not seen in any previous research in this detail

    Individual Differences in Speech Production and Perception

    Get PDF
    Inter-individual variation in speech is a topic of increasing interest both in human sciences and speech technology. It can yield important insights into biological, cognitive, communicative, and social aspects of language. Written by specialists in psycholinguistics, phonetics, speech development, speech perception and speech technology, this volume presents experimental and modeling studies that provide the reader with a deep understanding of interspeaker variability and its role in speech processing, speech development, and interspeaker interactions. It discusses how theoretical models take into account individual behavior, explains why interspeaker variability enriches speech communication, and summarizes the limitations of the use of speaker information in forensics

    Recognizing Speech in a Novel Accent: The Motor Theory of Speech Perception Reframed

    Get PDF
    The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstruction of the speech gestures of the speaker rather than the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native language repertoire of the listener. This, on average, improves the recognition of later words. This model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serve as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture to revisits claims for and against the motor theory of speech perception and the relevance of mirror neurons, and extracts some implications for the reframing of the motor theory

    Magnetic resonance imaging of the brain and vocal tract:Applications to the study of speech production and language learning

    Get PDF
    The human vocal system is highly plastic, allowing for the flexible expression of language, mood and intentions. However, this plasticity is not stable throughout the life span, and it is well documented that adult learners encounter greater difficulty than children in acquiring the sounds of foreign languages. Researchers have used magnetic resonance imaging (MRI) to interrogate the neural substrates of vocal imitation and learning, and the correlates of individual differences in phonetic “talent”. In parallel, a growing body of work using MR technology to directly image the vocal tract in real time during speech has offered primarily descriptive accounts of phonetic variation within and across languages. In this paper, we review the contribution of neural MRI to our understanding of vocal learning, and give an overview of vocal tract imaging and its potential to inform the field. We propose methods by which our understanding of speech production and learning could be advanced through the combined measurement of articulation and brain activity using MRI – specifically, we describe a novel paradigm, developed in our laboratory, that uses both MRI techniques to for the first time map directly between neural, articulatory and acoustic data in the investigation of vocalisation. This non-invasive, multimodal imaging method could be used to track central and peripheral correlates of spoken language learning, and speech recovery in clinical settings, as well as provide insights into potential sites for targeted neural interventions

    Statistical Parametric Methods for Articulatory-Based Foreign Accent Conversion

    Get PDF
    Foreign accent conversion seeks to transform utterances from a non-native speaker (L2) to appear as if they had been produced by the same speaker but with a native (L1) accent. Such accent-modified utterances have been suggested to be effective in pronunciation training for adult second language learners. Accent modification involves separating the linguistic gestures and voice-quality cues from the L1 and L2 utterances, then transposing them across the two speakers. However, because of the complex interaction between these two sources of information, their separation in the acoustic domain is not straightforward. As a result, vocoding approaches to accent conversion results in a voice that is different from both the L1 and L2 speakers. In contrast, separation in the articulatory domain is straightforward since linguistic gestures are readily available via articulatory data. However, because of the difficulty in collecting articulatory data, conventional synthesis techniques based on unit selection are ill-suited for accent conversion given the small size of articulatory corpora and the inability to interpolate missing native sounds in L2 corpus. To address these issues, this dissertation presents two statistical parametric methods to accent conversion that operate in the acoustic and articulatory domains, respectively. The acoustic method uses a cross-speaker statistical mapping to generate L2 acoustic features from the trajectories of L1 acoustic features in a reference utterance. Our results show significant reductions in the perceived non-native accents compared to the corresponding L2 utterance. The results also show a strong voice-similarity between accent conversions and the original L2 utterance. Our second (articulatory-based) approach consists of building a statistical parametric articulatory synthesizer for a non-native speaker, then driving the synthesizer with the articulators from the reference L1 speaker. This statistical approach not only has low data requirements but also has the flexibility to interpolate missing sounds in the L2 corpus. In a series of listening tests, articulatory accent conversions were rated more intelligible and less accented than their L2 counterparts. In the final study, we compare the two approaches: acoustic and articulatory. Our results show that the articulatory approach, despite the direct access to the native linguistic gestures, is less effective in reducing perceived non-native accents than the acoustic approach
    corecore