767 research outputs found

    Characterizing phonetic transformations and fine-grained acoustic differences across dialects

    Thesis (Ph. D.)--Harvard-MIT Division of Health Sciences and Technology, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 169-175). This thesis is motivated by the gaps between speech science and technology in analyzing dialects. In speech science, investigating phonetic rules is usually manually laborious and time consuming, limiting the amount of data analyzed. Without sufficient data, the analysis could potentially overlook or over-specify certain phonetic rules. On the other hand, in speech technology such as automatic dialect recognition, phonetic rules are rarely modeled explicitly. While many applications do not require such knowledge to obtain good performance, it is beneficial to specifically model pronunciation patterns in certain applications. For example, users of language learning software can benefit from explicit and intuitive feedback from the computer to alter their pronunciation; in forensic phonetics, it is important that results of automated systems are justifiable on phonetic grounds. In this work, we propose a mathematical framework to analyze dialects in terms of (1) phonetic transformations and (2) acoustic differences. The proposed Phonetic-based Pronunciation Model (PPM) uses a hidden Markov model to characterize when and how often substitutions, insertions, and deletions occur. In particular, clustering methods are compared to better model deletion transformations. In addition, an acoustic counterpart of PPM, Acoustic-based Pronunciation Model (APM), is proposed to characterize and locate fine-grained acoustic differences such as formant transitions and nasalization across dialects. We used three data sets to empirically compare the proposed models in Arabic and English dialects. Results in automatic dialect recognition demonstrate that the proposed models complement standard baseline systems. Results in pronunciation generation and rule retrieval experiments indicate that the proposed models learn underlying phonetic rules across dialects. Our proposed system postulates pronunciation rules to a phonetician who interprets and refines them to discover new rules or quantify known rules. This can be done on large corpora to develop rules of greater statistical significance than has previously been possible. Potential applications of this work include speaker characterization and recognition, automatic dialect recognition, automatic speech recognition and synthesis, forensic phonetics, language learning or accent training education, and assistive diagnosis tools for speech and voice disorders. By Nancy Fang-Yih Chen. Ph.D.

    An integrated dialect analysis tool using phonetics and acoustics

    This study aimed to verify a computational phonetic and acoustic analysis tool created in the MATLAB environment. A dataset was obtained containing 3 broad American dialects (Northern, Western and New England) from the TIMIT database using words that also appeared in the Swadesh list. Each dialect consisted of 20 speakers uttering 10 sentences. Verification using phonetic comparisons between dialects was made by calculating the Levenshtein distance in Gabmap and the proposed software tool. Agreement between the linguistic distances using each analysis method was found. Each tool showed increasing linguistic distance as a function of increasing geographic distance, in a similar shape to Seguy's curve. The proposed tool was then further developed to include acoustic characterisation capability of inter-dialect dynamics. Significant variation between dialects was found for the pitch, trajectory length and spectral rate of change for 7 of the phonetic vowels investigated. Analysis of the vowel area using the 4 corner vowels indicated that for male speakers, geographically closer dialects have smaller variations in vowel space area than those further apart. The female utterances did not show a similar pattern of linguistic distance, likely due to the lack of one corner vowel /u/, making the vowel space a triangle.
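The Levenshtein comparison mentioned in this abstract can be illustrated with a minimal sketch. It assumes plain unit costs for insertion, deletion and substitution over phone symbols; the abstract does not say which cost scheme or normalisation Gabmap or the proposed tool applies, and the function name and example transcriptions are hypothetical.

```python
def levenshtein(a, b):
    """Unit-cost edit distance between two sequences of phone symbols."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty sequence
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x from a
                curr[j - 1] + 1,     # insert y into a
                prev[j - 1] + cost,  # substitute x with y (or match)
            ))
        prev = curr
    return prev[-1]


# Hypothetical transcriptions of the same word in two dialects,
# given as lists so that each phone counts as one symbol.
dialect_a = ["w", "ɔ", "t", "ə", "r"]
dialect_b = ["w", "ɑ", "ɾ", "ɚ"]
distance = levenshtein(dialect_a, dialect_b)
```

Averaging such distances over a shared word list gives one number per dialect pair, which is the kind of linguistic distance that can then be compared with geographic distance.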

    Speaking Rate Effects on Locus Equation Slope

    A locus equation describes a first-order regression fit to a scatter of vowel steady-state frequency values predicting vowel onset frequency values. Locus equation coefficients are often interpreted as indices of coarticulation. Speaking rate variations with a constant consonant–vowel form are thought to induce changes in the degree of coarticulation. In the current work, the hypothesis that locus slope is a transparent index of coarticulation is examined through the analysis of acoustic samples of large-scale, nearly continuous variations in speaking rate. Following the methodological conventions for locus equation derivation, data pooled across ten vowels yield locus equation slopes that are mostly consistent with the hypothesis that locus equations vary systematically with coarticulation. Comparable analyses between different four-vowel pools reveal variations in the locus slope range and changes in locus slope sensitivity to rate change. Analyses across rate but within vowels are substantially less consistent with the locus hypothesis. Taken together, these findings suggest that the practice of vowel pooling exerts a non-negligible influence on locus outcomes. Results are discussed within the context of articulatory accounts of locus equations and the effects of speaking rate change.
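The regression described in this abstract can be sketched as an ordinary least-squares line, with steady-state frequency as the predictor and onset frequency as the response. Locus equations are conventionally fitted to second-formant (F2) values, although the abstract says only "frequency". The function name and the measurements below are made up for illustration and are not taken from the study.

```python
def fit_locus_equation(steady_state_hz, onset_hz):
    """Least-squares fit of onset = slope * steady_state + intercept."""
    n = len(steady_state_hz)
    mean_x = sum(steady_state_hz) / n
    mean_y = sum(onset_hz) / n
    # Sum of squares of the predictor and cross-products with the response.
    sxx = sum((x - mean_x) ** 2 for x in steady_state_hz)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(steady_state_hz, onset_hz))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up measurements (Hz) for one consonant pooled across several vowels.
steady_state = [800.0, 1200.0, 1600.0, 2000.0, 2400.0]
onset = [1300.0, 1500.0, 1700.0, 1900.0, 2100.0]
slope, intercept = fit_locus_equation(steady_state, onset)
```

Under the usual interpretation, a slope near 1 means the onset tracks the vowel closely (strong coarticulation), while a slope near 0 means the onset stays near a fixed locus. Fitting the same function separately to each vowel pool or rate condition would expose the pooling sensitivity that the abstract reports.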

    Modelling phonologization: vowel reduction and epenthesis in Lunigiana dialects

    Within a linguistic continuum, the further from the irradiation centre, the later a language is affected by a change; the later a language is reached by a change, the milder the outcomes. Building upon these wave-theoretic assumptions, this dissertation provides a formal description of the relationship between diatopic/diachronic micro-variation and phonologization. In particular, an analysis is performed of the phonetic/phonological properties of unstressed vowel reduction and vowel insertion in two Northern Italian dialects: Carrarese and Pontremolese. These dialects are argued to represent two frozen stages of these processes’ diffusion, Carrarese representing the diachronic stage Pontremolese has already gone through. Indeed, Pontremolese displays non-etymological vocoids that show the phonetic and phonological characteristics of epenthetic vowels and that, crucially, can be considered the phonologized correlates of Carrarese’s intrusive vocoids. These, in turn, should rather be considered articulatorily/perceptually driven vowel-like releases. A formal account of this diatopic, diachronic and grammatical relationship is given that supports a modular grammar architecture, in which phonetics and phonology constitute two autonomous modules. Within such an architecture, the lateral forces (government and licensing) developed by standard Government Phonology are translated into violable constraints and inserted in a BiPhon grammar. In this optimality-theoretic grammar, the phonetics-phonology interface is managed by a set of cue constraints that map acoustic dimensions (formant structures) onto phonological primitives (elements). Furthermore, to integrate morphological information in the phonological forms, the Coloured Containment Theory is resorted to. This dissertation is of relevance to anyone interested in diatopic/diachronic micro-variation, phonologization, phonological theory and Italian dialectology.

    Modelling phonologization: vowel reduction and epenthesis in Lunigiana dialects

    Building upon wave-theoretic assumptions, this dissertation provides a formal description of the relationship between diatopic/diachronic micro-variation and phonologization. In particular, an analysis is performed of the phonetic/phonological properties of unstressed vowel reduction and vowel insertion in two Northern Italian dialects: Carrarese and Pontremolese. These dialects are argued to represent two frozen stages of these processes’ diffusion, Carrarese representing the diachronic stage Pontremolese has already gone through. Indeed, Pontremolese displays non-etymological vocoids that show the phonetic and phonological characteristics of epenthetic vowels and that, crucially, can be considered the phonologized correlates of Carrarese’s intrusive vocoids. These, in turn, should rather be considered articulatorily/perceptually driven vowel-like releases. A formal account of this diatopic, diachronic and grammatical relationship is given that supports a modular grammar architecture, in which phonetics and phonology constitute two autonomous modules. Within such an architecture, the lateral forces (government and licensing) developed by standard Government Phonology are translated into violable constraints and inserted in a BiPhon grammar. In this optimality-theoretic grammar, the phonetics-phonology interface is managed by a set of cue constraints that map acoustic dimensions (formant structures) onto phonological primitives (elements). Furthermore, to integrate morphological information in the phonological forms, the Coloured Containment Theory is resorted to. Language Use in Past and Presen

    The statistical analysis of acoustic phonetic data: exploring differences between spoken Romance languages

    The historical and geographical spread from older to more modern languages has long been studied by examining textual changes and in terms of changes in phonetic transcriptions. However, it is more difficult to analyze language change from an acoustic point of view, although this is usually the dominant mode of transmission. We propose a novel analysis approach for acoustic phonetic data, where the aim will be to statistically model the acoustic properties of spoken words. We explore phonetic variation and change using a time-frequency representation, namely the log-spectrograms of speech recordings. We identify time and frequency covariance functions as a feature of the language; in contrast, mean spectrograms depend mostly on the particular word that has been uttered. We build models for the mean and covariances (taking into account the restrictions placed on the statistical analysis of such objects) and use these to define a phonetic transformation that models how an individual speaker would sound in a different language, allowing the exploration of phonetic differences between languages. Finally, we map back these transformations to the domain of sound recordings, allowing us to listen to the output of the statistical analysis. The proposed approach is demonstrated using recordings of the words corresponding to the numbers from "one" to "ten" as pronounced by speakers from five different Romance languages. John Coleman appreciates the support of UK Arts and Humanities Research Council grant AH/M002993/1, “Ancient Sounds: mixing acoustic phonetics, statistics and comparative philology to bring speech back from the past”. John Aston appreciates the support of UK Engineering and Physical Sciences Research Council grant EP/K021672/2, “Functional Object Data Analysis and its Applications”.