904 research outputs found

    Frication and Voicing Classification


    The weight of phonetic substance in the structure of sound inventories

    In the research field initiated by Liljencrants & Lindblom in 1972, we illustrate the possibility of giving substance to phonology, predicting the structure of phonological systems from nonphonological principles, be they listener-oriented (perceptual contrast and stability) or speaker-oriented (articulatory contrast and economy). For vowel systems we proposed the Dispersion-Focalisation Theory (DFT; Schwartz et al., 1997b). With the DFT, vowel systems are predicted using two competing perceptual constraints weighted by two parameters, λ and α: the first aims at increasing auditory distances between vowel spectra (dispersion), the second at increasing the perceptual salience of each spectrum through formant proximities (focalisation). We also introduced new variants inspired by physics - namely, the phase space (λ, α) and the polymorphism of a given phase, or superstructures in phonological organisations (Vallée et al., 1999) - which allow us to generate 85.6% of the 342 UPSID systems with 3 to 7 vowel qualities. No comparable theory yet exists for consonants. We therefore present a detailed typology of consonants and then suggest ways to explain the predominance of plosives over fricatives and of voiceless over voiced consonants by (i) comparing them with language acquisition data at the babbling stage and examining the capacity to acquire rather different linguistic systems in relation to the main degrees of freedom of the articulators; and (ii) showing that the places “preferred” for each manner are at least partly conditioned by the morphological constraints that facilitate or complicate, make possible or impossible, the required articulatory gestures, e.g. the complexity of articulatory control for voicing and the aerodynamics of fricatives. A rather strict coordination between the glottis and the oral constriction is needed to produce acceptable voiced fricatives (Mawass et al., 2000): we determine that the region where combinations of Ag (glottal area) and Ac (constriction area) values result in a balance between the voice and noise components is indeed very narrow. We thus demonstrate that some of the main tendencies in the phonological vowel and consonant structures of the world’s languages can be explained, at least in part, by sensorimotor constraints, and argue that phonology can take part in a theory of Perception-for-Action-Control.
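
    The dispersion and focalisation constraints lend themselves to a compact computational sketch. The following is a minimal illustration only, not the published DFT implementation: the Bark formant values, the inverse-squared-distance terms, and the way λ and α enter the total cost are all assumptions made for exposition.

        # Hypothetical sketch of a Dispersion-Focalisation-style cost function.
        # Formant values (in Bark), the inverse-squared-distance terms, and the
        # roles given to lam and alpha are illustrative assumptions only.
        import itertools

        def dispersion_energy(vowels):
            # Sum of inverse squared inter-vowel distances in (F1, F2) space;
            # lower values correspond to better-dispersed systems.
            e = 0.0
            for a, b in itertools.combinations(vowels, 2):
                d2 = (a["F1"] - b["F1"]) ** 2 + (a["F2"] - b["F2"]) ** 2
                e += 1.0 / d2
            return e

        def focalisation_energy(vowels):
            # Reward (negative energy) for within-vowel formant proximity:
            # close F1-F2, F2-F3, F3-F4 make a spectrum more "focal".
            e = 0.0
            for v in vowels:
                e -= 1.0 / (v["F2"] - v["F1"]) ** 2
                e -= 1.0 / (v["F3"] - v["F2"]) ** 2
                e -= 1.0 / (v["F4"] - v["F3"]) ** 2
            return e

        def dft_cost(vowels, lam=1.0, alpha=0.3):
            # Total cost to be minimised over candidate vowel systems.
            return lam * dispersion_energy(vowels) + alpha * focalisation_energy(vowels)

        # Toy 3-vowel system /i a u/ with made-up Bark formant values.
        system = [
            {"F1": 3.0, "F2": 13.5, "F3": 15.0, "F4": 16.5},  # /i/
            {"F1": 7.0, "F2": 11.0, "F3": 14.0, "F4": 16.0},  # /a/
            {"F1": 3.5, "F2": 6.0, "F3": 13.0, "F4": 15.5},   # /u/
        ]
        print(dft_cost(system))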

    Acoustic characteristics of English fricatives

    This is the publisher's version, also available electronically from http://scitation.aip.org/content/asa/journal/jasa/108/3/10.1121/1.1288413.
    This study constitutes a large-scale comparative analysis of acoustic cues for classification of place of articulation in fricatives. To date, no single metric has been found to classify fricative place of articulation with a high degree of accuracy. This study presents spectral, amplitudinal, and temporal measurements that involve both static properties (spectral peak location, spectral moments, noise duration, normalized amplitude, and F2 onset frequency) and dynamic properties (relative amplitude and locus equations). While all cues (except locus equations) consistently serve to distinguish sibilant from nonsibilant fricatives, the present results indicate that spectral peak location, spectral moments, and both normalized and relative amplitude serve to distinguish all four places of fricative articulation. These findings suggest that these static and dynamic acoustic properties can provide robust and unique information about all four places of articulation, despite variation in speaker, vowel context, and voicing.
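
    As a rough illustration of how the static spectral measures named above are typically computed, the sketch below derives spectral peak location and the four spectral moments from a single fricative noise window. The sampling rate, Hamming window, and power-spectrum weighting are assumptions for exposition, not the exact procedure of the study.

        # Illustrative computation of spectral peak and spectral moments for a
        # fricative noise window. Sampling rate, windowing, and the use of a
        # power spectrum are assumed for exposition, not taken from the study.
        import numpy as np

        def spectral_measures(frame, sr):
            spectrum = np.abs(np.fft.rfft(frame * np.hamming(len(frame)))) ** 2
            freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
            p = spectrum / spectrum.sum()         # normalise to a probability mass
            peak = freqs[np.argmax(spectrum)]     # spectral peak location (Hz)
            m1 = np.sum(freqs * p)                # centroid (first moment)
            m2 = np.sum(((freqs - m1) ** 2) * p)  # variance (second moment)
            skew = np.sum(((freqs - m1) ** 3) * p) / m2 ** 1.5
            kurt = np.sum(((freqs - m1) ** 4) * p) / m2 ** 2 - 3.0
            return {"peak": peak, "centroid": m1, "variance": m2,
                    "skewness": skew, "kurtosis": kurt}

        # Toy example: 40 ms of white noise standing in for a fricative frame.
        sr = 22050
        frame = np.random.randn(int(0.04 * sr))
        print(spectral_measures(frame, sr))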

    What information is necessary for speech categorization? Harnessing variability in the speech signal by integrating cues computed relative to expectations

    This is the author's accepted manuscript. This article may not exactly replicate the final version published in the APA journal. It is not the copy of record. The original publication is available at http://psycnet.apa.org/index.cfm?fa=search.displayrecord&uid=2011-05323-001.
    Most theories of categorization emphasize how continuous perceptual information is mapped to categories. However, equally important are the informational assumptions of a model, the type of information subserving this mapping. This is crucial in speech perception, where the signal is variable and context dependent. This study assessed the informational assumptions of several models of speech categorization, in particular the number of cues that are the basis of categorization and whether these cues represent the input veridically or have undergone compensation. We collected a corpus of 2,880 fricative productions (Jongman, Wayland, & Wong, 2000) spanning many talker and vowel contexts and measured 24 cues for each. A subset was also presented to listeners in an 8AFC phoneme categorization task. We then trained a common classification model based on logistic regression to categorize the fricative from the cue values and manipulated the information in the training set to contrast (a) models based on a small number of invariant cues, (b) models using all cues without compensation, and (c) models in which cues underwent compensation for contextual factors. Compensation was modeled by computing cues relative to expectations (C-CuRE), a new approach to compensation that preserves fine-grained detail in the signal. Only the compensation model achieved accuracy similar to listeners' and showed the same effects of context. Thus, even simple categorization metrics can overcome the variability in speech when sufficient information is available and compensation schemes like C-CuRE are employed.
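
    The general idea of computing cues relative to expectations can be sketched briefly: each cue is re-expressed as a residual after regressing out contextual factors (here talker and vowel), and the residuals feed a logistic-regression classifier. The synthetic data, column names, and scikit-learn pipeline below are assumptions for illustration, not the authors' implementation of C-CuRE.

        # Sketch of "cues relative to expectations": regress each acoustic cue on
        # contextual factors (talker, vowel), keep the residuals, then classify.
        # Data, column names, and pipeline are illustrative assumptions only.
        import numpy as np
        import pandas as pd
        from sklearn.linear_model import LinearRegression, LogisticRegression

        rng = np.random.default_rng(0)
        n = 400
        df = pd.DataFrame({
            "talker": rng.choice(["t1", "t2", "t3"], n),
            "vowel": rng.choice(["a", "i", "u"], n),
            "centroid": rng.normal(5000, 800, n),   # toy cue values
            "duration": rng.normal(120, 25, n),
            "fricative": rng.choice(["s", "sh", "f", "th"], n),
        })

        def residualise(data, cue_cols, context_cols):
            # Express each cue relative to what the context predicts.
            X = pd.get_dummies(data[context_cols], drop_first=True)
            out = data.copy()
            for cue in cue_cols:
                pred = LinearRegression().fit(X, data[cue]).predict(X)
                out[cue] = data[cue] - pred
            return out

        cues = ["centroid", "duration"]
        compensated = residualise(df, cues, ["talker", "vowel"])
        clf = LogisticRegression(max_iter=1000)
        clf.fit(compensated[cues], compensated["fricative"])
        print(clf.score(compensated[cues], compensated["fricative"]))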

    The Effect of Instructed Second Language Learning on the Acoustic Properties of First Language Speech

    This paper reports on a comprehensive phonetic study of American classroom learners of Russian, investigating the influence of the second language (L2) on the first language (L1). Russian and English productions of 20 learners were compared to those of 18 English monolingual controls, focusing on the acoustics of word-initial and word-final voicing. The results demonstrate that learners’ Russian was acoustically different from their English, with shorter voice onset times (VOTs) in [−voice] stops, longer prevoicing in [+voice] stops, more [−voice] stops with short-lag VOTs, and more [+voice] stops with prevoicing, indicating a degree of successful L2 pronunciation learning. Crucially, learners also demonstrated an L1 phonetic change compared to monolingual English speakers. Specifically, the VOT of learners’ initial English voiceless stops was shortened, indicating assimilation with Russian, while the frequency of prevoicing in learners’ English was decreased, indicating dissimilation with Russian. Word-finally, the duration of preceding vowels, stop closures, frication, and voicing during consonantal constriction all demonstrated drift towards Russian norms of word-final voicing neutralization. The study confirms that L2-driven phonetic changes in L1 are possible even in L1-immersed classroom language learners, challenging the role of reduced L1 use and highlighting the plasticity of the L1 phonetic system.

    Perceptual Integration of Acoustic Cues to Laryngeal Contrasts in Korean Fricatives

    This paper provides evidence that multiple acoustic cues involving the presence of low-frequency energy integrate in the perception of Korean coronal fricatives. This finding helps explain a surprising asymmetry between the production and perception of these fricatives found in previous studies: lower F0 onset in the following vowel leads to a response bias for plain [s] over fortis [s*], despite the fact that there is no evidence for a corresponding acoustic asymmetry in the production of [s] and [s*]. A fixed classification task using the Garner paradigm provides evidence that low F0 in a following vowel and the presence of voicing during frication perceptually integrate. This suggests that Korean listeners in previous experiments were responding to an “intermediate perceptual property” of the stimuli, despite the fact that the individual acoustic components of that property are not all present in typical Korean fricative productions. The finding also extends empirical support for the general idea of perceptual integration to a new language, a different manner of consonant, and a situation where covariance of the acoustic cues under investigation is not generally present in a listener’s linguistic input.
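
    For readers unfamiliar with the Garner classification logic, the sketch below shows how baseline blocks (the irrelevant dimension held constant) and orthogonal blocks (both dimensions varying) are typically constructed, with integration inferred from slower responses in the orthogonal block. The stimulus labels and the interference measure are illustrative assumptions, not the design details of this study.

        # Illustrative construction of Garner-style blocks for two binary cue
        # dimensions (F0 onset: low/high; voicing during frication: absent/present).
        # Labels and the interference measure are assumptions for exposition.
        from itertools import product

        f0_levels = ["lowF0", "highF0"]
        voicing_levels = ["no_voicing", "voicing"]

        # Baseline block: classify F0 while the irrelevant dimension is held constant.
        baseline_block = [(f0, "no_voicing") for f0 in f0_levels]

        # Orthogonal block: classify F0 while the irrelevant dimension varies freely.
        orthogonal_block = list(product(f0_levels, voicing_levels))

        def garner_interference(rt_orthogonal, rt_baseline):
            # Positive values (slower orthogonal responses) suggest the two
            # dimensions are processed integrally rather than separably.
            return rt_orthogonal - rt_baseline

        print(baseline_block)
        print(orthogonal_block)
        print(garner_interference(812.0, 768.0))   # toy mean reaction times in ms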

    Contingent categorization in speech perception

    This is an Accepted Manuscript of an article published by Taylor & Francis in Language, Cognition and Neuroscience in 2014, available online: http://www.tandfonline.com/10.1080/01690965.2013.824995.
    The speech signal is notoriously variable, with the same phoneme realized differently depending on factors like talker and phonetic context. Variance in the speech signal has led to a proliferation of theories of how listeners recognize speech. A promising approach, supported by computational modeling studies, is contingent categorization, wherein incoming acoustic cues are computed relative to expectations. We tested contingent encoding empirically. Listeners were asked to categorize fricatives in CV syllables constructed by splicing the fricative from one CV syllable with the vowel from another CV syllable. The two spliced syllables always contained the same fricative, providing consistent bottom-up cues; however, on some trials the vowel and/or talker mismatched between these syllables, giving conflicting contextual information. Listeners were less accurate and slower at identifying the fricatives in mismatching splices. This suggests that listeners rely on contextual information beyond bottom-up acoustic cues during speech perception, providing support for contingent categorization.
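
    The cross-splicing manipulation can be pictured with a few lines of signal handling: the frication portion of one recording is concatenated with the vocalic portion of another. The file names, boundary times, and use of the soundfile library below are assumptions for illustration; the actual stimuli were built from the authors' recorded CV syllables.

        # Sketch of cross-splicing a fricative from one CV token onto the vowel
        # of another. File paths, boundary times, and the soundfile dependency
        # are illustrative assumptions, not the materials of the study.
        import numpy as np
        import soundfile as sf

        def splice(fricative_file, vowel_file, fric_end_s, vowel_start_s):
            fric_sig, sr1 = sf.read(fricative_file)
            vow_sig, sr2 = sf.read(vowel_file)
            assert sr1 == sr2, "both tokens must share a sampling rate"
            fric = fric_sig[: int(fric_end_s * sr1)]     # frication from token A
            vowel = vow_sig[int(vowel_start_s * sr2):]   # vocalic portion from token B
            return np.concatenate([fric, vowel]), sr1

        # Hypothetical usage: same fricative, mismatching talker/vowel context.
        # spliced, sr = splice("sa_talker1.wav", "si_talker2.wav", 0.180, 0.180)
        # sf.write("spliced_s_i.wav", spliced, sr)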

    The production and perception of coronal fricatives in Seoul Korean: The case for a fourth laryngeal category

    This article presents new data on the contrast between the two voiceless coronal fricatives of Korean, variously described as a lenis/fortis or aspirated/fortis contrast. In utterance-initial position, the fricatives were found to differ in centroid frequency; duration of frication, aspiration, and the following vowel; and several aspects of the following vowel onset, including intensity profile, spectral tilt, and F1 onset. The between-fricative differences varied across vowel contexts, however, and spectral differences in the vowel onset especially were more pronounced for /a/ than for /i, ɯ, u/. This disparity led to the hypothesis that cues in the following vowel onset would exert a weaker influence on perception for high vowels than for low vowels. Perception data provided general support for this hypothesis, indicating that while vowel onset cues had the largest impact on perception for both high- and low-vowel stimuli, this influence was weaker for high vowels. Perception was also strongly influenced by aspiration duration, with modest contributions from frication duration and f0 onset. Taken together, these findings suggest that the 'non-fortis' fricative is best characterized not in terms of the lenis or aspirated categories for stops, but in terms of a unique representation that is both lenis and aspirated.

    Learning [Voice]

    The [voice] distinction between homorganic stops and fricatives is made by a number of acoustic correlates including voicing, segment duration, and preceding vowel duration. The present work looks at [voice] from a number of multidimensional perspectives. This dissertation’s focus is a corpus study of the phonetic realization of [voice] in two English-learning infants aged 1;1–3;5. While preceding vowel duration has been studied before in infants, the other correlates of post-vocalic voicing investigated here (preceding F1, consonant duration, and closure voicing intensity) had not been measured before in infant speech. The study makes empirical contributions regarding the development of the production of [voice] in infants, not just from a surface-level perspective but also with implications for the phonetics-phonology interface in the adult and developing linguistic systems. Several methodological contributions are also made in the use of large corpora and data-modeling techniques. The study revealed that, even in infants, F1 at the midpoint of a vowel preceding a voiced consonant was lower by roughly 50 Hz compared to a vowel before a voiceless consonant, in line with the effect found in adults. But while the effect has been considered most likely a physiological, nonlinguistic phenomenon in adults, here it was correlated in the wrong direction with other aspects of [voice], casting doubt on a physiological explanation. Some of the consonant pairs showed statistically significant differences in duration and closure voicing. A preceding vowel duration difference was also found, along with a preliminary indication of a developmental trend suggesting that this difference is being learned. The phonetics of adult speech is also considered: results are presented from a dialectal corpus study of North American English and a lab speech experiment, which clarify the relationship between preceding vowel duration and flapping and the relationship between [voice] and F1 in preceding vowels. Fluent adult speech is also described, and machine learning algorithms are applied to learning the [voice] distinction from multidimensional acoustic input plus some lexical knowledge.
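
    As a sketch of how a machine-learning model might use such multidimensional input, the example below trains a simple classifier on the post-vocalic voicing cues named in the abstract (preceding vowel duration, preceding F1, consonant duration, closure voicing intensity). The synthetic data, cue distributions, and choice of model are illustrative assumptions, not the dissertation's corpora or algorithms.

        # Sketch of learning the [voice] distinction from multidimensional
        # acoustic cues. Synthetic data and the classifier choice are assumptions
        # made for illustration; the dissertation's models and corpora differ.
        import numpy as np
        from sklearn.model_selection import train_test_split
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(1)
        n = 600
        voiced = rng.integers(0, 2, n)                 # 1 = [+voice], 0 = [-voice]
        X = np.column_stack([
            rng.normal(160, 30, n) + 40 * voiced,      # preceding vowel duration (ms)
            rng.normal(550, 80, n) - 50 * voiced,      # preceding F1 at vowel midpoint (Hz)
            rng.normal(110, 20, n) - 25 * voiced,      # consonant (closure) duration (ms)
            rng.normal(20, 8, n) + 15 * voiced,        # closure voicing intensity (dB-like)
        ])

        X_train, X_test, y_train, y_test = train_test_split(X, voiced, random_state=0)
        clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
        print("held-out accuracy:", clf.score(X_test, y_test))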

    The Phonetics and Phonology of Nyagrong Minyag, an Endangered Language of Western China.

    Ph.D. thesis, University of Hawaiʻi at Mānoa, 2018.
