1,321 research outputs found

    Coalescent Assimilation Across Wordboundaries in American English and in Polish English

    Get PDF
    Coalescent assimilation (CA), where alveolar obstruents /t, d, s, z/ in word-final position merge with word-initial /j/ to produce postalveolar /tʃ, dʒ, ʃ, ʒ/, is one of the most wellknown connected speech processes in English. Due to its commonness, CA has been discussed in numerous textbook descriptions of English pronunciation, and yet, upon comparing them it is difficult to get a clear picture of what factors make its application likely. This paper aims to investigate the application of CA in American English to see a) what factors increase the likelihood of its application for each of the four alveolar obstruents, and b) what is the allophonic realization of plosives /t, d/ if the CA does not apply. To do so, the Buckeye Corpus (Pitt et al. 2007) of spoken American English is analyzed quantitatively. As a second step, these results are compared with Polish English; statistics analogous to the ones listed above for American English are gathered for Polish English based on the PLEC corpus (Pęzik 2012). The last section focuses on what consequences for teaching based on a native speaker model the findings have. It is argued that a description of the phenomenon that reflects the behavior of speakers of American English more accurately than extant textbook accounts could be beneficial to the acquisition of these patterns

    Contributions of temporal encodings of voicing, voicelessness, fundamental frequency, and amplitude variation to audiovisual and auditory speech perception

    Get PDF
    Auditory and audio-visual speech perception was investigated using auditory signals of invariant spectral envelope that temporally encoded the presence of voiced and voiceless excitation, variations in amplitude envelope and F-0. In experiment 1, the contribution of the timing of voicing was compared in consonant identification to the additional effects of variations in F-0 and the amplitude of voiced speech. In audio-visual conditions only, amplitude variation slightly increased accuracy globally and for manner features. F-0 variation slightly increased overall accuracy and manner perception in auditory and audio-visual conditions. Experiment 2 examined consonant information derived from the presence and amplitude variation of voiceless speech in addition to that from voicing, F-0, and voiced speech amplitude. Binary indication of voiceless excitation improved accuracy overall and for voicing and manner. The amplitude variation of voiceless speech produced only a small increment in place of articulation scores. A final experiment examined audio-visual sentence perception using encodings of voiceless excitation and amplitude variation added to a signal representing voicing and F-0. There was a contribution of amplitude variation to sentence perception, but not of voiceless excitation. The timing of voiced and voiceless excitation appears to be the major temporal cues to consonant identity. (C) 1999 Acoustical Society of America. [S0001-4966(99)01410-1]

    Characterization of Arabic sibilant consonants

    Get PDF
    The aim of this study is to develop an automatic speech recognition system in order to classify sibilant Arabic consonants into two groups: alveolar consonants and post-alveolar consonants. The proposed method is based on the use of the energy distribution, in a consonant-vowel type syllable, as an acoustic cue. The application of this method on our own corpus reveals that the amount of energy included in a vocal signal is a very important parameter in the characterization of Arabic sibilant consonants. For consonants classifications, the accuracy achieved to identify consonants as alveolar or post-alveolar is 100%. For post-alveolar consonants, the rate is 96% and for alveolar consonants, the rate is over 94%. Our classification technique outperformed existing algorithms based on support vector machines and neural networks in terms of classification rate

    Neural Dynamics of Phonetic Trading Relations for Variable-Rate CV Syllables

    Full text link
    The perception of CV syllables exhibits a trading relationship between voice onset time (VOT) of a consonant and duration of a vowel. Percepts of [ba] and [wa] can, for example, depend on the durations of the consonant and vowel segments, with an increase in the duration of the subsequent vowel switching the percept of the preceding consonant from [w] to [b]. A neural model, called PHONET, is proposed to account for these findings. In the model, C and V inputs are filtered by parallel auditory streams that respond preferentially to transient and sustained properties of the acoustic signal, as in vision. These streams are represented by working memories that adjust their processing rates to cope with variable acoustic input rates. More rapid transient inputs can cause greater activation of the transient stream which, in turn, can automatically gain control the processing rate in the sustained stream. An invariant percept obtains when the relative activations of C and V representations in the two streams remain uncha.nged. The trading relation may be simulated as a result of how different experimental manipulations affect this ratio. It is suggested that the brain can use duration of a subsequent vowel to make the [b]/[w] distinction because the speech code is a resonant event that emerges between working mernory activation patterns and the nodes that categorize them.Advanced Research Projects Agency (90-0083); Air Force Office of Scientific Reseearch (F19620-92-J-0225); Pacific Sierra Research Corporation (91-6075-2
    • …
    corecore