
    Effects of Palatal Expansion on Speech Production

    Introduction: Rapid palatal expanders (RPEs) are a commonly used orthodontic adjunct for the treatment of posterior crossbites. RPEs are cemented to bilateral posterior teeth across the palate and thus may interfere with proper tongue movement and linguopalatal contact. The purpose of this study was to identify the specific effects of RPEs on speech sound production in child and early adolescent orthodontic patients. Materials and Methods: RPEs were treatment planned for patients seeking orthodontics at Marquette University. Speech recordings were made using a phonetically balanced reading passage (“The Caterpillar”) at 3 time points: 1) before RPE placement; 2) immediately after cementation; and 3) 10-14 days post appliance delivery. Measures of vocal tract resonance (formant center frequencies) were obtained for vowels, and measures of noise distribution (spectral moments) were obtained for consonants. Two-way repeated-measures ANOVA with post-hoc tests was used for statistical analysis. Results: For the vowel /i/, the first formant increased and the second formant decreased, indicating a more inferior and posterior tongue position. For /e/, only the second formant decreased, indicating a more posterior tongue position. The formants did not return to baseline within the two-week study period. For the consonants /s/, //, /t/, and /k/, a significant shift from high to low frequencies indicated distortion upon appliance placement. Of these, only /t/ fully returned to baseline during the study period. Conclusion: Numerous phonemes were distorted upon RPE placement, indicating altered speech sound production. For most phonemes, it takes longer than two weeks for speech to return to baseline, if it returns at all. Clinically, the results of this study will help with pre-treatment and interdisciplinary counseling for orthodontic patients receiving palatal expanders.
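
The spectral moments mentioned in this abstract can be illustrated with a minimal sketch. It assumes a power spectrum is already available as parallel lists of frequencies and power values; the function name `spectral_moments` and the toy spectra are hypothetical and are not taken from the study. A shift of noise energy from high to low frequencies shows up as a lower first moment (centroid).

```python
import math


def spectral_moments(freqs_hz, power):
    """Return (centroid, sd, skewness, excess kurtosis) of a power spectrum."""
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    # Treat the normalized spectrum as a probability distribution over frequency.
    weights = [v / total for v in power]
    mean = sum(f * w for f, w in zip(freqs_hz, weights))
    var = sum((f - mean) ** 2 * w for f, w in zip(freqs_hz, weights))
    if var == 0:
        # All energy sits in one bin, so the shape moments are undefined; report 0.
        return mean, 0.0, 0.0, 0.0
    sd = math.sqrt(var)
    skew = sum((f - mean) ** 3 * w for f, w in zip(freqs_hz, weights)) / sd ** 3
    kurt = sum((f - mean) ** 4 * w for f, w in zip(freqs_hz, weights)) / sd ** 4 - 3.0
    return mean, sd, skew, kurt


# Toy spectra (made-up values): energy concentrated high vs. low in frequency.
freqs = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000]
high_energy = [0, 0, 0, 1, 2, 4, 8, 4]
low_energy = list(reversed(high_energy))

high_centroid = spectral_moments(freqs, high_energy)[0]
low_centroid = spectral_moments(freqs, low_energy)[0]
print(round(high_centroid), round(low_centroid))
```

In practice the spectrum would come from a windowed slice of the consonant noise; that extraction step is outside this sketch.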

    Consonantal F0 perturbation in American English involves multiple mechanisms.

    In this study, we revisit consonantal perturbation of F0 in English, taking into particular consideration the effect of alignment of F0 contours to segments and the F0 extraction method in the acoustic analysis. We recorded words differing in consonant voicing, manner of articulation, and position in syllable, spoken by native speakers of American English in both statements and questions. In the analysis, we compared methods of F0 alignment and found that the highest F0 consistency occurred when F0 contours were time-normalized to the entire syllable. Applying this method, along with using syllables with nasal consonants as the baseline and a fine-detailed F0 extraction procedure, we identified three distinct consonantal effects: a large but brief (10-40 ms) F0 raising at voice onset regardless of consonant voicing, a smaller but longer-lasting F0 raising effect by voiceless consonants throughout a large proportion of the following vowels, and a small lowering effect of around 6 Hz by voiced consonants, which was not found in previous studies. Additionally, a brief anticipatory effect was observed before a coda consonant. These effects are imposed on a continuously changing F0 curve that is either rising-falling or falling-rising, depending on whether the carrier sentence is a statement or a question.
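
Time-normalizing an F0 contour to the syllable, as described in this abstract, can be sketched as resampling the contour at a fixed number of equally spaced points between syllable onset and offset. The sketch below uses linear interpolation and assumes strictly increasing sample times; the function name `time_normalize`, the choice of interpolation, and the sample values are illustrative assumptions, not the authors' procedure.

```python
def time_normalize(times, f0, n_points=10):
    """Resample an F0 contour at n_points equally spaced over its time span."""
    if len(times) != len(f0) or len(times) < 2:
        raise ValueError("need at least two (time, F0) samples of equal length")
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    start, end = times[0], times[-1]
    resampled = []
    j = 0  # index of the left neighbour of the current target time
    for i in range(n_points):
        t = start + (end - start) * i / (n_points - 1)
        # Advance to the sample interval that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        frac = (t - t0) / (t1 - t0)
        resampled.append(f0[j] + frac * (f0[j + 1] - f0[j]))
    return resampled


# Two made-up syllables with the same F0 shape but different durations.
short_syllable = time_normalize([0.0, 0.05, 0.10], [100.0, 110.0, 120.0], n_points=5)
long_syllable = time_normalize([0.0, 0.10, 0.20], [100.0, 110.0, 120.0], n_points=5)
print(short_syllable)
print(long_syllable)
```

After normalization the two contours can be compared point by point even though the syllables differ in duration.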

    Is fundamental frequency a cue to aspiration in initial stops?

    One production and one perception experiment were conducted to investigate the interaction of consonant voicing and fundamental frequency at the onset of voicing (onset f0) in Cantonese, a tonal language. Consonantal voicing in English can affect onset f0 up to 100 ms after voicing onset, but existing research provides inconclusive information regarding the effects of voicing on f0 in tonal languages where f0 variability is constrained by the demands of the lexical tone system. Previous research on consonantal effects on onset f0 provides two contrasting theories: these effects may be automatic, resulting from physiological constraints inherent to the speech production mechanism, or they may be controlled, produced as part of a process of cue enhancement for the perception of laryngeal contrasts. Results of experiment 1 showed that consonant aspiration affects onset f0 in Cantonese only within the first 10 ms following voicing onset, comparable to results for other tonal languages. Experiment 2 showed that Cantonese listeners can use differences in onset f0 to cue perception of the voicing contrast, but the minimum extent of f0 perturbation necessary for this is greater than is found in Cantonese production, and comparable to that observed in acoustic studies of nontonal languages. These results suggest that consonantal effects on onset f0 are at least partially controlled by talkers, but that their role in the perception of voicing/aspiration may be a consequence of language-independent properties of audition rather than listeners' experience with the phonological contrasts of a specific language.

    Immediate and Distracted Imitation in Second-Language Speech: Unreleased Plosives in English

    The paper investigates immediate and distracted imitation in second-language speech using unreleased plosives. Unreleased plosives are fairly frequently found in English sequences of two stops. Polish, on the other hand, is characterised by a significant rate of releases in such sequences. This cross-linguistic difference served as material to look into how and to what extent non-native properties of sounds can be produced in immediate and distracted imitation. Thirteen native speakers of Polish first read and then imitated sequences of words with two stops straddling the word boundary. Stimuli for imitation had no release of the first stop. The results revealed that (1) a non-native feature such as the lack of the release burst can be imitated; (2) distracting imitation impedes imitative performance; (3) the type of a sequence interacts with the magnitude of an imitative effect.

    The Vietnamese Vowel System

    In this dissertation, I provide a new analysis of the Vietnamese vowel system as a system with fourteen monophthongs and nineteen diphthongs based on phonetic and phonological data. I propose that these Vietnamese contour vowels - /ie/, /ɯɤ/ and /uo/ - should be grouped with these eleven monophthongs /i e ɛ a ɐ ʌ ɤ ɯ u o ɔ/ based on their similarities in phonetic and phonological behaviors. The phonetic characteristics of these vowels are studied acoustically using normalized and scaled acoustic values of 13,925 tokens, spoken by female Hanoian speakers from my speech corpus, The Vietnamese Speech Corpus. Phonetic analysis shows that the eleven monophthongs and three contour vowels are similar in terms of formant frequency targets, formant dynamic trajectories, and duration. Phonologically, monophthongs and contour vowels can be rhymed with each other in poems, and the two elements within each contour vowel should be analyzed as two halves of one root node in the syllable structure. In chapters 1 and 2, I give the current analysis of the Vietnamese sound system and review different approaches to the acoustic features of vowels and the phonemic status of diphthongs. In chapter 3, I give a detailed description of the Vietnamese Speech Corpus. In chapter 4, I show the difference in formant targets between monophthongs and glides, as well as the importance of duration in distinguishing vowels in Vietnamese. I also give evidence for the differences in duration between the diphthongs and the monophthongs-and-contour-vowels group. In chapter 5, I analyze the natural class of monophthongs and contour vowels in terms of feature geometry and give evidence from Vietnamese phonological processes to support the analysis of contour vowels as being in the same natural class as monophthongs. I also re-analyze Vietnamese triphthongs as diphthongs in this chapter. Finally, in chapter 6, I summarize the similarities and differences across the monophthongs, contour vowels, and diphthongs, and suggest possible future studies to test this hypothesis of the Vietnamese vowel system.

    Computer classification of stop consonants in a speaker independent continuous speech environment

    In the English language there are six stop consonants, /b,d,g,p,t,k/. They account for over 17% of all phonemic occurrences. In continuous speech, phonetic recognition of stop consonants requires the ability to explicitly characterize the acoustic signal. Prior work has shown that high classification accuracy of discrete syllables and words can be achieved by characterizing the shape of the spectrally transformed acoustic signal. This thesis extends this concept to include a multispeaker continuous speech database and statistical moments of a distribution to characterize shape. A multivariate maximum likelihood classifier was used to discriminate classes. To reduce the number of features used by the discriminant model, a dynamic programming scheme was employed to optimize subset combinations. The top six moments were the mean, variance, and skewness in both frequency and energy. Results showed 85% classification accuracy on the full database of 952 utterances. Performance improved to 97% when the discriminant model was trained separately for male and female talkers.

    On The Way To Linguistic Representation: Neuromagnetic Evidence of Early Auditory Abstraction in the Perception of Speech and Pitch

    The goal of this dissertation is to show that even at the earliest (non-invasive) recordable stages of auditory cortical processing, we find evidence that cortex is calculating abstract representations from the acoustic signal. Looking across two distinct domains (inferential pitch perception and vowel normalization), I present evidence demonstrating that the M100, an automatic evoked neuromagnetic component that localizes to primary auditory cortex, is sensitive to abstract computations. The M100 typically responds to physical properties of the stimulus in auditory and speech perception and integrates only over the first 25 to 40 ms of stimulus onset, providing a reliable dependent measure that allows us to tap into early stages of auditory cortical processing. In Chapter 2, I briefly present the episodicist position on speech perception and discuss research indicating that the strongest episodicist position is untenable. I then review findings from the mismatch negativity literature, where proposals have been made that the MMN allows access into linguistic representations supported by auditory cortex. Finally, I conclude the Chapter with a discussion of the previous findings on the M100/N1. In Chapter 3, I present neuromagnetic data showing that the response properties of the M100 are sensitive to the missing fundamental component using well-controlled stimuli. These findings suggest that listeners are reconstructing the inferred pitch by 100 ms after stimulus onset. In Chapter 4, I propose a novel formant ratio algorithm in which the third formant (F3) is the normalizing factor. The goal of formant ratio proposals is to provide an explicit algorithm that successfully "eliminates" speaker-dependent acoustic variation of auditory vowel tokens. Results from two MEG experiments suggest that auditory cortex is sensitive to formant ratios and that the perceptual system shows heightened sensitivity to tokens located in more densely populated regions of the vowel space. In Chapter 5, I report MEG results that suggest early auditory cortical processing is sensitive to violations of a phonological constraint on sound sequencing, suggesting that listeners make highly specific, knowledge-based predictions about rather abstract anticipated properties of the upcoming speech signal and violations of these predictions are evident in early cortical processing.
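
The abstract does not spell out the proposed formant ratio algorithm, so the following is only a rough sketch of the general idea of using F3 as the normalizing factor. It assumes the simplest possible form, F1/F3 and F2/F3; the function name `f3_ratios` and the formant values are hypothetical. If two talkers differ by a uniform scaling of all formants, these ratios come out identical.

```python
def f3_ratios(f1_hz, f2_hz, f3_hz):
    """Express F1 and F2 relative to F3 (assumed form: plain ratios)."""
    if min(f1_hz, f2_hz, f3_hz) <= 0:
        raise ValueError("formant frequencies must be positive")
    return f1_hz / f3_hz, f2_hz / f3_hz


# Made-up formants for the same vowel from two talkers; talker B's values are
# talker A's scaled by 1.2, a crude stand-in for a shorter vocal tract.
talker_a = (300.0, 2300.0, 3000.0)
talker_b = tuple(1.2 * f for f in talker_a)

ratios_a = f3_ratios(*talker_a)
ratios_b = f3_ratios(*talker_b)
print(ratios_a, ratios_b)
```

Real talker differences are not a pure uniform scaling, so ratios of this kind reduce rather than remove speaker-dependent variation.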

    What information is necessary for speech categorization? Harnessing variability in the speech signal by integrating cues computed relative to expectations

    This is the author's accepted manuscript. This article may not exactly replicate the final version published in the APA journal. It is not the copy of record. The original publication is available at http://psycnet.apa.org/index.cfm?fa=search.displayrecord&uid=2011-05323-001. Most theories of categorization emphasize how continuous perceptual information is mapped to categories. However, equally important are the informational assumptions of a model, the type of information subserving this mapping. This is crucial in speech perception where the signal is variable and context dependent. This study assessed the informational assumptions of several models of speech categorization, in particular, the number of cues that are the basis of categorization and whether these cues represent the input veridically or have undergone compensation. We collected a corpus of 2,880 fricative productions (Jongman, Wayland, & Wong, 2000) spanning many talker and vowel contexts and measured 24 cues for each. A subset was also presented to listeners in an 8AFC phoneme categorization task. We then trained a common classification model based on logistic regression to categorize the fricative from the cue values and manipulated the information in the training set to contrast (a) models based on a small number of invariant cues, (b) models using all cues without compensation, and (c) models in which cues underwent compensation for contextual factors. Compensation was modeled by computing cues relative to expectations (C-CuRE), a new approach to compensation that preserves fine-grained detail in the signal. Only the compensation model achieved a similar accuracy to listeners and showed the same effects of context. Thus, even simple categorization metrics can overcome the variability in speech when sufficient information is available and compensation schemes like C-CuRE are employed.
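
Computing cues relative to expectations can be illustrated with a deliberately simplified sketch. It assumes the expectation for a cue is just the mean value of that cue for the same talker and recodes each token as its deviation from that mean; the article may estimate expectations differently (for example, from several contextual factors at once). The function name `relative_to_expectation` and the cue values are hypothetical.

```python
from collections import defaultdict


def relative_to_expectation(tokens):
    """Recode each (talker, cue_value) token as cue_value minus the talker mean."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for talker, value in tokens:
        sums[talker] += value
        counts[talker] += 1
    # Assumed expectation: the per-talker mean of the cue.
    expected = {talker: sums[talker] / counts[talker] for talker in sums}
    return [value - expected[talker] for talker, value in tokens]


# Made-up spectral-peak values (Hz) for two fricative categories from two talkers.
tokens = [("A", 5000.0), ("A", 7000.0), ("B", 4000.0), ("B", 6000.0)]
residuals = relative_to_expectation(tokens)
print(residuals)
```

In this toy data the raw values overlap across talkers (5000 Hz for talker A and 6000 Hz for talker B belong to different categories), while the residuals separate the two categories by sign.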