81 research outputs found

    An Investigation of Coarticulation Resistance in Speech Production Using Ultrasound

    Sound segments show considerable influence from neighbouring segments, which is described as being the result of coarticulation. None of the previous reports on coarticulation in vowel-consonant-vowel (VCV) sequences has used ultrasound. One advantage of ultrasound is that it provides information about the shape of most of the midsagittal tongue contour. In this work, ultrasound is employed for studying symmetrical VCV sequences, like /ipi/ and /ubu/, and methods for analysing coarticulation are refined. The use of electropalatography (EPG) in combination with ultrasound is piloted in the study. A unified approach is achieved to describing lingual behaviour during the interaction of different speech sounds, by using the concept of Coarticulation Resistance, which implies that different sounds resist coarticulatory influence to different degrees. The following research questions were investigated: how does the tongue shape change from one segment to the next in symmetrical VCV sequences? Do the vowels influence the consonant? Does the consonant influence the vowels? Is the vocalic influence on the consonant greater than the consonantal influence on the vowels? What are the differences between lingual and non-lingual consonants with respect to lingual coarticulation? Does the syllable/word boundary affect the coarticulatory pattern? Ultrasound data were collected using the QMUC ultrasound system, and in the final experiment some EPG data were also collected. The data were Russian nonsense VCVs with /i/, /u/, /a/ and bilabial stops; English nonsense VhV sequences with /i/, /u/, /a/; English /aka/, /ata/ and /iti/ sequences, forming part of real speech. The results show a significant vowel influence on all intervocalic consonants. Lingual consonants significantly influence their neighbouring vowels. The vocalic influence on the consonants is significantly greater than the consonantal influence on the vowels. 
Non-lingual consonants exhibit varying coarticulatory patterns. Syllable and word boundary influence on VCV coarticulation is demonstrated. The results are interpreted and discussed in terms of the Coarticulation Resistance theory: Coarticulation Resistance of speech segments varies, depending on segment type, syllable boundary, and language. A method of quantifying Coarticulation Resistance based on ultrasound data is suggested.

    A syllable-based investigation of coarticulation

    Coarticulation has long been investigated in Speech Sciences and Linguistics (Kühnert & Nolan, 1999). This thesis explores coarticulation through a syllable-based model (Y. Xu, 2020). First, it is hypothesised that consonant and vowel are synchronised at the syllable onset for the sake of reducing temporal degrees of freedom, and such synchronisation is the essence of coarticulation. Previous efforts in the examination of CV alignment mainly report onset asynchrony (Gao, 2009; Shaw & Chen, 2019). The first study of this thesis tested the synchrony hypothesis using articulatory and acoustic data in Mandarin. Departing from conventional approaches, a minimal triplet paradigm was applied, in which the CV onsets were determined through the consonant and vowel minimal pairs, respectively. Both articulatory and acoustic results showed that CV articulation started in close temporal proximity, supporting the synchrony hypothesis. The second study extended the research to English and syllables with cluster onsets. By using acoustic data in conjunction with Deep Learning, supporting evidence was found for co-onset, which is in contrast to the widely reported c-center effect (Byrd, 1995). Secondly, the thesis investigated the mechanism that can maximise synchrony – Dimension Specific Sequential Target Approximation (DSSTA), which is highly relevant to what is commonly known as coarticulation resistance (Recasens & Espinosa, 2009). Evidence from the first two studies shows that, when conflicts arise due to articulation requirements between CV, the CV gestures can be fulfilled by the same articulator on separate dimensions simultaneously. Last but not least, the final study tested the hypothesis that resyllabification is the result of coarticulation asymmetry between onset and coda consonants. 
It was found that neural-network-based models could infer syllable affiliation of consonants, and those inferred resyllabified codas had a coarticulatory structure similar to that of canonical onset consonants. In conclusion, this thesis found that many coarticulation-related phenomena, including local vowel-to-vowel anticipatory coarticulation, coarticulation resistance, and resyllabification, stem from the articulatory mechanism of the syllable.

    Asymmetries in English Vowel Perception Mirror Compression Effects

    A series of vowel-identification experiments using gated consonant stimuli shows that English listeners are capable of recovering the vocalic context in which a consonant appears from information contained in the consonant alone. This is true for most consonants tested, including liquids, nasals, and stops in onset and coda position. Positional asymmetries in vowel sensitivity go in opposite directions for liquids (coda sensitivity > onset) and stops (onset > coda). Nasals pattern with liquids in terms of vowel sensitivity from consonant steady states alone, but pattern more closely with stops when portions outside the steady-state are taken into account. It is argued that these asymmetries are related to patterns of cluster-driven vowel compression (also called ‘compensatory shortening’) in speech production.

    Speech Sound Acquisition, Coarticulation, and Rate Effects in a Neural Network Model of Speech Production

    This article describes a neural network model of speech motor skill acquisition and speech production that explains a wide range of data on contextual variability, motor equivalence, coarticulation, and speaking rate effects. Model parameters are learned during a babbling phase. To explain how infants learn phoneme-specific and language-specific limits on acceptable articulatory variability, the learned speech sound targets take the form of multidimensional convex regions in orosensory coordinates. Reduction of target size for better accuracy during slower speech (in the spirit of the speed-accuracy trade-off described by Fitts' law) leads to differential effects for vowels and consonants, as seen in speaking rate experiments that have been previously taken as evidence for separate control processes for the two sound types. An account of anticipatory coarticulation is posited wherein the target for a speech sound is reduced in size based on context to provide a more efficient sequence of articulator movements. This explanation generalizes the well-known look-ahead model of coarticulation to incorporate convex region targets. Computer simulations verify the model's properties, including linear velocity/distance relationships, motor equivalence, speaking rate effects, and carryover and anticipatory coarticulation. Funding: Air Force Office of Scientific Research (F49620-92-J-0499).
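
    The idea of a convex-region target that shrinks for slower, more accurate speech can be illustrated with a toy sketch. This is not the model from the article: it assumes, purely for illustration, that a target is an axis-aligned box (one simple kind of convex region) in a two-dimensional coordinate space, and the function names `nearest_point_in_box` and `shrink_box` are hypothetical.

```python
def nearest_point_in_box(position, low, high):
    """Return the point of the box [low, high] closest to `position`.

    Assumption: the target is an axis-aligned box, so the nearest point
    is found by clamping each coordinate independently.
    """
    return [min(max(p, lo), hi) for p, lo, hi in zip(position, low, high)]


def shrink_box(low, high, factor):
    """Shrink a box about its centre; factor=1 keeps it, smaller values tighten it.

    Illustrative stand-in for reducing target size at slower speaking rates.
    """
    if not 0 < factor <= 1:
        raise ValueError("factor must be in (0, 1]")
    centres = [(lo + hi) / 2 for lo, hi in zip(low, high)]
    halves = [(hi - lo) / 2 * factor for lo, hi in zip(low, high)]
    new_low = [c - h for c, h in zip(centres, halves)]
    new_high = [c + h for c, h in zip(centres, halves)]
    return new_low, new_high


# Hypothetical target region and current articulator position.
low, high = [0.2, 0.0], [0.8, 1.0]
start = [0.0, 0.5]

# Full-size target: the articulator only needs to reach the region's edge.
fast_goal = nearest_point_in_box(start, low, high)

# Shrunken target: the same start position now requires a longer movement.
slow_low, slow_high = shrink_box(low, high, 0.5)
slow_goal = nearest_point_in_box(start, slow_low, slow_high)
```

    In this toy setting a smaller region forces the end point further inside the original target, which is the sense in which a reduced target trades movement economy for precision.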

    Deep learning assessment of syllable affiliation of intervocalic consonants

    In English, a sentence like “He made out our intentions.” could be misperceived as “He may doubt our intentions.” because the coda /d/ sounds like it has become the onset of the next syllable. The nature and occurrence conditions of this resyllabification phenomenon are unclear, however. Previous empirical studies mainly relied on listener judgment or on limited acoustic evidence, such as voice onset time or average formant values, to determine the occurrence of resyllabification. This study tested the hypothesis that resyllabification is a coarticulatory reorganisation that realigns the coda consonant with the vowel of the next syllable. Deep learning in conjunction with dynamic time warping (DTW) was used to assess syllable affiliation of intervocalic consonants. The results suggest that convolutional neural network- and recurrent neural network-based models can detect cases of resyllabification using Mel-frequency spectrograms. DTW analysis shows that neural network inferred resyllabified sequences are acoustically more similar to their onset counterparts than their canonical productions. A binary classifier further suggests that, similar to the genuine onsets, the inferred resyllabified coda consonants are coarticulated with the following vowel. These results are interpreted with an account of resyllabification as a speech-rate-dependent coarticulatory reorganisation mechanism in speech.
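
    The DTW comparison mentioned in this abstract can be sketched with the textbook dynamic-programming algorithm. This is a minimal illustration, not the study's pipeline: the frame values below are made-up toy numbers rather than Mel-frequency spectrogram frames, and `dtw_distance` is a hypothetical helper name.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    Each sequence is a list of equal-length frames (lists of floats).
    Local cost is the Euclidean distance between frames; allowed steps
    are match, insertion, and deletion.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b: a advances alone
                cost[i][j - 1],      # stretch a: b advances alone
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# Toy one-dimensional "frames" (assumed values, for illustration only).
inferred = [[0.0], [1.0], [2.0], [2.0]]  # stands in for an inferred resyllabified token
onset = [[0.0], [1.0], [2.0]]            # stands in for an onset counterpart
canonical = [[2.0], [1.0], [0.0]]        # stands in for a canonical coda production

d_onset = dtw_distance(inferred, onset)
d_canonical = dtw_distance(inferred, canonical)
```

    With these toy sequences `d_onset` is smaller than `d_canonical`, which mirrors the kind of comparison the abstract describes: a lower DTW cost is read as greater acoustic similarity.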

    Emergent consonantal quantity contrast and context-dependence of gestural phasing

    Embodied Task Dynamics is a modeling platform combining task dynamical implementation of articulatory phonology with an optimization approach based on adjustable trade-offs between production efficiency and perception efficacy. Within this platform we model a consonantal quantity contrast in bilabial stops as emerging from local adjustment of demands on relative prominence of the consonantal gesture conceptualized in terms of closure duration. The contrast is manifested in the form of two distinct, stable inter-gestural coordination patterns characterized by quantitative differences in relative phasing between the consonant and the coproduced vocalic gesture. Furthermore, the model generates a set of qualitative predictions regarding dependence of kinematic characteristics and inter-gestural coordination on consonant quantity and gestural context. To evaluate these predictions, we collected articulatory data for Finnish speakers uttering singletons and geminates in the same context as explored by the model. Statistical analysis of the data shows strong agreement with model predictions. This result provides support for the hypothesis that speech articulation is guided by efficiency principles that underlie many other types of embodied skilled action.

    The effect of coarticulatory resistance and aerodynamic requirements of consonants on syllable organization in Polish


    CONTROL AND BIOMECHANICS IN COARTICULATION: INSIGHTS FROM AN ULTRASOUND STUDY OF STANDARD MANDARIN APICAL VOWELS

    This study investigated the extent to which speaker-induced control and biomechanics play a role in determining the outcome of spatial coarticulation. Employing ultrasound tongue imaging, coarticulatory effects from and induced on adjacent consonants were quantified as measures of coarticulatory resistance and aggressiveness for the two apical vowels of Standard Mandarin in comparison to the three corner vowels. The results show that the two apical vowels are much less resistant to coarticulatory effects than the vowels [i a u], and they often do not induce larger effects on adjacent consonants than these vowels, due to speaker-targeted effects. It was also found that the retroflex apical vowel was consistently more resistant and aggressive than the dental apical vowel, due to biomechanical differences. Together, both of these findings implicate the roles of speaker control and biomechanics in coarticulation and highlight the need for a model of coarticulation to include both of these factors.

    Spatial and temporal lingual coarticulation and motor control in preadolescents

    Purpose: Coarticulation and lingual kinematics were compared in preadolescents and adults, in order to establish whether preadolescents had a greater degree of random variability in tongue posture and whether their patterns of lingual coarticulation differed from those of adults. Method: High-speed ultrasound tongue contour data synchronised with the acoustic signal were recorded from 15 children aged between 10 and 12 years old, and 15 adults. Tongue shape contours were analysed at nine normalised time-points during the fricative phase of schwa-fricative-/a/ and schwa-fricative-/i/ sequences with the consonants /s/ and /ʃ/. Results: There was no significant age-related difference in random variability. Where a significant vowel effect occurred, the amount of coarticulation was similar in the two groups. However, the onset of the coarticulatory effect on preadolescent /ʃ/ was significantly later than on preadolescent /s/, and also later than on adult /s/ and /ʃ/. Conclusions: Preadolescents have adult-like precision of tongue control and adult-like anticipatory lingual coarticulation with respect to spatial characteristics of tongue posture. However, there remains some immaturity in the motor programming of certain complex tongue movements.

    On the role of articulatory prosodies in German message decoding

    A theoretical framework for speech reduction is outlined in which 'coarticulation' and 'articulatory control' operate on sequences of 'opening-closing gestures' in linguistic and communicative settings, leading to suprasegmental properties - 'articulatory prosodies' - in the acoustic output. In linking this gestalt perspective in speech production to the role of phonetic detail in speech understanding, this paper reports on perception experiments that test listeners' reactions to varying extension of an 'articulatory prosody of palatality' in message identification. The point of departure for the experimental design was the German utterance ich kann Ihnen das ja mal sagen 'I can mention this to you' from the Kiel Corpus of Spontaneous Speech, which contains the palatalized stretch [k̟(h)ε̈n(j)n(j)əs] for the sequence of function words /kan i.n(kə)n das/ kann Ihnen das. The utterance also makes sense without the personal pronoun Ihnen. Systematic experimental variation has shown that the extent of palatality has a highly significant influence on the decoding of Ihnen and that the effect of nasal consonant duration depends on the extension of palatality. These results are discussed in a plea to base future speech perception research on a paradigm that makes the traditional segment-prosody divide more permeable, and moves away from the generally practised phoneme orientation.