    Modeling Durational Incompressibility

    Windmann A, Simko J, Wrede B, Wagner P. Modeling Durational Incompressibility. In: Bimbot F, ed. Speech in Life Sciences and Human Societies. Vol. 2. Red Hook, NY: Curran; 2014: 1375-1379

    Speech Communication

    Contains research objectives, summary of research and reports on three research projects. U. S. Navy - Office of Naval Research (Contract N00014-67-A-0204-0064); U. S. Navy - Office of Naval Research (Contract N00014-67-A-0204-0069); National Science Foundation (Grant GK-31353); National Institutes of Health (Grant 5 RO1 NS04332-10); Joint Services Electronics Programs (U. S. Army, U. S. Navy, and U. S. Air Force) under Contract DAAB07-71-C-0300; Bell Telephone Laboratories Fellowship

    Optimization-based modeling of suprasegmental speech timing

    Windmann A. Optimization-based modeling of suprasegmental speech timing. Bielefeld: Universität Bielefeld; 2016

    The invalidity of rhythm class hypothesis

    Languages are said to be stress-timed, syllable-timed or mora-timed. In a stress-timed language, inter-stress intervals are or tend to be constant, hence, isochronous, while in a syllable-timed or mora-timed language, successive syllables or morae are or tend to be equal in duration. Empirical research has failed to find evidence of isochrony in any language, yet the hypothesis is now sustained by perception accounts or phonetic metrics that do not measure isochrony. We have re-examined the rhythm class hypothesis by looking for evidence of at least a tendency toward isochrony, through a comparison of English, an alleged stress-timed language, and Mandarin, an alleged syllable-timed language. The results show that in English, segments are not compressible to allow equal syllable duration, and syllables are incompressible to enable equal inter-stress interval duration and phrase duration. In contrast, Mandarin shows a small tendency toward both equal syllable duration and equal phrase duration. These findings are exactly the opposite of what would be predicted by the rhythm class hypothesis. We therefore argue that the hypothesis is not just flawed, but simply untenable, and the so-called rhythm classes should no longer be held as a basic fact of human language.

    A Unified Account of Prominence Effects in an Optimization-Based Model of Speech Timing

    Windmann A, Simko J, Wagner P. A Unified Account of Prominence Effects in an Optimization-Based Model of Speech Timing. In: Proceedings of Interspeech 2014. 2014: 159-163

    Probing Theories of Speech Timing using Optimization Modeling

    Windmann A, Simko J, Wagner P. Probing Theories of Speech Timing using Optimization Modeling. In: Proceedings of Speech Prosody 7. Dublin, Ireland; 2014: 346-350. We implement two theories about the temporal organization of speech in an optimization-based model of speech timing and conduct simulation experiments in order to test whether both theories can account for the phenomenon of foot-level shortening (FLS) observed in English speech corpora. Results suggest that a model that induces compensatory timing relations between syllables and feet predicts empirical results very accurately. However, we also observe that the FLS effect can equally well be explained under the assumption that suprasegmental timing is confined to localized lengthening effects at the heads and edges of prosodic domains. Implications for theories of speech timing are discussed.

    Fast Speech in Unit Selection Speech Synthesis

    Moers-Prinz D. Fast Speech in Unit Selection Speech Synthesis. Bielefeld: Universität Bielefeld; 2020. Speech synthesis is part of the everyday life of many people with severe visual disabilities. For those who are reliant on assistive speech technology the possibility to choose a fast speaking rate is reported to be essential. But also expressive speech synthesis and other spoken language interfaces may require an integration of fast speech. Architectures like formant or diphone synthesis are able to produce synthetic speech at fast speech rates, but the generated speech does not sound very natural. Unit selection synthesis systems, however, are capable of delivering more natural output. Nevertheless, fast speech has not been adequately implemented into such systems to date. Thus, the goal of the work presented here was to determine an optimal strategy for modeling fast speech in unit selection speech synthesis to provide potential users with a more natural sounding alternative for fast speech output.

    Interaction between Phrasal Structure and Vowel Tenseness in German: An Acoustic and Articulatory Study

    Phrase-final lengthening affects the segments preceding a prosodic boundary. This prosodic variation is generally assumed to be independent of the phonemic identity. We refer to this as the ‘uniform lengthening hypothesis’ (ULH). However, in German, lax vowels do not undergo lengthening for word stress or shortening for increased speech rate, indicating that temporal properties might interact with phonemic identity. We test the ULH by comparing the effect of the boundary on acoustic and kinematic measures for tense and lax vowels and several coda consonants. We further examine if the boundary effect decreases with distance from the boundary. Ten native speakers of German were recorded by means of electromagnetic articulography (EMA) while reading sentences that contained six minimal pairs varying in vowel tenseness and boundary type. In line with the ULH, the results show that the acoustic durations of lax vowels are lengthened phrase-finally, similarly to tense vowels. We find that acoustic lengthening is stronger the closer the segments are to the boundary. Articulatory parameters of the closing movements toward the post-vocalic consonants are affected by both phrasal position and identity of the preceding vowel. The results are discussed with regard to the interaction between prosodic structure and vowel tenseness. Peer Reviewed

    Communicative function and prosodic form in speech timing

    Listeners can use variation in speech segment duration to interpret the structure of spoken utterances, but there is no systematic description of how speakers manipulate timing for communicative ends. Here I propose a functional approach to prosodic speech timing, with particular reference to English. The disparate findings regarding the production of timing effects are evaluated against the functional requirement that communicative durational variation should be perceivable and interpretable by the listener. In the resulting framework, prosodic structure is held to influence speech timing directly only at the heads and edges of prosodic domains, through large, consistent lengthening effects. As each such effect has a characteristic locus within its domain, speech timing cues are potentially disambiguated for the listener, even in the absence of other information. Diffuse timing effects – in particular, quasi-rhythmical compensatory processes implying a relationship between structure and timing throughout the utterance – are found to be weak and inconsistently observed. Furthermore, it is argued that articulatory and perceptual constraints make shortening processes less useful as structural cues, and they must be regarded as peripheral, at best, in a parsimonious and functionally informed account.