
    Speaking Rate Effects on Locus Equation Slope

    A locus equation describes a first-order regression fit to a scatter of vowel steady-state frequency values predicting vowel onset frequency values. Locus equation coefficients are often interpreted as indices of coarticulation. Speaking rate variations with a constant consonant–vowel form are thought to induce changes in the degree of coarticulation. In the current work, the hypothesis that locus slope is a transparent index of coarticulation is examined through the analysis of acoustic samples of large-scale, nearly continuous variations in speaking rate. Following the methodological conventions for locus equation derivation, data pooled across ten vowels yield locus equation slopes that are mostly consistent with the hypothesis that locus equations vary systematically with coarticulation. Comparable analyses between different four-vowel pools reveal variations in the locus slope range and changes in locus slope sensitivity to rate change. Analyses across rate but within vowels are substantially less consistent with the locus hypothesis. Taken together, these findings suggest that the practice of vowel pooling exerts a non-negligible influence on locus outcomes. Results are discussed within the context of articulatory accounts of locus equations and the effects of speaking rate change.
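The regression described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the frequency values are second-formant (F2) measurements in Hz, which is the usual convention but is not stated in the abstract, and the function name `fit_locus_equation` and the sample values are hypothetical.

```python
from statistics import mean


def fit_locus_equation(f2_vowel, f2_onset):
    """Ordinary least-squares fit of f2_onset = slope * f2_vowel + intercept.

    f2_vowel: vowel steady-state frequencies (predictor), assumed F2 in Hz.
    f2_onset: vowel onset frequencies (response), assumed F2 in Hz.
    Returns (slope, intercept).
    """
    if len(f2_vowel) != len(f2_onset) or len(f2_vowel) < 2:
        raise ValueError("need two equally long sequences with at least 2 tokens")
    mean_x, mean_y = mean(f2_vowel), mean(f2_onset)
    # Sum of squared deviations of the predictor.
    sxx = sum((x - mean_x) ** 2 for x in f2_vowel)
    if sxx == 0:
        raise ValueError("steady-state values must not all be identical")
    # Sum of cross-products of predictor and response deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(f2_vowel, f2_onset))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up tokens for one consonant pooled across vowels (illustrative only).
vowel_hz = [1000.0, 1500.0, 2000.0, 2500.0]
onset_hz = [1400.0, 1650.0, 1900.0, 2150.0]
slope, intercept = fit_locus_equation(vowel_hz, onset_hz)
```

Under the interpretation the abstract examines, a slope nearer 1 would be read as more coarticulation and a slope nearer 0 as less. Fitting the function separately to pooled-vowel and within-vowel subsets is one way to see how pooling changes the outcome.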

    Assessing the adequate treatment of fast speech in unit selection systems for the visually impaired

    Moers D, Wagner P. Assessing the adequate treatment of fast speech in unit selection systems for the visually impaired. In: Proceedings of the 6th ISCA Tutorial and Research Workshop on Speech Synthesis (SSW-6). 2007: 282-287. This paper describes work in progress concerning the adequate modeling of fast speech in unit selection speech synthesis systems – mostly having in mind blind and visually impaired users. Initially, a survey of the main phonetic characteristics of fast speech will be given. From this, certain conclusions concerning an adequate modeling of fast speech in unit selection synthesis will be drawn. Subsequently, a questionnaire assessing synthetic speech related preferences of visually impaired users will be presented. The last section deals with future experiments aiming at a definition of criteria for the development of synthesis corpora modeling fast speech within the unit selection paradigm.

    Combining research methods for an experimental study of West Central Bavarian vowels in adults and children

    The overall goal of this thesis was to systematically measure defining vowel characteristics of the West Central Bavarian (WCB) dialect for an acoustically based analysis of the Bavarian vowel system and, simultaneously, to investigate to what extent these characteristics are being preserved across generations and whether a sound change in progress is observable in which young speakers show more characteristics of Standard German (SG) than old speakers on some Bavarian vowel attributes. To address these aims, we made acoustic recordings of WCB-speaking adults and WCB-speaking primary school children, which were then compared with each other in an apparent-time analysis. For a more accurate view of changes in progress, we combined this apparent-time comparison with longitudinal data from the WCB children, obtained at annual intervals over three years. The acoustic data were supplemented by articulatory data from ultrasound recordings of a subset of the same WCB-speaking children at two timepoints one year apart. Analyses of the acoustic data revealed both adult/child and longitudinal changes in the direction of the standard in the children’s tendency towards a merger of two open vowels and a collapse of a long/short consonant contrast, neither of which exists in SG. There was some evidence that children, in comparison with adults, were beginning to develop both tensity and rounding contrasts, which occur in SG but not in WCB. There were no observed changes to the pattern of opening and closing diphthongs, which differ markedly between the two varieties. Likewise, no changes were observed in the WCB front vowel that resulted historically from /l/-vocalization, for which articulatory data from a subset of the children were related to the acoustic measures.
The general conclusion is that WCB change is most likely to occur as a consequence of exaggerating phonetic variation that already happens to be in the direction of the standard, and therefore internal factors motivated by general principles of vowel change might play a more decisive role in inducing a shift than external factors like dialect contact.

    Optimization-based modeling of suprasegmental speech timing

    Windmann A. Optimization-based modeling of suprasegmental speech timing. Bielefeld: Universität Bielefeld; 2016

    Cross-modal reduction: Repetition of words and gestures

    This dissertation examines speakers’ production of speech and representational gesture. It utilizes the Repetition Effect as the investigative tool. The Repetition Effect appears to vary by the tendency for some items to shorten when repeated, at least under the condition that speakers can primarily operate by their assumption of the state of knowledge of the listener. In speech, a highly conventionalized form of performance, word duration reduces within the same stretch of coherent discourse; then, it resets in the first mention of a new stretch of coherent discourse regardless of the state of knowledge of the speaker or the listener. Therefore, the Repetition Effect in speech is best analyzed as an automatic behavior triggered by discourse structure, rather than reflecting online changes in word accessibility for either interlocutor, be it for the speaker (Listener-neutral explanation) or for the listener (Listener-modeling explanation). The Repetition Effect in speech production in this dissertation will be accounted for within an exemplar model of the perception/production loop. However, in representational gestures, a much less conventionalized form of performance compared to speech, the Repetition Effect shows a different pattern. When speakers only operate by their assumption of the state of knowledge of the listener, without dynamic, appreciable listener feedback, they steadily reduce most types of representational gesture across tellings. Based on these results, it can be argued that representational gestures primarily serve as a part of speech production, rather than as communicative acts. That is, they are produced without regard to the novelty of the information to the listener and are thus consistent with the Listener-neutral explanation.

    Dimensions of Segmental Variability: Interaction of Prosody and Surprisal in Six Languages

    Contextual predictability variation affects phonological and phonetic structure. Reduction and expansion of acoustic-phonetic features are also characteristic of prosodic variability. In this study, we assess the impact of surprisal and prosodic structure on phonetic encoding, both independently of each other and in interaction. We model segmental duration, vowel space size and spectral characteristics of vowels and consonants as a function of surprisal as well as of syllable prominence, phrase boundary, and speech rate. Correlates of phonetic encoding density are extracted from a subset of the BonnTempo corpus for six languages: American English, Czech, Finnish, French, German, and Polish. Surprisal is estimated from segmental n-gram language models trained on large text corpora. Our findings are generally compatible with a weak version of Aylett and Turk's Smooth Signal Redundancy hypothesis, suggesting that prosodic structure mediates between the requirements of efficient communication and the speech signal. However, this mediation is not perfect, as we found evidence for additional, direct effects of changes in surprisal on the phonetic structure of utterances. These effects appear to be stable across different speech rates.
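As a rough illustration of how surprisal can be estimated from a segmental n-gram model, the sketch below uses a bigram model with add-one smoothing. Both choices are assumptions made for brevity: the abstract does not state the n-gram order or the smoothing method used in the study, and the function names and toy training strings are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sequences):
    """Count segment bigrams over an iterable of segment sequences."""
    context_counts = Counter()  # how often each segment occurs as a left context
    bigram_counts = Counter()   # how often each (previous, current) pair occurs
    vocab = set()               # inventory of segments seen in training
    for seq in sequences:
        segments = list(seq)
        vocab.update(segments)
        padded = ["<s>"] + segments  # "<s>" marks the sequence start
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def surprisal(prev, cur, model):
    """Surprisal in bits, -log2 P(cur | prev), with add-one smoothing (an assumption)."""
    context_counts, bigram_counts, vocab = model
    prob = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + len(vocab))
    return -math.log2(prob)


# Toy training data: each string stands for one sequence of segments.
model = train_bigram(["ab", "ab", "ac"])
bits_frequent = surprisal("a", "b", model)  # "b" follows "a" twice in training
bits_rare = surprisal("a", "c", model)      # "c" follows "a" once in training
```

In this toy model the more frequent continuation receives the lower surprisal. Per-segment values of this kind could then enter a regression alongside prosodic predictors such as prominence, phrase boundary, and speech rate.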

    Fast Speech in Unit Selection Speech Synthesis

    Moers-Prinz D. Fast Speech in Unit Selection Speech Synthesis. Bielefeld: Universität Bielefeld; 2020. Speech synthesis is part of the everyday life of many people with severe visual disabilities. For those who are reliant on assistive speech technology the possibility to choose a fast speaking rate is reported to be essential. But also expressive speech synthesis and other spoken language interfaces may require an integration of fast speech. Architectures like formant or diphone synthesis are able to produce synthetic speech at fast speech rates, but the generated speech does not sound very natural. Unit selection synthesis systems, however, are capable of delivering more natural output. Nevertheless, fast speech has not been adequately implemented into such systems to date. Thus, the goal of the work presented here was to determine an optimal strategy for modeling fast speech in unit selection speech synthesis to provide potential users with a more natural sounding alternative for fast speech output.