17 research outputs found

    Exploration strategies for articulatory synthesis of complex syllable onsets

    High-quality articulatory speech synthesis has many potential applications in speech science and technology. However, developing appropriate mappings from linguistic specification to articulatory gestures is difficult and time-consuming. In this paper we construct an optimisation-based framework as a first step towards learning these mappings without manual intervention. We demonstrate the production of CCV syllables and discuss the quality of the articulatory gestures with reference to coarticulation.

    Imitating conversational laughter with an articulatory speech synthesizer

    In this study we present initial efforts to model laughter with an articulatory speech synthesizer. We aimed at imitating a real laugh taken from a spontaneous speech database and created several synthetic versions of it using articulatory synthesis and diphone synthesis. In modeling laughter with articulatory synthesis, we also approximated features like breathing noises that do not normally occur in speech. Evaluation with respect to the perceived degree of naturalness indicated that the laugh stimuli would pass as "laughs" in an appropriate conversational context. In isolation, though, significant differences could be measured with regard to the degree of variation (durational patterning, fundamental frequency, intensity) within each laugh.

    Weak biases emerging from vocal tract anatomy shape the repeated transmission of vowels

    Linguistic diversity is affected by multiple factors, but it is usually assumed that variation in the anatomy of our speech organs plays no explanatory role. Here we use realistic computer models of the human speech organs to test whether inter-individual and inter-group variation in the shape of the hard palate (the bony roof of the mouth) affects the acoustics of speech sounds. Based on 107 midsagittal MRI scans of the hard palate of human participants, we modelled with high accuracy the articulation of a set of five cross-linguistically representative vowels by agents learning to produce speech sounds. We found that different hard palate shapes result in subtle differences in the acoustics and articulatory strategies of the produced vowels, and that these individual-level speech idiosyncrasies are amplified by the repeated transmission of language across generations. Therefore, we suggest that, besides culture and environment, quantitative biological variation can be amplified, also influencing language.

    Fast Speech in Unit Selection Speech Synthesis

    Moers-Prinz D. Fast Speech in Unit Selection Speech Synthesis. Bielefeld: Universität Bielefeld; 2020. Speech synthesis is part of the everyday life of many people with severe visual disabilities. For those who are reliant on assistive speech technology the possibility to choose a fast speaking rate is reported to be essential. But also expressive speech synthesis and other spoken language interfaces may require an integration of fast speech. Architectures like formant or diphone synthesis are able to produce synthetic speech at fast speech rates, but the generated speech does not sound very natural. Unit selection synthesis systems, however, are capable of delivering more natural output. Nevertheless, fast speech has not been adequately implemented into such systems to date. Thus, the goal of the work presented here was to determine an optimal strategy for modeling fast speech in unit selection speech synthesis to provide potential users with a more natural sounding alternative for fast speech output.

    Simulating vocal learning of spoken language: Beyond imitation

    Computational approaches have an important role to play in understanding the complex process of speech acquisition in general, and have recently been popular in studies of vocal learning in particular. In this article we suggest that two significant problems associated with imitative vocal learning of spoken language, the speaker normalisation and phonological correspondence problems, can be addressed by linguistically grounded auditory perception. In particular, we show how the articulation of consonant-vowel syllables may be learnt from auditory percepts that can represent either individual utterances by speakers with different vocal tract characteristics or ideal phonetic realisations. The result is an optimisation-based implementation of vocal exploration – incorporating semantic, auditory, and articulatory signals – that can serve as a basis for simulating vocal learning beyond imitation.

    Die Organisation von Konsonantenclustern und CVC-Sequenzen in zwei portugiesischen Varietäten

    How can we explain the fact that misunderstandings arise so frequently between speakers of one and the same language, as in the case of European and Brazilian Portuguese? And why is the relationship asymmetric, i.e. how can speakers of one variety have more difficulty than those of the other? This book analyses such variationist phenomena within the framework of Articulatory Phonology using the tools of modern phonetics, and shows how the interplay of production and perception determines the differing sound patterns of the two varieties and divides the world language Portuguese.

    Let the agents do the talking: On the influence of vocal tract anatomy on speech during ontogeny
