52 research outputs found

    Artimate: an articulatory animation framework for audiovisual speech synthesis

    We present a modular framework for articulatory animation synthesis using speech motion capture data obtained with electromagnetic articulography (EMA). Adapting a skeletal animation approach, the articulatory motion data is applied to a three-dimensional (3D) model of the vocal tract, creating a portable resource that can be integrated in an audiovisual (AV) speech synthesis platform to provide realistic animation of the tongue and teeth for a virtual character. The framework also provides an interface to articulatory animation synthesis, as well as an example application to illustrate its use with a 3D game engine. We rely on cross-platform, open-source software and open standards to provide a lightweight, accessible, and portable workflow. Comment: Workshop on Innovation and Applications in Speech Technology (2012).

    Control concepts for articulatory speech synthesis

    We present two concepts for the generation of gestural scores to control an articulatory speech synthesizer. Gestural scores are the common input to the synthesizer and constitute an organized pattern of articulatory gestures. The first concept generates the gestures for an utterance using the phonetic transcriptions, phone durations, and intonation commands predicted by the Bonn Open Synthesis System (BOSS) from an arbitrary input text. This concept extends the synthesizer to a text-to-speech synthesis system. The idea of the second concept is to use timing information extracted from electromagnetic articulography signals to generate the articulatory gestures. It is therefore a concept for the re-synthesis of natural utterances. Finally, application prospects for the presented synthesizer are discussed.

    Producing phrasal prominence in German

    This study examines the relative change in a number of acoustic parameters usually associated with the production of prominences. The production of six German sentences under different question-answer conditions provides de-accented and accented versions of the same words in broad and narrow focus. Normalised energy, F0, duration and spectral measures were found to form a stable hierarchy in their exponency of the three degrees of accentuation.

    The magnetic resonance imaging subset of the mngu0 articulatory corpus

    Author version contains correctly encoded (Unicode) fonts and attached multimedia content. This paper announces the availability of the magnetic resonance imaging (MRI) subset of the mngu0 corpus, a collection of articulatory speech data from one speaker containing different modalities. This subset comprises volumetric MRI scans of the speaker's vocal tract during sustained production of vowels and consonants, as well as dynamic mid-sagittal scans of repetitive consonant-vowel (CV) syllable production. For reference, high-quality acoustic recordings of the speech material are also available. The raw data are made freely available for research purposes.

    Phonetic accommodation to natural and synthetic voices: Behavior of groups and individuals in speech shadowing

    The present study investigates whether native speakers of German phonetically accommodate to natural and synthetic voices in a shadowing experiment. We aim to determine whether this phenomenon, which is frequently found in human-human interaction (HHI), also occurs in human-computer interaction (HCI) involving synthetic speech. The examined features pertain to different phonetic domains: allophonic variation, schwa epenthesis, realization of pitch accents, word-based temporal structure and distribution of spectral energy. On the individual level, we found that the participants converged to varying subsets of the examined features, while they maintained their baseline behavior in other cases or, in rare instances, even diverged from the model voices. This shows that accommodation with respect to one particular feature may not predict the behavior with respect to another feature. On the group level, the participants of the natural condition converged to all features under examination, albeit only very subtly for schwa epenthesis. The synthetic voices, while partly reducing the strength of effects found for the natural voices, triggered accommodating behavior as well. The predominant pattern for all voice types was convergence during the interaction followed by divergence after the interaction.

    Automated Control System for Railway Level Crossing Gates Based on DTMF (Dual Tone Multi Frequency) Signals

    Trains are a popular form of mass transportation. Intercity rail networks, especially on the island of Java, strongly support rail as an effective and efficient mode of transport. Traffic accidents at railway level crossings have occurred frequently of late. They are generally caused by the absence of a crossing gate, by the gate failing to close when needed, or by the operator failing to order its closure (human error). An automated control system for crossing gates is a solution to these problems. A railway crossing gate that opens and closes under control and monitoring from the station is the cheapest method, and a reliable one, for improving the safety and security of road users. The design and construction of an automated railway crossing gate system is therefore highly significant. In this design, an ATmega8535 microcontroller serves as the main control component. A telephone and a DTMF decoder detect the arrival of a train, and an optocoupler sensor detects that the train has passed. A DC servo motor opens and closes the gate as commanded by the microcontroller. The system is broadly divided into two parts, hardware and software. The hardware comprises the control system, consisting of the microcontroller, motor, speaker, siren lamp and sensors, while the software is a program written in BASIC. The work was carried out in three stages: design, construction and testing of the device. Testing was divided into two stages: testing of each part and testing of the system as a whole.
    The result of this design is a railway crossing gate control device that uses a telephone signal from the station to detect the arrival of a train, an ATmega8535 microcontroller to process the input, and a DC servo motor to drive the gate open and closed. The device is expected to minimize accidents at railway level crossings, thereby improving the safety and security of both train passengers and road users.

    Observations on the dynamic control of an articulatory synthesizer using speech production data

    This dissertation explores the automatic generation of gestural score based control structures for a three-dimensional articulatory speech synthesizer. The gestural scores are optimized in an articulatory resynthesis paradigm using a dynamic programming algorithm and a cost function which measures the deviation from a gold standard in the form of natural speech production data. This data had been recorded using electromagnetic articulography, from the same speaker to which the synthesizer's vocal tract model had previously been adapted. Future work to create an English voice for the synthesizer and integrate it into a text-to-speech platform is outlined.