52 research outputs found
Artimate: an articulatory animation framework for audiovisual speech synthesis
We present a modular framework for articulatory animation synthesis using
speech motion capture data obtained with electromagnetic articulography (EMA).
Adapting a skeletal animation approach, the articulatory motion data is applied
to a three-dimensional (3D) model of the vocal tract, creating a portable
resource that can be integrated in an audiovisual (AV) speech synthesis
platform to provide realistic animation of the tongue and teeth for a virtual
character. The framework also provides an interface to articulatory animation
synthesis, as well as an example application to illustrate its use with a 3D
game engine. We rely on cross-platform, open-source software and open standards
to provide a lightweight, accessible, and portable workflow.
Comment: Workshop on Innovation and Applications in Speech Technology (2012)
Control concepts for articulatory speech synthesis
We present two concepts for the generation of gestural scores to control an articulatory speech synthesizer. Gestural scores are the common input to the synthesizer and constitute an organized pattern of articulatory gestures. The first concept generates the gestures for an utterance using the phonetic transcriptions, phone durations, and intonation commands predicted by the Bonn Open Synthesis System (BOSS) from an arbitrary input text. This concept extends the synthesizer to a text-to-speech synthesis system. The idea of the second concept is to use timing information extracted from Electromagnetic Articulography signals to generate the articulatory gestures. Therefore, it is a concept for the re-synthesis of natural utterances. Finally, application prospects for the presented synthesizer are discussed.
Producing phrasal prominence in German
This study examines the relative change in a number of acoustic parameters usually associated with the production of prominences. The production of six German sentences under different question-answer conditions provides de-accented and accented versions of the same words in broad and narrow focus. Normalised energy, F0, duration and spectral measures were found to form a stable hierarchy in their exponency of the three degrees of accentuation.
The magnetic resonance imaging subset of the mngu0 articulatory corpus
Author version contains correctly encoded (Unicode) fonts and attached multimedia content. This paper announces the availability of the magnetic resonance imaging (MRI) subset of the mngu0 corpus, a collection of articulatory speech data from one speaker containing different modalities. This subset comprises volumetric MRI scans of the speaker's vocal tract during sustained production of vowels and consonants, as well as dynamic mid-sagittal scans of repetitive consonant-vowel (CV) syllable production. For reference, high-quality acoustic recordings of the speech material are also available. The raw data are made freely available for research purposes.
Phonetic accommodation to natural and synthetic voices : Behavior of groups and individuals in speech shadowing
The present study investigates whether native speakers of German phonetically accommodate to natural and synthetic voices in a shadowing experiment. We aim to determine whether this phenomenon, which is frequently found in HHI, also occurs in HCI involving synthetic speech. The examined features pertain to different phonetic domains: allophonic variation, schwa epenthesis, realization of pitch accents, word-based temporal structure and distribution of spectral energy. On the individual level, we found that the participants converged to varying subsets of the examined features, while they maintained their baseline behavior in other cases or, in rare instances, even diverged from the model voices. This shows that accommodation with respect to one particular feature may not predict the behavior with respect to another feature. On the group level, the participants of the natural condition converged to all features under examination, although only very subtly for schwa epenthesis. The synthetic voices, while partly reducing the strength of effects found for the natural voices, triggered accommodating behavior as well. The predominant pattern for all voice types was convergence during the interaction followed by divergence after the interaction.
AUTOMATED RAILWAY LEVEL-CROSSING GATE CONTROL SYSTEM BASED ON DTMF (Dual Tone Multi Frequency) SIGNALS
Trains are a form of mass transportation that is popular with the public. The
intercity rail network, especially on the island of Java, strongly supports the
train as an effective and efficient mode of transportation. Traffic accidents at
railway level crossings have occurred frequently of late. These accidents are
generally caused by the absence of a crossing gate, by the gate failing to close
when needed, or by the operator failing to order its closure (human error). An
automated crossing gate control system is a solution to this problem. A railway
crossing gate that can be opened and closed under control and monitoring from
the station is the cheapest, reliable method for improving the safety and
security of road users. The design and construction of an automated railway
crossing gate system is therefore highly significant. In this design, an
ATmega8535 microcontroller is used as the main control component. A telephone
and a DTMF decoder detect the arrival of a train, and an optocoupler sensor
detects that the train has passed. A DC servo motor opens and closes the gate
according to the microcontroller's commands. The automated crossing gate system
is broadly divided into two parts, hardware and software. The hardware comprises
the control system, consisting of the microcontroller, motor, speaker, siren
lamp, and sensors, while the software is a program written in the BASIC
programming language. The work was carried out in three stages: design,
construction, and testing of the device. Testing was divided into two stages:
testing of the individual parts and testing of the system as a whole. The result
of this design is a crossing gate control device that uses a telephone signal
from the station to detect the arrival of a train, the ATmega8535
microcontroller as the input processor, and a DC servo motor as the actuator
that opens and closes the gate. The device is expected to minimise accidents at
railway level crossings and thereby improve the safety and security of both
train passengers and road users.
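The control flow described in this abstract can be sketched as a small state machine. This is only an illustrative model in Python, not the authors' BASIC firmware for the ATmega8535: the class name `CrossingController`, the method names, and the choice of DTMF digit `"1"` as the "train approaching" code are all assumptions, since the abstract does not specify them.

```python
from enum import Enum


class Gate(Enum):
    OPEN = "open"
    CLOSED = "closed"


class CrossingController:
    """Hypothetical model of the gate logic (names are assumptions)."""

    def __init__(self, approach_digit="1"):
        # approach_digit is an assumed DTMF key; the abstract names none.
        self.approach_digit = approach_digit
        self.gate = Gate.OPEN
        self.siren_on = False

    def on_dtmf(self, digit):
        """Handle a digit reported by the DTMF decoder on the station line."""
        if digit == self.approach_digit and self.gate is Gate.OPEN:
            # Train announced: warn road users and close the gate.
            self.siren_on = True
            self.gate = Gate.CLOSED
        return self.gate

    def on_train_passed(self):
        """Handle the optocoupler event: the train has cleared the crossing."""
        if self.gate is Gate.CLOSED:
            self.gate = Gate.OPEN
            self.siren_on = False
        return self.gate


# Usage: an unrelated digit is ignored, the approach digit closes the gate,
# and the optocoupler event reopens it.
controller = CrossingController()
controller.on_dtmf("5")
controller.on_dtmf("1")
controller.on_train_passed()
```

In a real device the two event handlers would drive the servo motor and siren outputs; here they only update state so that the logic can be checked in isolation.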
Observations on the dynamic control of an articulatory synthesizer using speech production data
This dissertation explores the automatic generation of gestural score based control structures for a three-dimensional articulatory speech synthesizer. The gestural scores are optimized in an articulatory resynthesis paradigm using a dynamic programming algorithm and a cost function which measures the deviation from a gold standard in the form of natural speech production data. This data had been recorded using electromagnetic articulography, from the same speaker to which the synthesizer's vocal tract model had previously been adapted. Future work to create an English voice for the synthesizer and integrate it into a text-to-speech platform is outlined.