Ultrasound-Based Silent Speech Interface Built on a Continuous Vocoder
Recently it was shown that within the Silent Speech Interface (SSI) field,
the prediction of F0 is possible from Ultrasound Tongue Images (UTI) as the
articulatory input, using Deep Neural Networks for articulatory-to-acoustic
mapping. Moreover, text-to-speech synthesizers were shown to produce higher
quality speech when using a continuous pitch estimate, which takes non-zero
pitch values even when voicing is not present. Therefore, in this paper on
UTI-based SSI, we use a simple continuous F0 tracker which does not apply a
strict voiced / unvoiced decision. Continuous vocoder parameters (ContF0,
Maximum Voiced Frequency and Mel-Generalized Cepstrum) are predicted using a
convolutional neural network, with UTI as input. The results demonstrate that
during the articulatory-to-acoustic mapping experiments, the continuous F0 is
predicted with lower error, and the continuous vocoder produces slightly more
natural synthesized speech than the baseline vocoder using standard
discontinuous F0.
Comment: 5 pages, 3 figures, accepted for publication at Interspeech 201
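The key idea of the continuous F0 tracker described above is that the pitch contour never drops to zero in unvoiced regions. The following is a minimal numpy sketch of that idea, not the authors' actual tracker: it fills unvoiced frames (marked by F0 = 0) via linear interpolation between neighbouring voiced frames. The function name and the `default_hz` fallback are illustrative assumptions.

```python
import numpy as np

def make_continuous_f0(f0, default_hz=100.0):
    """Fill unvoiced frames (f0 == 0) by linear interpolation between
    neighbouring voiced frames, so the contour stays non-zero everywhere,
    i.e. no strict voiced/unvoiced decision is applied."""
    f0 = np.asarray(f0, dtype=float)
    voiced = f0 > 0
    if not voiced.any():
        # Fully unvoiced input: fall back to a constant contour.
        return np.full(len(f0), default_hz)
    idx = np.arange(len(f0))
    # np.interp also extends the first/last voiced value to the edges.
    return np.interp(idx, idx[voiced], f0[voiced])

contour = make_continuous_f0([0, 120, 0, 0, 130, 0])
```

Here the unvoiced gap between 120 Hz and 130 Hz is bridged smoothly (123.3 Hz, 126.7 Hz), and the edges take the nearest voiced value, so every frame carries a usable pitch target for the vocoder.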
Adaptation of Tacotron2-based Text-To-Speech for Articulatory-to-Acoustic Mapping using Ultrasound Tongue Imaging
For articulatory-to-acoustic mapping, typically only limited parallel
training data is available, making it impossible to apply fully end-to-end
solutions like Tacotron2. In this paper, we experimented with transfer learning
and adaptation of a Tacotron2 text-to-speech model to improve the final
synthesis quality of ultrasound-based articulatory-to-acoustic mapping with a
limited database. We use a multi-speaker pre-trained Tacotron2 TTS model and a
pre-trained WaveGlow neural vocoder. The articulatory-to-acoustic conversion
contains three steps: 1) from a sequence of ultrasound tongue image recordings,
a 3D convolutional neural network predicts the inputs of the pre-trained
Tacotron2 model, 2) the Tacotron2 model converts this intermediate
representation to an 80-dimensional mel-spectrogram, and 3) the WaveGlow model
is applied for final inference. This generated speech contains the timing of
the original articulatory data from the ultrasound recording, but the F0
contour and the spectral information are predicted by the Tacotron2 model. The
F0 values are independent of the original ultrasound images, but represent the
target speaker, as they are inferred from the pre-trained Tacotron2 model. In
our experiments, we demonstrated that the synthesized speech quality is more
natural with the proposed solutions than with our earlier model.
Comment: accepted at SSW11. arXiv admin note: text overlap with arXiv:2008.0315
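The three-step conversion described in this abstract can be sketched as a simple pipeline. The code below is an illustrative shape-level mock-up, not the real models: the 3D CNN, Tacotron2 and WaveGlow are replaced by random linear projections, and all dimensions (frame count, image size, embedding width, 200-sample hop) are assumed for the example.

```python
import numpy as np

# Assumed, illustrative dimensions.
N_FRAMES, H, W = 20, 64, 128    # ultrasound video: frames x height x width
EMB_DIM, MEL_DIM, HOP = 256, 80, 200

def cnn_3d(ultrasound):
    """Stand-in for step 1, the 3D CNN: ultrasound frame sequence ->
    intermediate representation fed to Tacotron2 (a random projection here)."""
    rng = np.random.default_rng(0)
    proj = rng.standard_normal((H * W, EMB_DIM)) / np.sqrt(H * W)
    return ultrasound.reshape(len(ultrasound), -1) @ proj

def tacotron2(embedding):
    """Stand-in for step 2: intermediate representation ->
    80-dimensional mel-spectrogram."""
    rng = np.random.default_rng(1)
    proj = rng.standard_normal((EMB_DIM, MEL_DIM)) / np.sqrt(EMB_DIM)
    return embedding @ proj

def waveglow(mel):
    """Stand-in for step 3, the WaveGlow vocoder: mel-spectrogram ->
    waveform, HOP samples per spectrogram frame."""
    rng = np.random.default_rng(2)
    return rng.standard_normal(len(mel) * HOP)

ultrasound = np.zeros((N_FRAMES, H, W))     # dummy recording
mel = tacotron2(cnn_3d(ultrasound))         # steps 1 and 2
audio = waveglow(mel)                       # step 3
```

The point of the sketch is the data flow: timing (the number of frames) is carried through from the ultrasound input, while the mel-spectrogram content, and hence F0, is produced by the Tacotron2 stage, matching the abstract's description.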
ρan-ρan
"With the peristaltic gurglings of this gastēr-investigative procedural – a soooo welcomed addition to the ballooning corpus of slot-versatile bad eggs The Confraternity of Neoflagellants (CoN) – [users] and #influencers everywhere will be belly-joyed to hold hands with neomedieval mutter-matter that literally sticks and branches, available from punctum in both frictionless and grip-gettable boke-shaped formats.
A game-changer in Brownian temp-controlled phoneme capture, ρan-ρan’s writhing paginations are completely oxygen-soaked, overwriting the flavour profiles of 2013’s thN Lng folk 2go with no-holds-barred argumentations on all voice-like and lung-adjacent functions. Rumoured by experts to be dead to the World™, CoN has clearly turned its ear canal arrays towards the jabbering OMFG feedback signals from their scores of naive listeners, scrapping all lenticular exegesis and content profiles to construct taped-together vernacular dwellings housing ‘shrooming atmospheric awarenesses and pan-dimensional cross-talkers, making this anticipatory sequel a serious competitor across ambient markets, and a crowded kitchen in its own right.
An utterly mondegreen-infested deep end may deter would-be study buddies from taking the plunge, but feet-wetted Dog Heads eager to sniff around for temporal folds and whiff past the stank of hastily proscribed future fogs ought to ©k no further than the roll-upable-rim of ρan-ρan’s bleeeeeding premodern lagoon. Arrange yerself cannonball-wise or lead with the #gut and you’ll be kersplashing in no times.
Play Among Books
How does coding change the way we think about architecture? Miro Roman and his AI Alice_ch3n81 develop a playful scenario in which they propose coding as the new literacy of information. They convey knowledge in the form of a project model that links the fields of architecture and information through two interwoven narrative strands in an “infinite flow” of real books.