2 research outputs found

A multilinear tongue model derived from speech related MRI data of the human vocal tract

Authors: Alexander Hewer, Stefanie Wuhrer, Ingmar Steiner, Korin Richmond
Publication venue: 'Elsevier BV'
Publication date: 21/02/2018

We present a multilinear statistical model of the human tongue that captures anatomical and tongue-pose-related shape variations separately. The model is derived from 3D magnetic resonance imaging data of 11 speakers sustaining speech-related vocal tract configurations. The tongue shapes are extracted with a minimally supervised method built on an image segmentation approach and a template fitting technique. The method also uses image denoising to deal with possibly corrupt data, palate surface reconstruction to handle contacts between tongue and palate, and a bootstrap strategy to refine the obtained shapes. Our evaluation shows that limiting the degrees of freedom for the anatomical and speech-related variations to 5 and 4, respectively, produces a model that can reliably register unknown data while avoiding overfitting. Furthermore, we show that the model can be used to generate a plausible tongue animation by tracking sparse motion capture data.
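
As an illustration of what a two-factor multilinear shape model looks like, the following is a minimal sketch that assumes a truncated higher-order SVD (a Tucker-style decomposition) of a centred data tensor. The abstract does not spell out the fitting procedure, so this is not necessarily the authors' method. The function names, the tensor layout (coordinates x speakers x poses), and the synthetic data are all hypothetical; only the 11 speakers and the truncation to 5 and 4 degrees of freedom come from the abstract.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def leading_basis(tensor, mode, rank):
    """Leading left singular vectors of the mode-`mode` unfolding."""
    unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :rank]

def fit_multilinear(data, n_anatomy=5, n_pose=4):
    """Fit a two-factor model to data of shape (n_coords, n_speakers, n_poses)."""
    mean = data.mean(axis=(1, 2))            # mean shape over all samples
    centred = data - mean[:, None, None]
    u_spk = leading_basis(centred, 1, n_anatomy)   # anatomy (speaker) subspace
    u_pose = leading_basis(centred, 2, n_pose)     # pose subspace
    # Project the centred data onto both subspaces to obtain the core tensor.
    core = mode_product(mode_product(centred, u_spk.T, 1), u_pose.T, 2)
    return mean, core, u_spk, u_pose

def reconstruct(mean, core, w_anatomy, w_pose):
    """Generate one shape from an anatomy weight vector and a pose weight vector."""
    shape = mode_product(core, w_anatomy[None, :], 1)
    shape = mode_product(shape, w_pose[None, :], 2)
    return mean + shape[:, 0, 0]

# Synthetic stand-in data: 30 coordinates, 11 speakers, 9 poses (pose count is hypothetical).
rng = np.random.default_rng(0)
data = rng.normal(size=(30, 11, 9))
mean, core, u_spk, u_pose = fit_multilinear(data)
shape = reconstruct(mean, core, u_spk[0], u_pose[0])
```

Each row of `u_spk` and `u_pose` holds the weights of one training speaker or pose, so new shapes can be generated by varying the two weight vectors independently, which is the separation of anatomy and pose that the abstract describes.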

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

High spatiotemporal cineMRI films using compressed sensing for acquiring articulatory data

Authors: Benjamin Elie, Yves Laprie, Freddy Odille, Pierre-André Vuissoz
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/08/2016

The paper presents a method to acquire articulatory data from a sequence of MRI images at a high frame rate. The acquisition rate is increased by collecting only part of the data in k-t space. Combining compressed sensing with homodyne reconstruction enables the missing data to be recovered. Good reconstruction quality is ensured by an appropriate design of the sampling pattern. The pattern is based on a pseudo-random Cartesian scheme in which each line is partially acquired, for use with homodyne reconstruction, and the lines themselves are pseudo-randomly sampled: central lines are always acquired, and the sampling density decreases with distance from the center. Application to real speech data shows that the framework enables dynamic sequences of vocal tract images to be recovered at a frame rate higher than 30 frames per second and with a spatial resolution of 1 mm. A method to extract articulatory data from contour identification is also presented. It is ultimately intended to be used for the creation of a large database of articulatory data.
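
The sampling scheme described above can be sketched as a boolean mask over k-t space. This is a rough illustration only: the exponential density profile, the width of the always-acquired central band, the fraction of each line that is read out, and every function and parameter name are assumptions, not values taken from the paper.

```python
import numpy as np

def kt_sampling_mask(n_frames, n_lines, n_readout, center_lines=8,
                     decay=3.0, readout_fraction=0.625, seed=0):
    """Return a boolean mask of shape (n_frames, n_lines, n_readout)."""
    rng = np.random.default_rng(seed)
    centre = (n_lines - 1) / 2.0
    offset = np.abs(np.arange(n_lines) - centre)
    # Assumed density profile: acquisition probability decays with
    # normalised distance from the k-space centre.
    prob = np.exp(-decay * offset / centre)
    # Central band of phase-encode lines is acquired in every frame.
    prob[offset <= center_lines / 2.0] = 1.0
    # Independent pseudo-random draw of lines for each time frame.
    lines = rng.random((n_frames, n_lines)) < prob
    # Each selected line is only partially acquired along the readout
    # direction, leaving the rest to a homodyne-type reconstruction.
    n_kept = int(np.ceil(readout_fraction * n_readout))
    readout = np.arange(n_readout) < n_kept
    return lines[:, :, None] & readout[None, None, :]

mask = kt_sampling_mask(n_frames=10, n_lines=128, n_readout=64)
print(mask.shape, mask.mean())  # mask shape and fraction of k-t samples acquired
```

Drawing a fresh set of lines for each frame keeps the undersampling incoherent along time, which is the property compressed sensing reconstructions generally rely on.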

Crossref

HAL-Inserm

INRIA a CCSD electronic archive server