2 research outputs found
A multilinear tongue model derived from speech related MRI data of the human vocal tract
We present a multilinear statistical model of the human tongue that captures
anatomical and tongue-pose-related shape variations separately. The model is
derived from 3D magnetic resonance imaging data of 11 speakers sustaining
speech-related vocal tract configurations. The tongue shapes are extracted with
a minimally supervised method based on an image segmentation approach
and a template fitting technique. The method also uses image denoising to deal
with possibly corrupt data, reconstruction of palate surface information to
handle contacts between the tongue and the palate, and a bootstrap strategy to
refine the obtained shapes. Our evaluation concludes that limiting the degrees
of freedom for the anatomical and speech-related variations to 5 and 4,
respectively, produces a model that can reliably register unknown data while
avoiding overfitting. Furthermore, we show that the model can be used to
generate a plausible tongue animation by tracking sparse motion capture data.
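To make the idea of separate anatomy and pose parameters concrete, the following is a minimal sketch of a generic multilinear (Tucker-style) shape model. It is not the authors' implementation: the abstract does not state the model formula, so the form "mean plus core tensor contracted with two weight vectors" is an assumption, as are the names `make_toy_model` and `generate_shape`, the tiny mesh size, and the random core tensor. Only the 5 and 4 degrees of freedom come from the abstract.

```python
import random

N_ANATOMY = 5   # anatomical degrees of freedom reported in the abstract
N_POSE = 4      # speech-related (pose) degrees of freedom reported in the abstract
N_COORDS = 6    # hypothetical: 2 vertices x 3 coordinates, kept tiny for illustration


def make_toy_model(seed=0):
    """Build a random stand-in for a learned model (mean shape and core tensor)."""
    rng = random.Random(seed)
    mean = [rng.uniform(-1.0, 1.0) for _ in range(N_COORDS)]
    # core[k][i][j]: effect on coordinate k of anatomy mode i combined with pose mode j
    core = [[[rng.gauss(0.0, 0.1) for _ in range(N_POSE)]
             for _ in range(N_ANATOMY)]
            for _ in range(N_COORDS)]
    return mean, core


def generate_shape(mean, core, anatomy, pose):
    """Return mean + core contracted with the anatomy and pose weight vectors."""
    if len(anatomy) != N_ANATOMY or len(pose) != N_POSE:
        raise ValueError("unexpected parameter length")
    shape = []
    for k in range(len(mean)):
        offset = sum(core[k][i][j] * anatomy[i] * pose[j]
                     for i in range(N_ANATOMY)
                     for j in range(N_POSE))
        shape.append(mean[k] + offset)
    return shape


if __name__ == "__main__":
    mean, core = make_toy_model()
    speaker = [0.5, -0.2, 0.1, 0.0, 0.3]   # fixed anatomy (hypothetical values)
    pose_a = [1.0, 0.0, 0.0, 0.0]          # two different poses of the same speaker
    pose_b = [0.0, 1.0, 0.0, 0.0]
    print(generate_shape(mean, core, speaker, pose_a))
    print(generate_shape(mean, core, speaker, pose_b))
```

In this sketch the shape is linear in the pose weights when the anatomy weights are held fixed, and vice versa, which is what allows the two kinds of variation to be controlled independently.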
High spatiotemporal cineMRI films using compressed sensing for acquiring articulatory data
The paper presents a method to acquire articulatory data from a sequence of MRI images at a high frame rate. The acquisition rate is increased by collecting only part of the data in k-t space. Combining compressed sensing with homodyne reconstruction enables the missing data to be recovered. Good reconstruction quality is ensured by an appropriate design of the sampling pattern, which is based on a pseudo-random Cartesian scheme: each line is partially acquired so that homodyne reconstruction can be used, and the lines themselves are pseudo-randomly sampled, with central lines always acquired and the sampling density decreasing as lines lie farther from the center. Application to real speech data shows that the framework enables dynamic sequences of vocal tract images to be recovered at a frame rate higher than 30 frames per second and with a spatial resolution of 1 mm. A method to extract articulatory data by contour identification is also presented. It is ultimately intended to be used for the creation of a large database of articulatory data.
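The sampling pattern described above can be illustrated with a minimal sketch for a single frame. This is not the paper's actual pattern: the function name `sampling_mask`, the polynomial density law, the number of always-acquired central lines, and the fraction of each line that is read are all assumptions chosen for illustration. The abstract only states that central lines are always acquired, that density decreases away from the center, and that each line is partially acquired.

```python
import random


def sampling_mask(n_lines, n_read, n_center=8, decay=2.0,
                  partial_fraction=0.625, seed=0):
    """Return a boolean mask (n_lines x n_read) of acquired k-space samples.

    All default values are hypothetical, not taken from the paper.
    """
    rng = random.Random(seed)
    center = (n_lines - 1) / 2.0
    half = n_lines / 2.0
    # Partial acquisition of each line, as needed for homodyne reconstruction:
    # only the first n_kept samples along the readout direction are read.
    n_kept = int(round(partial_fraction * n_read))
    mask = []
    for line in range(n_lines):
        dist = abs(line - center)
        if dist < n_center / 2.0:
            acquire = True                      # central lines: always acquired
        else:
            prob = (1.0 - dist / half) ** decay  # density decays away from center
            acquire = rng.random() < prob
        mask.append([acquire and col < n_kept for col in range(n_read)])
    return mask


if __name__ == "__main__":
    mask = sampling_mask(64, 64, seed=1)
    acquired = [i for i, row in enumerate(mask) if row[0]]
    print("acquired lines:", acquired)
    print("fraction of samples read:", sum(map(sum, mask)) / (64 * 64))
```

One plausible way to extend this to a dynamic sequence, again as an assumption, is to call `sampling_mask` with a different `seed` for each time frame so that the pseudo-random lines vary over time.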