Search CORE

3 research outputs found

Construction and evaluation of an articulatory model of the vocal tract

Authors: Busset Julie, Laprie Yves
Publication venue: HAL CCSD
Publication date: 29/08/2011

Articulatory models of the vocal tract play an important role in investigating the relations between the geometry of the vocal tract and its acoustic properties. This paper presents the construction and evaluation of an articulatory model from a corpus of X-ray and MRI images, which approximates lateral vocal tract shapes of vowels and consonants with very good precision. First, the paper describes the coordinate system used to represent the tongue contour and the strategy employed to find the deformation modes. Then, a speaker adaptation procedure is presented and the adapted model is evaluated on a second database of X-ray images. This evaluation shows that the model approximates tongue shapes with very good precision. Finally, a centerline algorithm, i.e. an algorithm used to decompose the vocal tract into a sequence of elementary tubes, is presented.

INRIA a CCSD electronic archive server

HAL Descartes

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Hal-Diderot

Acoustic-to-articulatory inversion by analysis-by-synthesis using cepstral coefficients

Authors: Busset Julie, Laprie Yves
Publication venue: HAL CCSD
Publication date: 01/01/2013

This paper deals with acoustic-to-articulatory inversion of speech using an analysis-by-synthesis approach. We used old X-ray films of one speaker to (i) develop a linear articulatory model presenting a small geometric mismatch with the subject's midsagittal vocal tract images and (ii) design an adaptation procedure for the cepstral vectors used as input data. The adaptation exploits the bilinear transform to warp the frequency scale in order to compensate for deviations between synthetic and natural speech. This enables the comparison of natural speech against synthetic speech without using cepstral liftering. A codebook is used to represent the forward articulatory-to-acoustic mapping, and we designed a loose matching algorithm using spectral peaks to access it. This algorithm, based on dynamic programming, allows some peaks in either synthetic spectra (stored in the codebook) or natural spectra (to be inverted) to be omitted. Quadratic programming is used to improve the acoustic proximity near each good candidate found during codebook exploration. The inversion has been tested on speech signals corresponding to the X-ray films. It achieves a very good geometric precision of 1.5 mm over the whole tongue shape, unlike similar works that evaluate the error at 3 or 4 points corresponding to sensors located at the front of the tongue.
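
Note on the frequency warping mentioned in the abstract above: the bilinear transform is commonly realized as the phase response of a first-order all-pass filter, which maps a normalized frequency ω to ω + 2·atan(α·sin ω / (1 − α·cos ω)). The following is a minimal sketch of that standard mapping only. The function names and the value of α are illustrative assumptions; the abstract does not say how the authors parameterize or apply the warp to their cepstral vectors.

```python
import math


def warp_frequency(omega, alpha):
    """Warp a normalized angular frequency (radians, 0..pi).

    Uses the phase response of a first-order all-pass filter with
    coefficient alpha, assumed to satisfy |alpha| < 1.
    alpha = 0 leaves the frequency scale unchanged.
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def warp_axis(n_bins, alpha):
    """Warp n_bins equally spaced frequencies covering 0..pi (hypothetical helper)."""
    return [warp_frequency(math.pi * k / (n_bins - 1), alpha)
            for k in range(n_bins)]


if __name__ == "__main__":
    # alpha = 0.3 is an arbitrary illustrative value, not taken from the paper.
    for original, warped in zip(warp_axis(5, 0.0), warp_axis(5, 0.3)):
        print(f"{original:.3f} -> {warped:.3f}")
```

The endpoints 0 and π are fixed by the mapping, and a positive α stretches the low-frequency region while a negative α compresses it, so a single scalar is enough to shift spectral features of synthetic speech toward those of natural speech, or the reverse.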

Crossref

INRIA a CCSD electronic archive server