    Construction and evaluation of an articulatory model of the vocal tract

    Articulatory models of the vocal tract play an important role in investigating the relations between the geometry of the vocal tract and its acoustic properties. This paper presents the construction and evaluation of an articulatory model built from a corpus of X-ray and MRI images, which approximates lateral vocal tract shapes of vowels and consonants with very good precision. First, the paper describes the coordinate system used to represent the tongue contour and the strategy employed to find the deformation modes. Then, a speaker adaptation procedure is presented and the adapted model is evaluated on a second database of X-ray images. This evaluation shows that the model approximates tongue shapes with very good precision. Finally, a centerline algorithm, i.e. an algorithm used to decompose the vocal tract into a sequence of elementary tubes, is presented.

    Acoustic-to-articulatory inversion by analysis-by-synthesis using cepstral coefficients

    This paper deals with acoustic-to-articulatory inversion of speech using an analysis-by-synthesis approach. We used old X-ray films of one speaker to (i) develop a linear articulatory model presenting a small geometric mismatch with the subject's midsagittal vocal tract images and (ii) design an adaptation procedure for the cepstral vectors used as input data. The adaptation exploits the bilinear transform to warp the frequency scale in order to compensate for deviations between synthetic and natural speech. This enables the comparison of natural speech against synthetic speech without using cepstral liftering. A codebook is used to represent the forward articulatory-to-acoustic mapping, and we designed a loose matching algorithm using spectral peaks to access it. This algorithm, based on dynamic programming, allows some peaks in either the synthetic spectra (stored in the codebook) or the natural spectra (to be inverted) to be omitted. Quadratic programming is used to improve the acoustic proximity near each good candidate found during codebook exploration. The inversion has been tested on speech signals corresponding to the X-ray films. It achieves a very good geometric precision of 1.5 mm over the whole tongue shape, unlike similar works that evaluate the error at 3 or 4 points corresponding to sensors located at the front of the tongue.
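
The loose peak matching described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the cost function, so this example assumes one: the absolute frequency difference (in Hz) for a matched pair of peaks, plus a fixed penalty for each omitted peak. The function name `match_peaks` and the parameter `skip_cost` are hypothetical, not the authors' implementation.

```python
from typing import List, Tuple


def match_peaks(
    synthetic: List[float],
    natural: List[float],
    skip_cost: float = 300.0,  # assumed penalty (Hz) for omitting one peak
) -> Tuple[float, List[Tuple[int, int]]]:
    """Align two ascending lists of spectral peak frequencies.

    Peaks in either list may be left unmatched at a fixed cost.
    Returns the total cost and the matched (synthetic, natural) index pairs.
    """
    n, m = len(synthetic), len(natural)
    inf = float("inf")
    # cost[i][j]: best cost of aligning the first i synthetic peaks
    # with the first j natural peaks.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                # Pair synthetic peak i-1 with natural peak j-1.
                c = cost[i - 1][j - 1] + abs(synthetic[i - 1] - natural[j - 1])
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, "match"
            if i > 0:
                # Omit synthetic peak i-1.
                c = cost[i - 1][j] + skip_cost
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, "skip_synthetic"
            if j > 0:
                # Omit natural peak j-1.
                c = cost[i][j - 1] + skip_cost
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, "skip_natural"

    # Trace back from the end to recover the matched pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 or j > 0:
        step = back[i][j]
        if step == "match":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif step == "skip_synthetic":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs


# Illustrative values only: one synthetic peak has no natural counterpart.
total, pairs = match_peaks([500.0, 1500.0, 2500.0, 3500.0],
                           [520.0, 2450.0, 3600.0])
print(total, pairs)
```

In this sketch, the omission penalty controls how loose the matching is: a small `skip_cost` makes the algorithm drop peaks readily, while a large one forces distant peaks to be paired.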