3 research outputs found

    Importance of the resonance frequencies of the vocal tract in estimating articulatory positions

    Acoustic-to-articulatory inversion, which seeks to estimate the positions of the articulators from the acoustic information in the speech signal, offers a variety of potential applications in speech processing; however, it remains an unsolved problem. In this context, it is important to find acoustic representations that can increase the performance of acoustic-to-articulatory inversion systems. This paper analyzes the importance of formants as inputs to such inversion systems from an analytical and a statistical perspective. The analytical study is based on an articulatory synthesizer that simulates the equation of concatenated tubes modeling the vocal tract. The statistical analysis is based on real data from an electromagnetic articulograph, for which we estimate the statistical association between acoustic features and articulator movements. The information measure is used as the measure of statistical association. The results are corroborated on a neural-network-based acoustic-to-articulatory inversion system. Using formants as inputs led to improvements of 2.2% and 2.8% in the root-mean-square error and correlation values, respectively.
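
    The two performance measures named in this abstract, root-mean-square error and correlation, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, each trajectory is assumed to be a plain list of positions for one articulator channel, and the abstract does not state whether its percentages are relative changes against a baseline, so `relative_change_percent` is only one possible reading.

```python
import math


def rmse(true, pred):
    """Root-mean-square error between two equal-length trajectories."""
    squared = [(t - p) ** 2 for t, p in zip(true, pred)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(true, pred):
    """Pearson correlation coefficient between two equal-length trajectories."""
    n = len(true)
    mean_t = sum(true) / n
    mean_p = sum(pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(true, pred))
    var_t = sum((t - mean_t) ** 2 for t in true)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_t * var_p)


def relative_change_percent(baseline, new):
    """Relative change of `new` against `baseline`, in percent (assumed reading)."""
    return 100.0 * (new - baseline) / baseline
```

    Under this reading, a lower RMSE gives a negative relative change and a higher correlation gives a positive one.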

    Formant Trajectories for Acoustic-to-Articulatory Inversion

    This work examines the utility of formant frequencies and their energies in acoustic-to-articulatory inversion. For this purpose, formant frequencies and formant spectral amplitudes are automatically estimated from audio and treated as observations for estimating electromagnetic articulography (EMA) coil positions. A mixture Gaussian regression model with mel-frequency cepstral coefficient (MFCC) observations is modified by using formants and energies to either replace or augment the MFCC observation vector. The augmented observation results in 3.4% lower RMS error and 2% higher correlation coefficient than the baseline MFCC observation. The improvement is especially pronounced for stop consonants, possibly because formant tracking provides information about the acoustic resonances that would otherwise be unavailable during stop closure and release.
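
    The "replace or augment" choice for the observation vector can be illustrated with a short sketch. The function and parameter names are hypothetical, and the ordering of features within the vector is an assumption; the abstract does not describe how the features are arranged or normalized.

```python
def build_observation(mfcc, formant_freqs, formant_amps, mode="augment"):
    """Build one frame's observation vector (hypothetical layout).

    mode="replace": use only formant frequencies and amplitudes.
    mode="augment": append the formant features to the MFCC vector.
    """
    formant_part = list(formant_freqs) + list(formant_amps)
    if mode == "replace":
        return formant_part
    if mode == "augment":
        return list(mfcc) + formant_part
    raise ValueError(f"unknown mode: {mode!r}")
```

    For example, with 13 MFCCs and three formants per frame, the augmented vector would have 19 dimensions and the replacement vector 6.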