3 research outputs found
Importance of the resonance frequencies of the vocal tract in estimating articulatory positions
Acoustic-to-articulatory inversion, which seeks to estimate the positions of the articulators from the acoustic information
in the speech signal, offers several potential applications in the field of speech processing; however, it remains an unsolved
problem. In this context, it is important to find acoustic representations that can improve the performance of
acoustic-to-articulatory inversion systems. This paper analyzes the importance of formants as inputs to such inversion systems
from an analytical and a statistical perspective. The analytical study is based on an articulatory synthesizer that simulates
the equations of the concatenated tubes that model the vocal tract. The statistical analysis is based on real data from an
electromagnetic articulograph, for which we estimate the statistical association between acoustic features and articulator
movement. As a measure of statistical association, the information measure is used. The results are corroborated on a
neural-network-based acoustic-to-articulatory inversion system. The use of formants as inputs led to improvements of 2.2% and
2.8% in the root-mean-square error and correlation values, respectively.
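The two performance figures quoted above are typically computed per articulator channel by comparing measured and estimated trajectories. The following is a minimal sketch of that computation, not the authors' evaluation code; the function names and the relative-change helper are assumptions introduced here for illustration.

```python
import numpy as np


def rmse(measured, estimated):
    """Root-mean-square error between two trajectories of equal length."""
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean((measured - estimated) ** 2)))


def pearson_correlation(measured, estimated):
    """Pearson correlation coefficient between two trajectories."""
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; take the off-diagonal term.
    return float(np.corrcoef(measured, estimated)[0, 1])


def relative_change_percent(baseline, new):
    """Signed change of `new` relative to `baseline`, in percent (hypothetical helper)."""
    return 100.0 * (new - baseline) / baseline
```

With these helpers, an improvement corresponds to a negative relative change in RMSE and a positive relative change in correlation when a formant-based input is compared against a baseline input.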
Formant Trajectories for Acoustic-to-Articulatory Inversion
This work examines the utility of formant frequencies and their energies in acoustic-to-articulatory inversion. For this purpose, formant frequencies and formant spectral amplitudes are automatically estimated from audio and are treated as observations for estimating electromagnetic articulography (EMA) coil positions. A mixture Gaussian regression model with mel-frequency cepstral coefficient (MFCC) observations is modified by using formants and energies to either replace or augment the MFCC observation vector. The augmented observation results in 3.4% lower RMS error and a 2% higher correlation coefficient than the baseline MFCC observation. The improvement is most pronounced for stop consonants, possibly because formant tracking provides information about the acoustic resonances that would otherwise be unavailable during stop closure and release.
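The "augment" configuration described above amounts to concatenating, frame by frame, the MFCC vector with the formant frequencies and formant amplitudes. The sketch below illustrates only that concatenation step; the function name, the array layout (frames along the first axis), and any feature dimensions used when calling it are assumptions for illustration, and the regression model itself is not reproduced here.

```python
import numpy as np


def augment_observations(mfcc, formant_freqs, formant_amps):
    """Stack per-frame features into one observation matrix.

    Each input has shape (n_frames, n_features); the output has shape
    (n_frames, n_mfcc + n_formant_freqs + n_formant_amps).
    """
    mfcc = np.asarray(mfcc, dtype=float)
    formant_freqs = np.asarray(formant_freqs, dtype=float)
    formant_amps = np.asarray(formant_amps, dtype=float)

    # All feature streams must be aligned to the same frames.
    n_frames = mfcc.shape[0]
    if formant_freqs.shape[0] != n_frames or formant_amps.shape[0] != n_frames:
        raise ValueError("all feature streams must have the same number of frames")

    return np.hstack([mfcc, formant_freqs, formant_amps])
```

The "replace" configuration would instead pass only the formant streams to the regressor, dropping the MFCC columns.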