Data-driven vocal folds models for the representation of both acoustic and high speed video data

Authors: Carlo Drioli, Gian Luca Foresti
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/07/2015

The aim of this paper is to evaluate the effectiveness of a class of data-driven physical models in representing both acoustic and high-speed video data of the voice production process. Voice production analysis through numerical models of the phonation process is nowadays a mature research field, and reliable dynamical glottal models of varying accuracy and complexity are available. Although they are traditionally used to represent the acoustic emission during phonation, the biomechanical nature of the modeling makes them well suited to also represent high-speed video recordings of vocal fold oscillations. We discuss here a data-driven, numerically simulated model of fold motion within an audio-video data analysis context. A model structure is proposed that is based on physical knowledge and data-driven machine learning components. A model inversion algorithm is designed that exploits acoustic data related to the glottal excitation and high-speed video data of the folds to estimate the parameters of the model and to represent the phonation characteristics. It is shown how machine learning techniques can be effectively used in combination with biomechanical modeling to fit the observed data. The method is assessed on data from different subjects uttering sustained vowels.
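
The abstract does not specify the model equations or the inversion algorithm, so the following is only a minimal sketch of the general idea of model inversion: choosing the parameters of a simulated oscillator so that its output matches an observed trace. The single-mass damped oscillator, the grid search, and all names and numeric values (`simulate`, `invert`, the mass, the grids) are illustrative assumptions, not the authors' method, which combines biomechanical modeling with machine learning components and uses both acoustic and video data.

```python
def simulate(stiffness, damping, n_steps=400, dt=1e-4, mass=1e-4, x0=1e-3):
    """Simulate a single-mass damped oscillator with semi-implicit Euler.

    Toy stand-in for a fold displacement trace (assumption, not the
    model used in the paper). Units: N/m, N*s/m, s, kg, m.
    """
    x, v = x0, 0.0
    trace = []
    for _ in range(n_steps):
        a = -(stiffness * x + damping * v) / mass  # Newton's second law
        v += a * dt                                # update velocity first
        x += v * dt                                # then position
        trace.append(x)
    return trace


def squared_error(simulated, observed):
    """Sum of squared differences between two equally sampled traces."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def invert(observed, stiffness_grid, damping_grid):
    """Grid search for the (stiffness, damping) pair that best fits the data."""
    best = None
    for k in stiffness_grid:
        for r in damping_grid:
            err = squared_error(simulate(k, r), observed)
            if best is None or err < best[0]:
                best = (err, k, r)
    return best[1], best[2]


# Synthetic "observation" generated with known parameters (hypothetical values).
observed = simulate(80.0, 0.02)
k_hat, r_hat = invert(observed, [40.0, 60.0, 80.0, 100.0], [0.01, 0.02, 0.04])
print(k_hat, r_hat)
```

Because the synthetic observation is generated by the same simulator, the search recovers the generating parameters exactly; with real recordings the fit would be approximate, and a gradient-based or learned estimator would typically replace the grid search.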

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Udine