578 research outputs found

Models and Analysis of Vocal Emissions for Biomedical Applications

Publication venue: 'Firenze University Press'
Publication date: 31/05/2022

Accurate glottal model parametrization by integrating audio and high-speed endoscopic video data

Author: Drioli Carlo
Foresti Gian Luca
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/01/2014

The aim of this paper is to evaluate the effectiveness of using video data for voice source parametrization in the representation of voice production through physical modeling. Laryngeal imaging techniques can be effectively used to obtain vocal fold video sequences and to derive time patterns of relevant glottal cues, such as fold edge position or glottal area. In many physically based numerical models of the vocal folds, these parameters are estimated from the inverse filtered glottal flow waveform, obtained from audio recordings of the sound pressure at the lips. However, this model inversion process is often problematic and affected by accuracy and robustness issues. The paper discusses how video analysis of the fold vibration might be effectively coupled to the parametric estimation algorithms based on voice recordings, to improve the accuracy and robustness of model inversion

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Udine

Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

Publication venue: 'Firenze University Press'
Publication date: 31/05/2022

The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images as a support to clinical diagnosis and classification of vocal pathologies. The Workshop is sponsored by Ente Cassa Risparmio di Firenze, COST Action 2103, Biomedical Signal Processing and Control Journal (Elsevier Eds.), and IEEE Biomedical Engineering Soc. Special Issues of International Journals have been, and will be, published, collecting selected papers from the conference

Directory of Open Access Books (DOAB)

Models and Analysis of Vocal Emissions for Biomedical Applications

Publication venue: 'Firenze University Press'
Publication date: 31/05/2022

Directory of Open Access Books (DOAB)

Models and Analysis of Vocal Emissions for Biomedical Applications

Publication venue: 'Firenze University Press'
Publication date: 31/05/2022

The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) was founded in 1999 out of a strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and the elderly. Over the years the initial topics have grown and spread to other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy

Directory of Open Access Books (DOAB)

Models and analysis of vocal emissions for biomedical applications

Publication venue: 'Firenze University Press'
Publication date: 31/05/2022

This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies

Directory of Open Access Books (DOAB)

On the use of voice descriptors for glottal source shape parameter estimation

Author: Huber Stefan
Röbel Axel
Publication venue: 'Elsevier BV'
Publication date: 08/10/2013

This paper summarizes the results of our investigations into estimating the shape of the glottal excitation source from speech signals. We employ the Liljencrants-Fant (LF) model describing the glottal flow and its derivative. The one-dimensional glottal source shape parameter Rd describes the transition in voice quality from a tense to a breathy voice. The parameter Rd has been derived from a statistical regression of the R waveshape parameters which parameterize the LF model. First, we introduce a variant of our recently proposed adaptation and range extension of the Rd parameter regression. Secondly, we discuss in detail the aspects of estimating the glottal source shape parameter Rd using the phase minimization paradigm. Based on the analysis of a large number of speech signals we describe the major conditions that are likely to result in erroneous Rd estimates. Based on these findings we investigate means to increase the robustness of the Rd parameter estimation. We use Viterbi smoothing to suppress unnatural jumps of the estimated Rd parameter contours within short time segments. Additionally, we propose to steer the Viterbi algorithm by exploiting the covariation of other voice descriptors to improve Viterbi smoothing. The novel Viterbi steering is based on a Gaussian Mixture Model (GMM) that represents the joint density of the voice descriptors and the Open Quotient (OQ) estimated from corresponding electroglottographic (EGG) signals. A conversion function derived from the mixture model predicts OQ from the voice descriptors. Converted to Rd, it defines an additional prior probability to adapt the partial probabilities of the Viterbi algorithm accordingly.
Finally, we evaluate the performance of the phase minimization based methods using both variants to adapt and extend the Rd regression on one synthetic test set, as well as in combination with Viterbi smoothing and each variant of the novel Viterbi steering on one test set of natural speech. The experimental findings exhibit improvements for both Viterbi approaches

Crossref

HAL Descartes

Hal-Diderot

Estimation of Subglottal Pressure, Vocal Fold Collision Pressure, and Intrinsic Laryngeal Muscle Activation From Neck-Surface Vibration Using a Neural Network Framework and a Voice Production Model

Author: Daryush D. Mehta
Emiro J. Ibarra
Gabriel A. Alzamendi
Jesús A. Parra
Juan P. Cortés
Matías Zañartu
Robert E. Hillman
Víctor M. Espinoza
Publication venue: 'Frontiers Media SA'
Publication date: 01/09/2021

The ambulatory assessment of vocal function can be significantly enhanced by having access to physiologically based features that describe underlying pathophysiological mechanisms in individuals with voice disorders. This type of enhancement can improve methods for the prevention, diagnosis, and treatment of behaviorally based voice disorders. Unfortunately, the direct measurement of important vocal features such as subglottal pressure, vocal fold collision pressure, and laryngeal muscle activation is impractical in laboratory and ambulatory settings. In this study, we introduce a method to estimate these features during phonation from a neck-surface vibration signal through a framework that integrates a physiologically relevant model of voice production and machine learning tools. The signal from a neck-surface accelerometer is first processed using subglottal impedance-based inverse filtering to yield an estimate of the unsteady glottal airflow. Seven aerodynamic and acoustic features are extracted from the neck surface accelerometer and an optional microphone signal. A neural network architecture is selected to provide a mapping between the seven input features and subglottal pressure, vocal fold collision pressure, and cricothyroid and thyroarytenoid muscle activation. This non-linear mapping is trained solely with 13,000 Monte Carlo simulations of a voice production model that utilizes a symmetric triangular body-cover model of the vocal folds. The performance of the method was compared against laboratory data from synchronous recordings of oral airflow, intraoral pressure, microphone, and neck-surface vibration in 79 vocally healthy female participants uttering consecutive /pæ/ syllable strings at comfortable, loud, and soft levels. 
The mean absolute error and root-mean-square error for estimating the mean subglottal pressure were 191 Pa (1.95 cm H2O) and 243 Pa (2.48 cm H2O), respectively, which are comparable with previous studies but with the key advantage of not requiring subject-specific training and yielding more output measures. The validation of vocal fold collision pressure and laryngeal muscle activation was performed with synthetic values as reference. These initial results provide valuable insight for further vocal fold model refinement and constitute a proof of concept that the proposed machine learning method is a feasible option for providing physiologically relevant measures for laboratory and ambulatory assessment of vocal function.

Affiliations: Ibarra, Emiro J. (Universidad Tecnica Federico Santa Maria, Chile); Parra, Jesús A. (Universidad Tecnica Federico Santa Maria, Chile); Alzamendi, Gabriel Alejandro (Universidad Nacional de Entre Ríos, Instituto de Investigación y Desarrollo en Bioingeniería y Bioinformática; Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - Santa Fe, Instituto de Investigación y Desarrollo en Bioingeniería y Bioinformática; Argentina); Cortés, Juan P. (Universidad Tecnica Federico Santa Maria, Chile); Espinoza, Víctor M. (Universidad de Chile, Chile); Mehta, Daryush D. (Center For Laryngeal Surgery And Voice Rehabilitation, United States); Hillman, Robert E. (Center For Laryngeal Surgery And Voice Rehabilitation, United States); Zañartu, Matías (Universidad Tecnica Federico Santa Maria, Chile)
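
The subglottal pressure errors in the abstract above are quoted in both pascals and cm H2O. As a quick check of that arithmetic, the short sketch below converts between the two, assuming the conventional factor 1 cm H2O = 98.0665 Pa; the helper name `pa_to_cm_h2o` is illustrative only.

```python
PA_PER_CM_H2O = 98.0665  # conventional conversion factor (assumed here)


def pa_to_cm_h2o(pressure_pa):
    """Convert a pressure from pascals to centimetres of water."""
    return pressure_pa / PA_PER_CM_H2O


mae_cm = round(pa_to_cm_h2o(191), 2)   # mean absolute error from the abstract
rmse_cm = round(pa_to_cm_h2o(243), 2)  # root-mean-square error from the abstract
print(mae_cm, rmse_cm)
```

Both rounded values agree with the figures given in parentheses in the abstract.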

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

Directory of Open Access Journals

Simulation of vocal fold oscillation with a pseudo-one-mass physical model

Author: Federico Avanzini
Publication venue: 'Elsevier BV'
Publication date: 01/01/2008

This paper presents a novel "pseudo-one-mass model" of the vocal folds, which is derived from a previously proposed two-mass model. Two-mass models account for effects of vertical phase differences in fold motion by means of a pair of coupled oscillators that describe the lower and upper fold portions. Instead, the proposed model employs a single mass-spring oscillator to describe only the oscillation of the lower fold portion, while phase difference effects are simulated through an approximate phenomenological description of the upper glottal area. This approximate description is derived under the hypothesis that 1:1 modal entrainment occurs between the two masses in the large-amplitude oscillation regime, and is then exploited to derive the equations of the pseudo-one-mass model. Numerical simulations of the reference two-mass model are analyzed to show that the proposed approximation remains valid when values of the physical parameters are varied in a large region of the control space. The effects on the shape of the glottal flow pulse are also analyzed. Comparison of simulations with the reference two-mass model and the pseudo-one-mass model shows that the dynamic behavior of the former is accurately approximated by the latter. The similarity of flow signals synthesized with the two models is assessed in terms of four acoustic parameters: fundamental frequency, maximum amplitude, open quotient, and speed quotient. The results confirm that the pseudo-one-mass model fits the behavior of the reference two-mass model with good accuracy, while requiring significantly lower computational resources and roughly half of the mechanical parameters
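
The four acoustic parameters named in the abstract above can be illustrated with a small sketch. It is not the paper's measurement procedure: it assumes that exactly one period of a sampled glottal flow signal is given, that the open phase is a single contiguous run of samples above a threshold, and that the speed quotient is the opening time divided by the closing time. The function name `pulse_parameters` and the sample values are hypothetical.

```python
def pulse_parameters(flow, sample_rate, threshold=0.0):
    """Estimate f0, maximum amplitude, open quotient and speed quotient
    from one period of glottal flow samples (sample-resolution estimates)."""
    n = len(flow)
    open_idx = [i for i, u in enumerate(flow) if u > threshold]
    if not open_idx:
        raise ValueError("the glottis never opens in this period")
    start, end = open_idx[0], open_idx[-1]
    peak = max(open_idx, key=lambda i: flow[i])
    opening = peak - start  # samples from glottal opening to peak flow
    closing = end - peak    # samples from peak flow to glottal closure
    return {
        "f0": sample_rate / n,               # one period spans n samples
        "max_amplitude": flow[peak],
        "open_quotient": len(open_idx) / n,  # open time / period
        "speed_quotient": opening / closing if closing else float("inf"),
    }


# Hypothetical pulse: slow rise, faster fall, then a closed phase.
flow = [0, 1, 2, 3, 4, 5, 6, 4, 2] + [0] * 11
params = pulse_parameters(flow, sample_rate=2000)
print(params)
```

Comparing two synthesized flow signals then reduces to comparing the dictionaries returned for corresponding periods.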

Crossref

AIR Universita degli studi di Milano

Archivio istituzionale della ricerca - Università di Padova