578 research outputs found

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held on a biennial basis, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies.

    Accurate glottal model parametrization by integrating audio and high-speed endoscopic video data

    The aim of this paper is to evaluate the effectiveness of using video data for voice source parametrization in the representation of voice production through physical modeling. Laryngeal imaging techniques can be used to obtain vocal fold video sequences and to derive time patterns of relevant glottal cues, such as fold edge position or glottal area. In many physically based numerical models of the vocal folds, these parameters are estimated from the inverse filtered glottal flow waveform, obtained from audio recordings of the sound pressure at the lips. However, this model inversion process is often problematic and suffers from accuracy and robustness issues. The paper discusses how video analysis of the fold vibration might be coupled to the parametric estimation algorithms based on voice recordings, to improve the accuracy and robustness of model inversion.

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The proceedings of the MAVEBA Workshop, which is held on a biennial basis, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies. The Workshop is sponsored by Ente Cassa Risparmio di Firenze, COST Action 2103, Biomedical Signal Processing and Control Journal (Elsevier Eds.), and IEEE Biomedical Engineering Soc. Special issues of international journals collecting selected papers from the conference have been, and will be, published.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held on a biennial basis, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 out of a strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and the elderly. Over the years the initial topics have grown and spread into other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 out of a strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and the elderly. Over the years the initial topics have grown and spread into other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis.

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies

    On the use of voice descriptors for glottal source shape parameter estimation

    This paper summarizes the results of our investigations into estimating the shape of the glottal excitation source from speech signals. We employ the Liljencrants-Fant (LF) model describing the glottal flow and its derivative. The one-dimensional glottal source shape parameter Rd describes the transition in voice quality from a tense to a breathy voice. The parameter Rd has been derived from a statistical regression of the R waveshape parameters which parameterize the LF model. First, we introduce a variant of our recently proposed adaptation and range extension of the Rd parameter regression. Second, we discuss in detail the aspects of estimating the glottal source shape parameter Rd using the phase minimization paradigm. Based on the analysis of a large number of speech signals, we describe the major conditions that are likely to result in erroneous Rd estimates. Based on these findings, we investigate means of increasing the robustness of the Rd parameter estimation. We use Viterbi smoothing to suppress unnatural jumps of the estimated Rd parameter contours within short time segments. Additionally, we propose to steer the Viterbi algorithm by exploiting the covariation of other voice descriptors to improve Viterbi smoothing. The novel Viterbi steering is based on a Gaussian Mixture Model (GMM) that represents the joint density of the voice descriptors and the Open Quotient (OQ) estimated from corresponding electroglottographic (EGG) signals. A conversion function derived from the mixture model predicts OQ from the voice descriptors. Converted to Rd, this prediction defines an additional prior probability that is used to adapt the partial probabilities of the Viterbi algorithm. Finally, we evaluate the performance of the phase minimization based methods, using both variants to adapt and extend the Rd regression, on one synthetic test set, as well as in combination with Viterbi smoothing and each variant of the novel Viterbi steering on one test set of natural speech. The experimental findings exhibit improvements for both Viterbi approaches.
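The Viterbi smoothing idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each frame offers a few candidate Rd values with a local cost (lower is better) and penalizes jumps between consecutive frames with an absolute-difference term. The function name `viterbi_smooth`, the `jump_weight` parameter, and the cost definitions are hypothetical choices for illustration; the GMM-based steering described in the abstract is not modeled here.

```python
def viterbi_smooth(candidates, costs, jump_weight=1.0):
    """Pick one candidate per frame, minimizing local cost plus jump penalty.

    candidates[t] : list of candidate Rd values for frame t
    costs[t]      : matching list of local costs (lower is better)
    jump_weight   : hypothetical weight on |Rd[t] - Rd[t-1]|
    """
    n = len(candidates)
    # best[t][j] = minimal accumulated cost of a path ending in candidate j at frame t
    best = [list(costs[0])]
    back = [[-1] * len(candidates[0])]
    for t in range(1, n):
        row, ptr = [], []
        for j, value in enumerate(candidates[t]):
            # Accumulated cost of reaching this candidate from each predecessor
            options = [
                best[t - 1][i] + jump_weight * abs(value - prev)
                for i, prev in enumerate(candidates[t - 1])
            ]
            i_min = min(range(len(options)), key=options.__getitem__)
            row.append(options[i_min] + costs[t][j])
            ptr.append(i_min)
        best.append(row)
        back.append(ptr)
    # Backtrack from the cheapest final candidate
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = []
    for t in range(n - 1, -1, -1):
        path.append(candidates[t][j])
        j = back[t][j]
    path.reverse()
    return path


# Made-up example: frame 1 locally prefers 2.6, which would be an unnatural jump.
cands = [[1.0, 2.5], [1.1, 2.6], [1.2, 2.4]]
local = [[0.1, 0.9], [0.5, 0.4], [0.1, 0.8]]
smoothed = viterbi_smooth(cands, local, jump_weight=1.0)
unsmoothed = viterbi_smooth(cands, local, jump_weight=0.0)
```

With the jump penalty switched off, the path simply follows the per-frame minimum cost; with it switched on, the isolated jump in the middle frame is suppressed.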

    Estimation of Subglottal Pressure, Vocal Fold Collision Pressure, and Intrinsic Laryngeal Muscle Activation From Neck-Surface Vibration Using a Neural Network Framework and a Voice Production Model

    The ambulatory assessment of vocal function can be significantly enhanced by having access to physiologically based features that describe underlying pathophysiological mechanisms in individuals with voice disorders. This type of enhancement can improve methods for the prevention, diagnosis, and treatment of behaviorally based voice disorders. Unfortunately, the direct measurement of important vocal features such as subglottal pressure, vocal fold collision pressure, and laryngeal muscle activation is impractical in laboratory and ambulatory settings. In this study, we introduce a method to estimate these features during phonation from a neck-surface vibration signal through a framework that integrates a physiologically relevant model of voice production and machine learning tools. The signal from a neck-surface accelerometer is first processed using subglottal impedance-based inverse filtering to yield an estimate of the unsteady glottal airflow. Seven aerodynamic and acoustic features are extracted from the neck-surface accelerometer and an optional microphone signal. A neural network architecture is selected to provide a mapping between the seven input features and subglottal pressure, vocal fold collision pressure, and cricothyroid and thyroarytenoid muscle activation. This non-linear mapping is trained solely with 13,000 Monte Carlo simulations of a voice production model that utilizes a symmetric triangular body-cover model of the vocal folds. The performance of the method was compared against laboratory data from synchronous recordings of oral airflow, intraoral pressure, microphone, and neck-surface vibration in 79 vocally healthy female participants uttering consecutive /pæ/ syllable strings at comfortable, loud, and soft levels. The mean absolute error and root-mean-square error for estimating the mean subglottal pressure were 191 Pa (1.95 cm H2O) and 243 Pa (2.48 cm H2O), respectively, which are comparable with previous studies but with the key advantage of not requiring subject-specific training and yielding more output measures. The validation of vocal fold collision pressure and laryngeal muscle activation was performed with synthetic values as reference. These initial results provide valuable insight for further vocal fold model refinement and constitute a proof of concept that the proposed machine learning method is a feasible option for providing physiologically relevant measures for laboratory and ambulatory assessment of vocal function.
    Authors: Ibarra, Emiro J. (Universidad Tecnica Federico Santa Maria, Chile); Parra, Jesús A. (Universidad Tecnica Federico Santa Maria, Chile); Alzamendi, Gabriel Alejandro (Universidad Nacional de Entre Ríos, Instituto de Investigación y Desarrollo en Bioingeniería y Bioinformática; Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - Santa Fe, Instituto de Investigación y Desarrollo en Bioingeniería y Bioinformática, Argentina); Cortés, Juan P. (Universidad Tecnica Federico Santa Maria, Chile); Espinoza, Víctor M. (Universidad de Chile, Chile); Mehta, Daryush D. (Center For Laryngeal Surgery And Voice Rehabilitation, United States); Hillman, Robert E. (Center For Laryngeal Surgery And Voice Rehabilitation, United States); Zañartu, Matías (Universidad Tecnica Federico Santa Maria, Chile)
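The two error measures quoted in the abstract above, and the conversion between pascals and cm H2O, can be reproduced with a short sketch. The conversion factor of 98.0665 Pa per cm H2O is the conventional value; the function names and the sample pressure values are made up for illustration and are not data from the study.

```python
import math

PA_PER_CMH2O = 98.0665  # conventional conversion factor: 1 cm H2O in pascals


def pa_to_cmh2o(pressure_pa):
    """Convert a pressure from pascals to centimetres of water."""
    return pressure_pa / PA_PER_CMH2O


def mean_absolute_error(estimates, references):
    """Average magnitude of the estimation error."""
    return sum(abs(e - r) for e, r in zip(estimates, references)) / len(estimates)


def root_mean_square_error(estimates, references):
    """Square root of the average squared estimation error."""
    squared = sum((e - r) ** 2 for e, r in zip(estimates, references))
    return math.sqrt(squared / len(estimates))


# Hypothetical subglottal pressure values in Pa (illustration only)
estimated = [800.0, 950.0, 1200.0]
reference = [700.0, 1000.0, 1000.0]

mae = mean_absolute_error(estimated, reference)
rmse = root_mean_square_error(estimated, reference)
print(f"MAE  = {mae:.1f} Pa ({pa_to_cmh2o(mae):.2f} cm H2O)")
print(f"RMSE = {rmse:.1f} Pa ({pa_to_cmh2o(rmse):.2f} cm H2O)")
```

Applying `pa_to_cmh2o` to the reported 191 Pa and 243 Pa gives values that round to the 1.95 and 2.48 cm H2O stated in the abstract.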

    Simulation of vocal fold oscillation with a pseudo-one-mass physical model

    This paper presents a novel "pseudo-one-mass model" of the vocal folds, which is derived from a previously proposed two-mass model. Two-mass models account for effects of vertical phase differences in fold motion by means of a pair of coupled oscillators that describe the lower and upper fold portions. Instead, the proposed model employs a single mass-spring oscillator to describe only the oscillation of the lower fold portion, while phase difference effects are simulated through an approximate phenomenological description of the upper glottal area. This approximate description is derived under the hypothesis that 1:1 modal entrainment occurs between the two masses in the large-amplitude oscillation regime, and is then exploited to derive the equations of the pseudo-one-mass model. Numerical simulations of the reference two-mass model are analyzed to show that the proposed approximation remains valid when values of the physical parameters are varied in a large region of the control space. The effects on the shape of the glottal flow pulse are also analyzed. Comparison of simulations with the reference two-mass model and the pseudo-one-mass model shows that the dynamic behavior of the former is accurately approximated by the latter. The similarity of flow signals synthesized with the two models is assessed in terms of four acoustic parameters: fundamental frequency, maximum amplitude, open quotient, and speed quotient. The results confirm that the pseudo-one-mass model fits the behavior of the reference two-mass model with good accuracy, while requiring significantly lower computational resources and roughly half of the mechanical parameters.
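Two of the acoustic parameters named in the abstract above, open quotient and speed quotient, can be estimated from one period of a sampled glottal flow pulse. The sketch below uses the common textbook definitions (open phase duration over period, and opening phase duration over closing phase duration) and treats the samples as points of a piecewise-linear pulse. The function name `pulse_quotients`, the zero threshold, and the requirement that the period starts and ends with closed samples are assumptions for illustration; the paper's exact definitions may differ.

```python
def pulse_quotients(flow, threshold=0.0):
    """Return (open_quotient, speed_quotient) for one period of flow samples.

    Assumes the period begins and ends with closed-phase samples
    (flow <= threshold) and contains a single pulse.
    """
    open_idx = [i for i, u in enumerate(flow) if u > threshold]
    if not open_idx:
        raise ValueError("no open phase found")
    start, end = open_idx[0], open_idx[-1]
    if start == 0 or end == len(flow) - 1:
        raise ValueError("period must start and end in the closed phase")
    # Index of the flow maximum inside the open phase
    peak = max(range(start, end + 1), key=flow.__getitem__)
    opening = peak - (start - 1)  # last closed sample up to the peak
    closing = (end + 1) - peak    # peak down to the first closed sample
    open_quotient = (opening + closing) / len(flow)
    speed_quotient = opening / closing
    return open_quotient, speed_quotient


# Made-up skewed pulse: slow rise, fast fall, period of 10 samples
example_flow = [0, 0, 1, 2, 3, 4, 2, 0, 0, 0]
oq, sq = pulse_quotients(example_flow)
```

For this made-up pulse the flow rises over four sample intervals and falls over two, so the pulse is open for six of the ten intervals in the period.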