186 research outputs found

    Using dysphonic voice to characterize speaker's biometry

    Phonation distortion leaves relevant marks in a speaker's biometric profile. Dysphonic voice production may be used for biometric speaker characterization. In the present paper, phonation features derived from the glottal source (GS) parameterization, after vocal tract inversion, are proposed for dysphonic voice characterization in Speaker Verification tasks. The glottal source derived parameters are matched in a forensic evaluation framework defining a distance-based metric specification. The phonation segments used in the study are derived from fillers, long vowels, and other phonation segments produced in spontaneous telephone conversations. Phonated segments from a telephonic database of 100 male Spanish native speakers are combined in a 10-fold cross-validation task to produce the set of quality measurements outlined in the paper. Shimmer, mucosal wave correlate, vocal fold cover biomechanical parameter unbalance and a subset of the GS cepstral profile produce accuracy rates as high as 99.57 for a wide threshold interval (62.08-75.04%). An Equal Error Rate of 0.64% can be granted. The proposed metric framework is shown to behave more fairly than classical likelihood ratios in supporting the hypothesis of the defense vs that of the prosecution, thus offering a more reliable evaluation scoring. Possible applications are Speaker Verification and Dysphonic Voice Grading.
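
    As a rough illustration of the Equal Error Rate quoted in this abstract, the sketch below finds the threshold at which false acceptance and false rejection rates meet for distance-based scores. It is not the paper's evaluation framework: the function name `equal_error_rate` and the toy distance values are assumptions made only for this example.

```python
def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) for distance scores.

    A trial is accepted when its distance is <= threshold, so genuine
    trials should have small distances and impostor trials large ones.
    """
    candidates = sorted(set(genuine) | set(impostor))
    best = None
    for t in candidates:
        # False rejection: genuine trials whose distance exceeds the threshold.
        frr = sum(d > t for d in genuine) / len(genuine)
        # False acceptance: impostor trials at or below the threshold.
        far = sum(d <= t for d in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical toy distances, not data from the paper.
genuine = [0.1, 0.2, 0.3, 0.4]
impostor = [0.35, 0.6, 0.7, 0.9]
eer, thr = equal_error_rate(genuine, impostor)
print(eer, thr)
```

    With real scores the two error curves rarely cross exactly at a sampled threshold, so the sketch reports the midpoint of the two rates at the closest candidate.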

    Dynamic characterization of vocal fold vibrations

    An emerging trend among voice specialists is the use of quantitative protocols for the diagnosis and treatment of voice disorders. Vocal fold vibrations are directly related to voice quality. This research is devoted to providing an objective means of characterizing these vibrations. Our goal is to develop a dynamic model of vocal fold vibration and to map the parameter space of the model to a class of voice disorders, thus furthering the assessment and diagnosis of voice disorders in clinical settings. To this end, this dissertation introduces a new seven-mass biomechanical model for the vibration of vocal folds. The model is based on the body-cover layer concept of vocal fold biomechanics, and segments the cover layer into three masses along the longitudinal direction of the vocal fold. This segmentation facilitates comparison of the model with the motion of the glottal contour derived from modern high-speed digital imaging systems. The model simulation is compared to 14 sets of experimental data from human subjects with healthy vocal folds and pathological vocal folds including nodule, polyp, and unilateral paralysis. We also propose a semi-empirical two-stage procedure for tuning the parameters so that the model response matches the experimental data as closely as possible in the time and frequency domains. The first stage involves manual coarse tuning of parameters based on limited data to expedite the process. The second stage is an automatic (or manual) fine-tuning process, based on a larger amount of data, applied to a subset of the parameters tuned in the first stage. Once an 'optimal' set of model parameters has been identified, two model-based factors, quantifying the asymmetry between left and right vocal folds and between anterior and posterior segments of the vocal folds, are introduced and calculated for each of the 14 cases. The two factors form an asymmetry plane. Based on the values of the asymmetry factors for the 14 cases, the plane is subdivided into four regions corresponding to healthy vocal folds, nodule, polyp, and unilateral paralysis. This yields a clear visual aid for clinicians, correlating the model parameters to voice quality.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 out of a strongly felt need to share know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the newborn to the adult and elderly. Over the years the initial topics have grown and spread to other fields of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years in Firenze, Italy. This edition celebrates twenty-two years of uninterrupted and successful research in the field of voice analysis.

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years and aims to stimulate contacts between specialists active in research and industrial developments in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies.

    Validation of a flow-structure-interaction computation model of phonation

    Computational models of vocal fold (VF) vibration are becoming increasingly sophisticated, their utility currently transitioning from exploratory research to predictive research. However, validation of such models has remained largely qualitative, raising questions over their applicability to interpret clinical situations. In this paper, a computational model with a segregated implementation is detailed. The model is used to predict the fluid–structure interaction (FSI) observed in a physical replica of the VFs when it is excited by airflow. Detailed quantitative comparisons are provided between the computational model and the corresponding experiment. First, the flow model is separately validated in the absence of VF motion. Then, in the presence of flow-induced VF motion, comparisons are made of the flow pressure on the VF walls and of the resulting VF displacements. Self-similarity of spatial distributions of flow pressure and VF displacements is highlighted. The self-similarity leads to normalized pressure and displacement profiles. It is shown that by using linear superposition of average and fluctuation components of normalized computed displacements, it is possible to determine displacements in the physical VF replica over a range of VF vibration conditions. Mechanical stresses in the VF interior are related to the VF displacements, so the computational model can also determine VF stresses over a range of phonation conditions.

    Modeling and imaging of the vocal fold vibration for voice health

    A review of state-of-the-art speech modelling methods for the parameterisation of expressive synthetic speech

    This document will review a sample of available voice modelling and transformation techniques, in view of an application in expressive unit-selection based speech synthesis in the framework of the PAVOQUE project. The underlying idea is to introduce some parametric modification capabilities at the level of the synthesis system, in order to compensate for the sparsity and rigidity, in terms of available emotional speaking styles, of the databases used to define speech synthesis voices. For this work, emotion-related parametric modifications will be restricted to the domains of voice quality and prosody, as suggested by several reviews addressing the vocal correlates of emotions (Schröder, 2001; Schröder, 2004; Roehling et al., 2006). The present report will start with a review of some techniques related to voice quality modelling and modification. First, it will explore the techniques related to glottal flow modelling. Then, it will review the domain of cross-speaker voice transformations, in view of a transposition to the domain of cross-emotion voice transformations. This topic will be presented from the perspective of the parametric spectral modelling of speech and then from the perspective of available spectral transformation techniques. Finally, the domain of prosodic parameterisation and modification will be reviewed.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, held on a biennial basis, collect the scientific papers presented as both oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images as a support to clinical diagnosis and classification of vocal pathologies.

    Physiological and acoustic characteristics of the female musical theater voice in ‘belt’ and ‘legit’ qualities

    A study was conducted on six female Music Theatre singers. Audio and Electroglottographic (EGG) signals were recorded simultaneously with the vocal tract impedance while the singers produced sustained pitches in two different qualities ('chesty belt', 'legit'). For each quality, two vowels (/ɛ/, /o/) were investigated, at four increasing pitches over the F#4-D5 range (~370-600 Hz). Measured values of glottal parameters (Open Quotient, Amplitude of the EGG signal) support the idea that 'chesty belt' is produced in the first laryngeal mechanism (M1) and 'legit' in the second one (M2). The frequency of the first vocal tract resonance (R1) was found to be systematically higher in 'chesty belt', close to the second voice harmonic (2f0). These observations were consistent with greater intensities and energy above 1 kHz in 'chesty belt' compared to 'legit'.
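
    To make the quoted pitch range concrete, the sketch below converts the two end notes to frequencies and doubles them to get the second harmonic that R1 is reported to sit close to in 'chesty belt'. It assumes twelve-tone equal temperament with A4 = 440 Hz, which the abstract does not state; the names `A4_HZ`, `SEMITONES_FROM_A4` and `note_hz` are made up for this example.

```python
A4_HZ = 440.0  # assumed concert-pitch reference, not given in the abstract

# Signed semitone distance of each note from A4 (equal temperament).
SEMITONES_FROM_A4 = {"F#4": -3, "D5": 5}


def note_hz(name):
    """Fundamental frequency f0 of a named note under the assumptions above."""
    return A4_HZ * 2 ** (SEMITONES_FROM_A4[name] / 12)


for name in ("F#4", "D5"):
    f0 = note_hz(name)
    # The second voice harmonic lies at twice the fundamental.
    print(f"{name}: f0 = {f0:.1f} Hz, 2*f0 = {2 * f0:.1f} Hz")
```

    Under these assumptions F#4 comes out near 370 Hz and D5 near 587 Hz, in line with the approximate range given in the abstract, so the second harmonic spans roughly 740 to 1175 Hz.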