256 research outputs found

    Machine-learning applied to classify flow-induced sound parameters from simulated human voice

    Disorders of voice production have severe effects on the quality of life of the affected individuals. A simulation approach is used to investigate the cause-effect chain in voice production, covering typical characteristics of voice such as subglottal pressure and of functional voice disorders such as glottal closure insufficiency and left-right asymmetry. To this end, 24 different voice configurations are simulated in a parameter study using a previously published hybrid aeroacoustic simulation model. Based on these 24 simulation configurations, selected acoustic parameters (HNR, CPP, ...) at simulation evaluation points are correlated with the configuration details to gain insight into the flow-induced sound generation of human phonation. Recently, several institutions have studied experimental data on flow and acoustic properties and correlated them with healthy and disordered voice signals. Building on this, the study is a next step towards a detailed dataset definition: the dataset is small, but the definition of relevant characteristics is precise, based on the existing simulation methodology of simVoice. The small datasets are studied by correlation analysis, and a Support Vector Machine classifier with an RBF kernel is used to classify the representations. Linear Discriminant Analysis (LDA) is used to visualize the dimensions of the individual studies. This makes it possible to draw correlations and determine the most important features evaluated from the acoustic signals in front of the mouth. The GC type can be best discriminated based on CPP and boxplot visualizations. Furthermore, using the LDA-dimensionality-reduced feature space, subglottal pressure can be best classified, with 91.7% accuracy, independent of healthy or disordered voice simulation parameters. Comment: 17 pages, 11 figures, v0.1, work in progress, working paper

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as both oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images as a support to clinical diagnosis and classification of vocal pathologies.

    Formant analysis in dysphonic patients and automatic Arabic digit speech recognition

    Background and objective: There has been a growing interest in objective assessment of speech in dysphonic patients for the classification of the type and severity of voice pathologies using automatic speech recognition (ASR). The aim of this work was to study the accuracy of the conventional ASR system (with a Mel frequency cepstral coefficient (MFCC) based front end and a hidden Markov model (HMM) based back end) in recognizing the speech characteristics of people with pathological voice. Materials and methods: The speech samples of 62 dysphonic patients with six different types of voice disorders and 50 normal subjects were analyzed. The Arabic spoken digits were taken as input. The distribution of the first four formants of the vowel /a/ was extracted to examine the deviation of the formants from normal. Results: A recognition accuracy of 100% was obtained for Arabic digits spoken by normal speakers. However, there was a significant loss of accuracy when the digits were spoken by voice-disordered subjects. Moreover, no significant improvement in ASR performance was achieved after assessing a subset of the individuals with disordered voices who underwent treatment. Conclusion: The results of this study revealed that the current ASR technique is not a reliable tool for recognizing the speech of dysphonic patients.

    Aerodynamic and Acoustic Features of Vocal Effort

    Many voice disorders are associated with an effortful voice; however, very few studies have examined the physiological changes that contribute to this sense of effort. Determining the factors that contribute to change in vocal effort may help clinicians to effectively target these variables when working with people with voice disorders, so that voice improvement is accompanied by decreased vocal effort after treatment. Prior research has shown that alterations in aerodynamic and acoustic variables are often associated with voice disorders involving increased muscular effort, and that change in these variables is correlated with abnormal voice qualities. The current study focused on three main questions: 1) When producing speech with increased or decreased vocal effort as compared to comfortable vocal effort, how do healthy adults alter their phonatory physiology? 2) What are the acoustic manifestations of these changes in phonatory function that occur with high vocal effort? 3) Which aerodynamic or acoustic variables are the primary factors associated with an increase in vocal effort? The participants included 18 healthy men and women with normal voice and normal hearing, ranging in age from 18 to 26. After training, participants produced repeated syllable combinations at various levels of vocal effort (comfortable, maximal, and minimal). Aerodynamic and acoustic recordings were then analyzed. Three of the four aerodynamic measures in this study showed significant differences between the three vocal effort conditions, and reflected change in airflow, pressure, and rate of airflow change during voice production. Both acoustic measures, which related to the relative degree of harmonic energy in the speech signal, also showed significant differences between the three vocal effort conditions.

    Exploiting Wavelet and Prosody-related Features for the Detection of Voice Disorders

    An approach for the detection of voice disorders exploiting wavelet and prosody-related properties of speech is presented in this paper. Based on the normalized energy contents of the Discrete Wavelet Transform (DWT) coefficients over all voice frames, several statistical measures are first determined. Then, prosody-related voice properties, such as mean pitch, jitter and shimmer, are used to compute similar statistical measures over all the frames. A set of statistical measures of the normalized energy contents of the DWT coefficients is combined with a set of statistical measures of the extracted prosody-related voice properties to form a feature vector used in both the training and testing phases. Two categories of voice samples, namely healthy and disordered, are considered, so the proposed method formulates detection as a two-class problem. Finally, a Euclidean distance based classifier operates on the feature vector to detect the disordered voice. A number of simulations are carried out, and it is shown that the statistical analysis based on wavelet and prosody-related properties can effectively detect a variety of voice disorders from a mixture of healthy and disordered voices.
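
    The final classification step in the abstract above can be illustrated with a minimal sketch. It assumes a nearest-centroid rule, which is only one possible reading of "Euclidean distance based classifier"; the abstract does not specify the exact rule. The function names and the toy two-dimensional feature vectors are hypothetical and stand in for the combined wavelet and prosody statistics.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit(samples):
    """Map each class label to the centroid of its training vectors."""
    return {label: centroid(vectors) for label, vectors in samples.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: euclidean(centroids[label], x))


# Made-up feature vectors for illustration only (not data from the paper).
training = {
    "healthy": [[0.10, 0.20], [0.20, 0.10], [0.15, 0.15]],
    "disordered": [[0.80, 0.90], [0.90, 0.80], [0.85, 0.85]],
}
centroids = fit(training)
```

    With these toy values, `predict(centroids, [0.2, 0.2])` selects the healthy class because that centroid is nearer. Real feature vectors would normally be scaled first so that no single statistic dominates the distance.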

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The Models and Analysis of Vocal Emissions with Biomedical Applications (MAVEBA) workshop came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread to other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

    Differential specificity of acoustic measures to listener perception of voice quality

    The purpose of this project was to examine the specificity of two acoustic measures, relative fundamental frequency (RFF) and the cepstral/spectral index of dysphonia (CSID), to listener perceptions of voice quality across four dimensions: breathiness, roughness, strain/vocal effort, and overall severity. An auditory perceptual experiment was conducted to estimate listener perception of these dimensions. Pearson's correlation coefficient between RFF, CSID, and the perceptual ratings of voice quality was calculated in order to comment on the relationship between calculations of RFF and CSID and the current "gold standard" of listener perception. The hypothesis for this project was that measures of RFF would have a strong negative correlation with listener perception of strain/vocal effort, and that measures of CSID would have a strong positive correlation with listener perception of overall severity and breathiness. An unexpected result with a significant impact was that listeners' ratings of the four voice qualities were highly correlated with one another. Unfortunately, the poorly differentiated perceptual ratings undermine both the validity of this project and the reliability of its results. Overall, the correlations between measures of RFF, CSID, and distinct qualities of listener perception are therefore rendered uninterpretable. Methodological considerations and future directions are reported.
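
    The correlation analysis named in the abstract above can be sketched as follows. This is an illustrative computation of Pearson's r under the textbook definition, not the authors' analysis code. The function name `pearson_r` is hypothetical, and its inputs would be paired sequences such as an acoustic measure and a perceptual rating per speaker.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two paired sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations and the two sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

    A value near -1 would correspond to the hypothesized strong negative relationship between RFF and perceived strain, and a value near +1 to the hypothesized positive relationship between CSID and overall severity. When the perceptual dimensions are themselves highly intercorrelated, as the abstract reports, a single r cannot be attributed to one dimension.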

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread to other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis.