19 research outputs found

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies.

    Influence of analyzed sequence length on parameters in laryngeal high-speed videoendoscopy

    Laryngeal high-speed videoendoscopy (HSV) allows objective quantification of vocal fold vibratory characteristics. However, it is unknown how the analyzed sequence length affects some of the computed parameters. To examine whether varying sequence lengths influence parameter calculation, 20 HSV recordings of healthy females during sustained phonation were investigated. The clinically prevalent Photron Fastcam MC2 camera, with a frame rate of 4000 fps and a spatial resolution of 512 x 256 pixels, was used to collect HSV data. The glottal area waveform (GAW), describing the increase and decrease of the area between the vocal folds during phonation, was extracted. Based on the GAW, 16 perturbation parameters were computed for sequences of 5, 10, 20, 50 and 100 consecutive cycles. Statistical analysis was performed using SPSS Statistics, version 21. Only three of the 16 parameters (18.8%) were statistically significantly influenced by changing sequence lengths. Of these parameters, one changed until 10 cycles were reached, one until 20 cycles were reached and one, namely the Amplitude Variability Index (AVI), changed between almost all groups of different sequence lengths. Moreover, visually observable but not statistically significant changes within parameters were observed. These changes were often most prominent between shorter sequence lengths. Hence, we suggest using a minimum sequence length of 20 cycles and discarding the parameter AVI.
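The comparison across sequence lengths described in this abstract can be illustrated with a minimal Python sketch. It is not the study's method: the measure below is a generic jitter-style perturbation (mean absolute cycle-to-cycle difference relative to the mean), chosen only for illustration, and the function names and the per-cycle input values are hypothetical. Only the sequence lengths (5, 10, 20, 50 and 100 cycles) come from the abstract.

```python
from statistics import mean

# Cycle counts named in the abstract.
SEQUENCE_LENGTHS = (5, 10, 20, 50, 100)


def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Generic jitter-style measure used for illustration only; it is an
    assumption, not one of the 16 parameters analysed in the study.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def perturbation_by_length(per_cycle_values, lengths=SEQUENCE_LENGTHS):
    """Evaluate the measure on the first n cycles, for each n the data covers."""
    return {
        n: relative_perturbation(per_cycle_values[:n])
        for n in lengths
        if n <= len(per_cycle_values)
    }


if __name__ == "__main__":
    # Hypothetical per-cycle values (e.g. one GAW peak area per cycle).
    cycles = [10.0, 12.0] * 50
    for n, value in perturbation_by_length(cycles).items():
        print(f"{n:>3} cycles: {value:.2f} %")
```

Running the same measure on growing prefixes of one recording shows how an estimate can drift at short lengths and settle at longer ones, which is the kind of effect the study tests statistically.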

    Models of glottal configuration associated to presbyphonia

    Pan European Voice Conference - PEVOC 11

    The Pan European VOice Conference (PEVOC) was born in 1995 and therefore in 2015 it celebrates the 20th anniversary of its establishment: an important milestone that clearly expresses the strength and interest of the scientific community for the topics of this conference. The most significant themes of PEVOC are singing pedagogy and art, but also occupational voice disorders, neurology, rehabilitation, image and video analysis. PEVOC takes place in different European cities every two years (www.pevoc.org). The PEVOC 11 conference includes a symposium of the Collegium Medicorum Theatri (www.comet collegium.com

    Impact of human vocal fold vibratory asymmetries on acoustic characteristics of sustained vowel phonation

    Thesis (Ph. D.)--Harvard-MIT Division of Health Sciences and Technology, 2010. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student submitted PDF version of thesis. Includes bibliographical references (p. 127-132). Clinical voice specialists make critical diagnostic, medical, therapeutic, and surgical decisions by coupling visual observations of vocal fold tissue motion with auditory-perceptual assessments of voice quality. The details of the relationship between vocal fold tissue motion and the voice produced are not fully understood, and there is recent evidence that the diagnostic significance of asymmetries during vocal fold vibration may be over-interpreted during clinical voice assessment. An automated system based on high-speed videoendoscopy recordings was developed to objectively quantify vocal fold vibratory asymmetry with initial validation from manual markings and visual-perceptual judgments. Efficient estimation of these measures was possible due to recent technological advances in high-speed imaging of the larynx that enabled the capture and processing of high-resolution video (up to 10,000 images per second) of rapid vocal fold vibrations (100-1000 times per second). Synchronized recordings of the acoustic voice signal were made to explore physiological-acoustic relationships that were not possible using clinical stroboscopic imaging systems. In an initial study of asymmetric vibration in 14 patients treated for laryngeal cancer, perturbations in the voice signal were most associated with asymmetry that changed across vibratory cycles, while the overall level of asymmetry did not contribute to degradations in voice quality measures. Thus, since stroboscopic imaging is only able to capture vibratory asymmetry that occurs periodically, voice clinicians are not able to observe the time-varying nature of asymmetry that presumably affects acoustic perturbations to a higher degree. The impact of asymmetric vibration on spectral characteristics was explored in a computational voice production model and an expanded group of 47 human subjects. Surprisingly, in both model and subject data, measures of vocal fold vibratory asymmetry did not correlate with spectral tilt measures. In the subject data, left-right phase asymmetry and closing quotient exhibited a mild inverse correlation. This result conflicted with model simulations in which the glottal area waveform exhibited higher closing quotients (less abrupt glottal closure) with increasing levels of phase asymmetry. Results call for further studies into the applicability of traditional spectral tilt measures and the role of asymmetric vocal fold vibration in efficient voice production. By Daryush Dinyar Mehta. Ph.D.

    Automated measures of dysphonias and the phonatory effects of asymmetries in the posterior larynx

    Acoustic and videoendoscopic techniques to improve voice assessment via relative fundamental frequency

    Quantitative measures of laryngeal muscle tension are needed to improve assessment and track clinical progress. Although relative fundamental frequency (RFF) shows promise as an acoustic estimate of laryngeal muscle tension, it is not yet transferable to the clinic. The purpose of this work was to refine algorithmic estimation of RFF, as well as to enhance the knowledge surrounding the physiological underpinnings of RFF. The first study used a large database of voice samples collected from 227 speakers with voice disorders and 256 typical speakers to evaluate the effects of fundamental frequency estimation techniques and voice sample characteristics on algorithmic RFF estimation. By refining fundamental frequency estimation using the Auditory Sawtooth Waveform Inspired Pitch Estimator—Prime (Auditory-SWIPE′) algorithm and accounting for sample characteristics via the acoustic measure, pitch strength, algorithmic errors related to the accuracy and precision of RFF were reduced by 88.4% and 17.3%, respectively. The second study sought to characterize the physiological factors influencing acoustic outputs of RFF estimation. A group of 53 speakers with voice disorders and 69 typical speakers each produced the utterance, /ifi/, while simultaneous recordings were collected using a microphone and flexible nasendoscope. Acoustic features calculated via the microphone signal were examined in reference to the physiological initiation and termination of vocal fold vibration. The features that corresponded with these transitions were then implemented into the RFF algorithm, leading to significant improvements in the precision of the RFF algorithm to reflect the underlying physiological mechanisms for voicing offsets (p < .001, V = .60) and onsets (p < .001, V = .54) when compared to manual RFF estimation. 
The third study further elucidated the physiological underpinnings of RFF by examining the contribution of vocal fold abduction to RFF during intervocalic voicing offsets. Vocal fold abductory patterns were compared to RFF values in a subset of speakers from the second study, comprising young adults, older adults, and older adults with Parkinson’s disease. Abductory patterns were not significantly different among the three groups; however, vocal fold abduction was observed to play a significant role in measures of RFF at voicing offset. By improving algorithmic estimation and elucidating aspects of the underlying physiology affecting RFF, this work adds to the utility of RFF for use in conjunction with current clinical techniques to assess laryngeal muscle tension. 2021-09-29T00:00:00
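The abstract above does not state how RFF is computed. As a rough illustration only, the Python sketch below assumes the common convention of expressing each cycle's fundamental frequency in semitones relative to a steady-state reference cycle. The function names, the choice of reference and the example values are all hypothetical, and the sketch covers none of the pitch estimation (Auditory-SWIPE′) or onset and offset detection that the work refines.

```python
import math


def semitones_re(f0_hz, reference_hz):
    """Express f0_hz in semitones relative to reference_hz (12 per octave)."""
    if f0_hz <= 0 or reference_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_hz / reference_hz)


def relative_f0(cycle_f0_hz, reference_hz):
    """Normalise a run of per-cycle F0 values to a reference F0, in semitones.

    Assumption: the caller supplies the reference (for example a cycle from
    the steady-state part of the vowel); this sketch does not choose it.
    """
    return [semitones_re(f0, reference_hz) for f0 in cycle_f0_hz]


if __name__ == "__main__":
    # Hypothetical per-cycle F0 values (Hz) approaching a voicing offset.
    offset_cycles = [200.0, 199.0, 197.0, 194.0]
    for value in relative_f0(offset_cycles, reference_hz=200.0):
        print(f"{value:+.2f} ST")
```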

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) workshop was established in 1999 out of a strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and the elderly. Over the years the initial topics have grown and spread into other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

    Glottal-synchronous speech processing

    Glottal-synchronous speech processing is a field of speech science where the pseudoperiodicity of voiced speech is exploited. Traditionally, speech processing involves segmenting and processing short speech frames of predefined length; this may fail to exploit the inherent periodic structure of voiced speech which glottal-synchronous speech frames have the potential to harness. Glottal-synchronous frames are often derived from the glottal closure instants (GCIs) and glottal opening instants (GOIs). The SIGMA algorithm was developed for the detection of GCIs and GOIs from the Electroglottograph signal with a measured accuracy of up to 99.59%. For GCI and GOI detection from speech signals, the YAGA algorithm provides a measured accuracy of up to 99.84%. Multichannel speech-based approaches are shown to be more robust to reverberation than single-channel algorithms. The GCIs are applied to real-world applications including speech dereverberation, where SNR is improved by up to 5 dB, and to prosodic manipulation where the importance of voicing detection in glottal-synchronous algorithms is demonstrated by subjective testing. The GCIs are further exploited in a new area of data-driven speech modelling, providing new insights into speech production and a set of tools to aid deployment into real-world applications. The technique is shown to be applicable in areas of speech coding, identification and artificial bandwidth extension of telephone speech.
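The contrast between fixed-length and glottal-synchronous framing can be sketched in a few lines of Python. The sketch assumes the GCI positions are already available as sorted sample indices; it does not implement SIGMA or YAGA. The function names and the default of two pitch periods per frame are assumptions for illustration, not details taken from the abstract.

```python
def fixed_length_frames(signal, frame_len, hop):
    """Conventional framing: equal-length frames at a constant hop."""
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def glottal_synchronous_frames(signal, gcis, periods_per_frame=2):
    """Cut frames whose boundaries fall on glottal closure instants.

    gcis: sorted sample indices of GCIs (assumed to come from an external
          detector). Each frame spans `periods_per_frame` consecutive
          glottal cycles, so frame length follows the local pitch period.
    """
    if periods_per_frame < 1:
        raise ValueError("periods_per_frame must be at least 1")
    frames = []
    for i in range(len(gcis) - periods_per_frame):
        start, end = gcis[i], gcis[i + periods_per_frame]
        frames.append(signal[start:end])
    return frames


if __name__ == "__main__":
    samples = list(range(100))      # stand-in for a speech signal
    closures = [10, 30, 55, 75]     # hypothetical GCI sample indices
    for frame in glottal_synchronous_frames(samples, closures):
        print(frame[0], frame[-1], len(frame))
```

With pitch-dependent boundaries, each frame holds a whole number of glottal cycles, whereas a fixed-length frame generally cuts cycles at arbitrary points.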

    Efficacy of Lugol’s iodine in the evaluation of vocal cord neoplasm.

    Cancer of the larynx is the second most common malignancy of the upper aerodigestive tract (UADT). Although a large variety of malignancies are reported in the larynx, 90% of them are squamous cell carcinoma (SCC), which arises from the epithelial lining of the larynx. The most common site of laryngeal carcinoma is the glottis. About 90% of malignant tumors of the larynx are carcinomas that often develop from premalignant lesions. Early detection and prompt treatment should therefore prevent the development of invasive cancer requiring more debilitating surgical resection. It is very difficult to predict accurately which lesions will progress to invasive malignancy based only on clinical appearance. Studies have shown that the clinical appearance bears little correlation with the underlying pathology. Decision making is difficult because simple hyperplasia, dysplasia, and/or carcinoma can all coexist in the same lesion. Even stroboscopy has not proved to be a reliable method of determining the presence of malignancy or depth of invasion. At present there is no standard test to distinguish benign lesions from premalignant or malignant ones. Objective: To evaluate a commonly available, easily applicable and cost-effective method to diagnose the presence of premalignant and malignant vocal cord neoplasm. Aim: To observe the staining property of Lugol’s iodine in various vocal cord lesions, and to assess the reliability of Lugol’s iodine in the evaluation of premalignant and malignant vocal cord lesions. The hypothesis is that Lugol’s iodine “stains” normal epithelium and benign lesions, whereas premalignant and malignant lesions remain unstained.