731 research outputs found

    Vocal Fold Analysis From High Speed Videoendoscopic Data

    High-speed videoendoscopy (HSV) of the larynx far surpasses videostroboscopy in evaluating vocal fold vibratory behavior because it provides a much higher frame rate. HSV enables visualization of the vocal fold vibratory pattern within an actual glottic cycle. This detailed information on vocal fold vibratory characteristics could be valuable for assessing vocal fold vibratory function in disordered voices and the effects of behavioral, medical, and surgical treatment procedures. In this work, we address the problem of classifying voice disorders of varying etiology in four steps. Our methodology starts with glottis segmentation: given HSV data, the contour of the glottal opening area is acquired in each frame. These contours record the vibration track of the vocal folds. Next, we obtain a reliable glottal axis, which is necessary for computing certain vibratory features. The third step is feature extraction from the HSV data. In the last step, we perform classification based on the features obtained in step 3. In this study, we first propose a novel glottis segmentation method based on simplified dynamic programming, which proves to be efficient and accurate. In addition, we introduce a new approach for calculating the glottal axis. By comparing the proposed glottal axis determination method (modified linear regression) against state-of-the-art techniques, we demonstrate that our technique is more reliable. The focus then shifts to feature extraction and classification schemes. Eighteen different features are extracted, and their discriminative power is evaluated using principal component analysis. A support vector machine and a neural network are implemented to classify three types of vocal folds (normal vocal fold, unilateral vocal fold polyp, and unilateral vocal fold paralysis). The results demonstrate that the classification rates for four different tasks are all above 80%.
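
    As a rough illustration of the glottal axis step described in the abstract above, the sketch below fits a straight line through glottal contour pixels by ordinary least squares. This is an assumption-laden stand-in: the abstract names a "modified linear regression" but does not describe the modification, so plain least squares is used here instead. The function name `fit_glottal_axis` and the (x, y) pixel input format are hypothetical. Because the glottal axis usually runs close to vertical in the image, the sketch regresses x on y rather than y on x.

```python
def fit_glottal_axis(points):
    """Ordinary least-squares fit of x = slope * y + intercept.

    `points` is a sequence of (x, y) pixel coordinates taken from a glottal
    contour. Hypothetical helper: plain least squares, not the modified
    linear regression proposed in the thesis.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two contour points")

    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n

    # Spread of the points along the image rows (the regressor here is y).
    s_yy = sum((p[1] - mean_y) ** 2 for p in points)
    if s_yy == 0:
        raise ValueError("all points share one row; axis is undefined")

    s_xy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)

    slope = s_xy / s_yy
    intercept = mean_x - slope * mean_y
    return slope, intercept
```

    Calling `fit_glottal_axis(contour_pixels)` on the contour of one frame returns the slope and intercept of a candidate axis; a practical pipeline would probably pool contours across frames and guard against outliers.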

    Impact of human vocal fold vibratory asymmetries on acoustic characteristics of sustained vowel phonation

    Thesis (Ph.D.), Harvard-MIT Division of Health Sciences and Technology, 2010. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (p. 127-132).
    Clinical voice specialists make critical diagnostic, medical, therapeutic, and surgical decisions by coupling visual observations of vocal fold tissue motion with auditory-perceptual assessments of voice quality. The details of the relationship between vocal fold tissue motion and the voice produced are not fully understood, and there is recent evidence that the diagnostic significance of asymmetries during vocal fold vibration may be over-interpreted during clinical voice assessment. An automated system based on high-speed videoendoscopy recordings was developed to objectively quantify vocal fold vibratory asymmetry with initial validation from manual markings and visual-perceptual judgments. Efficient estimation of these measures was possible due to recent technological advances in high-speed imaging of the larynx that enabled the capture and processing of high-resolution video (up to 10,000 images per second) of rapid vocal fold vibrations (100-1000 times per second). Synchronized recordings of the acoustic voice signal were made to explore physiological-acoustic relationships that were not possible using clinical stroboscopic imaging systems. In an initial study of asymmetric vibration in 14 patients treated for laryngeal cancer, perturbations in the voice signal were most associated with asymmetry that changed across vibratory cycles, while the overall level of asymmetry did not contribute to degradations in voice quality measures. Thus, since stroboscopic imaging is only able to capture vibratory asymmetry that occurs periodically, voice clinicians are not able to observe the time-varying nature of asymmetry that presumably affects acoustic perturbations to a higher degree. The impact of asymmetric vibration on spectral characteristics was explored in a computational voice production model and an expanded group of 47 human subjects. Surprisingly, in both model and subject data, measures of vocal fold vibratory asymmetry did not correlate with spectral tilt measures. In the subject data, left-right phase asymmetry and closing quotient exhibited a mild inverse correlation. This result conflicted with model simulations in which the glottal area waveform exhibited higher closing quotients (less abrupt glottal closure) with increasing levels of phase asymmetry. Results call for further studies into the applicability of traditional spectral tilt measures and the role of asymmetric vocal fold vibration in efficient voice production.
    By Daryush Dinyar Mehta. Ph.D.

    Dynamic characterization of vocal fold vibrations

    An emerging trend among voice specialists is the use of quantitative protocols for the diagnosis and treatment of voice disorders. Vocal fold vibrations are directly related to voice quality. This research is devoted to providing an objective means of characterizing these vibrations. Our goal is to develop a dynamic model of vocal fold vibration and map the parameter space of the model to a class of voice disorders, thus furthering the assessment and diagnosis of voice disorders in clinical settings. To this end, this dissertation introduces a new seven-mass biomechanical model for the vibration of vocal folds. The model is based on the body-cover layer concept of vocal fold biomechanics, and segments the cover layer into three masses along the longitudinal direction of the vocal fold. This segmentation facilitates comparison of the model with the motion of the glottal contour derived from modern high-speed digital imaging systems. The model simulation is compared to 14 sets of experimental data from human subjects with healthy vocal folds and pathological vocal folds including nodule, polyp, and unilateral paralysis. We also propose a semi-empirical two-stage procedure for tuning the parameters so that the model response matches the experimental data as closely as possible in the time and frequency domains. The first stage involves manual coarse tuning of parameters based on limited data to expedite the process. The second stage is an automatic (or manual) fine-tuning process on a subset of the parameters tuned in the first stage, based on a larger amount of data. Once an ‘optimal’ set of model parameters has been identified, two model-based factors, quantifying the asymmetry between left and right vocal folds and between anterior and posterior segments of the vocal folds, are introduced and calculated for each of the 14 cases. The two factors form an asymmetry plane. Based on the values of the asymmetry factors for the 14 cases, the plane is subdivided into four regions corresponding to healthy vocal folds, nodule, polyp, and unilateral paralysis. This yields a clear visual aid for clinicians, correlating the model parameters to voice quality.

    Methods and studies of laryngeal voice quality analysis in speech production

    Voice quality, defined by John Laver as the characteristic auditory colouring of a speaker's voice, is a significant feature of speech, and it is used to signal various properties such as emotions, intentions, and mood of the speaker. While voice quality measurement techniques and algorithms have been developed, much work is needed to obtain a comprehensive view of the function and analysis of the human voice in the production of different voice qualities. Two major research questions are presented in this thesis: first, how can the most important laryngeal voice quality features be analyzed, and second, how do the voice quality features affect different facets of vocal expression? To answer these questions, five separate studies of the analysis methodology and two studies regarding voice quality behaviour were published. The methodology articles describe a voice source analysis software package; a comparison of multiple voice source parameters in breathy, normal, and pressed phonation; a method for evaluating inverse filtering algorithms; a comparison of two inverse filtering algorithms; and a method for analyzing intensity regulation of speech. One analysis article studies changes in laryngeal voice quality when different emotions are expressed in speech, and another studies voice quality changes in the expression of prominence in continuous speech. The methodology studies resulted in new tools, methods, and guidelines for voice source analysis, while the analysis studies provide information on how voice quality is used in expressive speech.

    Glottis Detection and Evaluation in High-Speed Video Recording

    This work summarizes the results of a study on evaluating the vocal cords based on data extracted from recordings taken by a laryngoscopic system, specifically by Laryngeal High-Speed Videoendoscopy (LHSV). The main goal of this work is to process the images contained in the recorded LHSV sequences, find and detect the glottal gap (glottis) using chosen image segmentation methods, and evaluate the vocal cords' quality by analytical and statistical methods using a defined set of parameters. The first part of this thesis focuses on describing the nature and structure of the information obtained using the LHSV system. Therefore, the anatomy of the vocal cords and the physiology of voice creation are described in relation to the information contained in an image from the LHSV recording. The basic types of vocal cord diseases are also listed, and the data gathering, the data structure, and the problems affecting the quality of the LHSV recording are described. Furthermore, the issues of image segmentation applied to laryngoscopic image data from LHSV examinations are described, together with the methods developed for glottis localization (finding the Region of Interest, ROI), the segmentation itself, and the selection of parameters based mainly on vocal cord geometry and symmetry. The process is demonstrated in several case studies. An important part of the work describes new methods dealing with the computed parameters and their relationships using correlation analysis. An approach based on expected and unexpected correlation relations resulting from the detailed analysis can provide a basic evaluation of the vocal cords' behavior. Other methods then provide a numeric evaluation of the development of the glottis shape based on statistical analysis and expert ratings. The results are illustrated and explained. Thesis defended.

    GROWING OLD AS A ROCK STAR: A FOUR-PART STUDY OF THE AGING VOICE

    This dissertation focuses on the aging voice, specifically that of the aging elite vocal athlete. It comprises four components: a series of research studies and a viewpoint piece designed to explore the awareness, struggles, and vocal compensations of aging singers dealing with age-related vocal and performance problems. The overarching goal of these studies is to inform the development of a voice care protocol for the aging rock star to guide customized intervention for these elite vocal athletes that is focused on optimizing both vocal output and performance. First, the dissertation introduces and identifies characteristics of the exceptional voice. This involves a new vocal continuum that includes the normal voice, the trained voice, and the exceptional voice. The second component is a qualitative study of older contemporary commercial music (CCM) singers' adjustments and accommodations associated with their aging. From this, four overarching themes are identified: modest self-perception of their vocal prowess and its relationship to performance, acute sensitivity to changes in vocal quality, recognition of the critical association of voice quality with their identity as a performer, and an array of accommodations to aging-related vocal changes. The third component of the dissertation is a randomized controlled trial examining the efficacy of Vocal Function Exercises as a treatment modality for presbyphonia. Analysis revealed that the experimental group improved in select outcome measures, including decreased glottic gap, increased upper range, and maximum phonation time, at the 6-week post-treatment re-evaluation, with no such changes in the control group. The final study investigated the vocal and performing trajectories of six male CCM singers through analysis of video performances across their careers. Singers were shown to make accommodations consistent with the reported findings from component two. Such accommodations include decreased total time singing for some singers, accommodations for range changes, and changes to performance. From this study, the Exceptional Voice Protocol was created to provide a customized vocal and performance blueprint for each artist that meets the unique needs of their exceptional voices. Overall, this research indicates that aging CCM singers appear to be experiencing age- and performance-related vocal changes and are making detectable accommodations to their performance. Additional findings show that Vocal Function Exercises appear to be an efficacious treatment modality for the aging voice. Findings from these studies confirm the need for continued research on age-related vocal and performance changes for these performers, and for guidelines for appropriate habilitation and rehabilitation so these rock stars can continue performing for as long as they desire.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The Models and Analysis of Vocal Emissions with Biomedical Applications (MAVEBA) workshop came into being in 1999 from the strongly felt need to share know-how, objectives, and results between areas that until then seemed quite distinct, such as bioengineering, medicine, and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial topics have grown and spread to other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

    A study of voice quality in a group of irradiated laryngeal cancer patients tumour stages T1 and T2.

    This is a longitudinal study of voice quality in a group of 35 patients irradiated for early vocal fold tumours, stages T1 and T2. Electrolaryngograph (ELG) based analyses were used to obtain objective measurements of speaking fundamental frequency parameters over a wide range of time intervals following radiotherapy. Lx waveforms were also analysed. Perceptual evaluation of voice quality and patients' self-assessments of their experience of vocal symptoms and limitations in vocal function after radiotherapy were carried out. The relationship between perceptual and self-assessment parameters and objective voice quality measurements was determined. A few patients underwent periods of voice therapy. Their voice measurements before and after therapy intervention are compared with those of a group of patients who did not receive voice therapy. The findings in this study show that, contrary to some early reports that the voice returns to normal in the majority of patients after radiotherapy, most patients show evidence of residual abnormal voice quality and symptoms as measured and as rated by clinicians and by patients themselves. The majority of patients do not consider these a major problem, however. Evidence is presented of the beneficial effect of voice therapy in helping patients compensate for the inevitable tissue damage caused by radiotherapy to the larynx. Electrolaryngograph-generated objective measures and Lx waveforms proved sensitive, reliable, and clinically applicable for objective voice analysis.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the strongly felt need to share know-how, objectives, and results between areas that until then seemed quite distinct, such as bioengineering, medicine, and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial topics have grown and spread to other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

    Pan European Voice Conference - PEVOC 11

    The Pan European VOice Conference (PEVOC) was born in 1995 and therefore in 2015 it celebrates the 20th anniversary of its establishment: an important milestone that clearly expresses the strength and interest of the scientific community for the topics of this conference. The most significant themes of PEVOC are singing pedagogy and art, but also occupational voice disorders, neurology, rehabilitation, image and video analysis. PEVOC takes place in different European cities every two years (www.pevoc.org). The PEVOC 11 conference includes a symposium of the Collegium Medicorum Theatri (www.comet collegium.com