10 research outputs found

    Automatic quantification of vocal cord paralysis - an application of fibre-optic endoscopy video processing

    Full movement of the vocal cords is necessary for life-sustaining functions. To enable correct diagnosis of reduced vocal cord motion, and thereby potentially improve treatment outcomes, it is proposed to determine the degree of vocal cord paralysis objectively, in contrast to the current clinical practice of subjective evaluation. Our study shows that quantitative assessment can be achieved using optical-flow-based motion estimation of the opening and closing movements of the vocal cords. The novelty of the proposed method lies in the automatic processing of fibre-optic endoscopy videos to derive an objective measure of the degree of paralysis, without the need for high-end data acquisition systems such as high-speed cameras or stroboscopy. Initial studies with three video samples yield promising results and encourage further investigation of vocal cord paralysis using this technique.

    Assessment of vocal folds phonation by means of computer analysis of laryngovideostroboscopic images – a pilot study

    Introduction. Computer image analysis techniques enable new ways of visualising the vocal folds during phonation and the definition of objective parameters that can aid the otolaryngologist/phoniatrician in a more precise diagnosis of voice disorders. Aim. Application of computer vision algorithms to the qualitative and quantitative analysis of vocal fold vibrations during phonation. Materials and methods. Videostroboscopic examinations of the glottis were carried out for 15 individuals divided into three groups of five subjects each: with diagnosed nodules, with glottal insufficiency, and with no voice disorders. Image pre-processing and image segmentation algorithms were applied. Signals of the glottal area for consecutive phonation cycles were derived. Glottovibrograms, which provide a spatio-temporal visualisation of the vibrating vocal folds, were also built. Results. The geometric parameters of the glottal area were determined for each image in the stroboscopic video. The average width profiles of the glottal area in the closure phase of the glottal cycle were computed for each group of examined patients. Conclusions. The pilot study confirmed that computer-aided image analysis methods can be applied to the qualitative and quantitative analysis of videostroboscopic images showing the phonatory motion of the vocal folds.
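
The glottal area signal and the averaged width profile described in this abstract can be illustrated with a minimal sketch. It assumes each video frame has already been segmented into a binary mask (1 = glottis pixel), and the function names and toy frames are hypothetical, not taken from the study.

```python
from typing import List

Mask = List[List[int]]  # one segmented frame: 1 = glottis pixel, 0 = tissue


def area_waveform(frames: List[Mask]) -> List[int]:
    """Glottal area (pixel count) for each frame of the sequence."""
    return [sum(sum(row) for row in mask) for mask in frames]


def mean_width_profile(frames: List[Mask]) -> List[float]:
    """Average glottal width on each image row over the given frames."""
    n_rows = len(frames[0])
    return [sum(sum(mask[r]) for mask in frames) / len(frames)
            for r in range(n_rows)]


# Hypothetical 4x5 frames: open, nearly closed, fully closed.
open_frame = [[0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 1, 1, 0],
              [0, 0, 1, 0, 0]]
half_frame = [[0, 0, 1, 0, 0],
              [0, 0, 1, 0, 0],
              [0, 0, 1, 0, 0],
              [0, 0, 0, 0, 0]]
closed_frame = [[0] * 5 for _ in range(4)]

waveform = area_waveform([open_frame, half_frame, closed_frame])
profile = mean_width_profile([open_frame, half_frame])
print(waveform)  # one area value per frame
print(profile)   # one mean width per image row
```

In practice the frames passed to `mean_width_profile` would be those belonging to the closure phase of each cycle; how that phase is selected is left open here.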

    A Hybrid Machine-Learning-Based Method for Analytic Representation of the Vocal Fold Edges during Connected Speech

    Investigating the phonatory processes in connected speech from high-speed videoendoscopy (HSV) demands accurate detection of the vocal fold edges during vibration. The present paper proposes a new spatio-temporal technique to automatically segment vocal fold edges in HSV data during running speech. The HSV data were recorded from a vocally normal adult during a reading of the “Rainbow Passage.” The introduced technique is a hybrid approach that combines unsupervised machine learning (ML) with active contour modeling (ACM). The hybrid method was implemented to capture the edges of the vocal folds on different HSV kymograms, extracted at various cross-sections of the vocal folds during vibration. The k-means clustering method, an ML approach, was first applied to the kymograms to identify the clustered glottal area, which in turn provided an initial contour for the ACM. The ACM algorithm was then used to precisely detect the glottal edges of the vibrating vocal folds. The developed algorithm was able to accurately track the vocal fold edges across frames with low computational cost and high robustness against image noise. This algorithm offers a fully automated tool for analyzing the vibratory features of vocal folds in connected speech.
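
The initialization step of such a hybrid method might look roughly like the sketch below. It assumes a kymogram stored as rows = position across the glottis and columns = time, and clusters raw pixel intensities with a plain two-cluster 1-D k-means; the paper's actual features, cluster count, and the subsequent active contour refinement are not reproduced, and all names are hypothetical.

```python
from typing import List, Tuple


def kmeans_two_levels(values: List[float], iters: int = 20) -> Tuple[float, float]:
    """1-D k-means with k=2; returns (dark_centre, bright_centre)."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        dark = [v for v in values if abs(v - lo) <= abs(v - hi)]
        bright = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not bright:  # all pixels identical, nothing to split
            break
        new_lo, new_hi = sum(dark) / len(dark), sum(bright) / len(bright)
        if (new_lo, new_hi) == (lo, hi):  # centres stopped moving
            break
        lo, hi = new_lo, new_hi
    return lo, hi


def initial_edges(kymogram: List[List[float]]) -> List[Tuple[int, int]]:
    """Per time column: (first, last) row in the dark cluster, (-1, -1) if none."""
    flat = [v for row in kymogram for v in row]
    dark_c, bright_c = kmeans_two_levels(flat)
    edges = []
    for t in range(len(kymogram[0])):
        rows = [r for r in range(len(kymogram))
                if abs(kymogram[r][t] - dark_c) <= abs(kymogram[r][t] - bright_c)]
        edges.append((rows[0], rows[-1]) if rows else (-1, -1))
    return edges


# Hypothetical 5x4 kymogram: the dark (low-intensity) band is the open glottis.
kymogram = [
    [200, 210, 205, 200],
    [190,  30, 200, 195],
    [ 20,  25,  35, 190],
    [185,  40, 195, 200],
    [205, 200, 210, 205],
]
edges = initial_edges(kymogram)
print(edges)
```

The resulting per-column edge pairs would serve only as a starting contour; an active contour model would then move them onto the true vocal fold edges.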

    Vocal Fold Analysis From High Speed Videoendoscopic Data

    High-speed videoendoscopy (HSV) of the larynx far surpasses the limits of videostroboscopy in evaluating vocal fold vibratory behavior by providing a much higher frame rate. HSV enables visualization of the vocal fold vibratory pattern within an actual glottic cycle. This very detailed information on vocal fold vibratory characteristics could be valuable for assessing vocal fold vibratory function in disordered voices and the effects of behavioral, medical, and surgical treatment procedures. In this work, we aim to address the problem of classifying voice disorders of varying etiology in four steps. Our methodology starts with glottis segmentation: given HSV data, the contour of the glottal opening area in each frame is acquired. These contours record the vibration track of the vocal folds. Next, we obtain a reliable glottal axis, which is necessary for computing certain vibratory features. The third step is feature extraction from the HSV data. In the last step, we perform classification based on the features obtained in step 3. In this study, we first propose a novel glottis segmentation method based on simplified dynamic programming, which proves to be efficient and accurate. In addition, we introduce a new approach for calculating the glottal axis. By comparing the proposed glottal axis determination method (modified linear regression) against state-of-the-art techniques, we demonstrate that our technique is more reliable. The focus then shifts to feature extraction and classification schemes. Eighteen different features are extracted and their discriminative power is evaluated using principal component analysis. A support vector machine and a neural network are implemented to classify three different types of vocal folds (normal vocal fold, unilateral vocal fold polyp, and unilateral vocal fold paralysis). The results demonstrate that the classification rates of four different tasks are all above 80%.
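
As a rough illustration of the glottal axis step, the sketch below fits a straight line through the per-row midpoints of a segmented glottis using ordinary least squares. This is a plain regression chosen for illustration, not the modified linear regression proposed in the work, and the mask and function name are hypothetical.

```python
from typing import List, Tuple

Mask = List[List[int]]  # 1 = glottis pixel, 0 = tissue


def glottal_axis(mask: Mask) -> Tuple[float, float]:
    """Fit column = slope * row + intercept through the glottal midpoints."""
    points = []
    for r, row in enumerate(mask):
        cols = [c for c, v in enumerate(row) if v]
        if cols:  # midpoint between left and right edge on this row
            points.append((r, (cols[0] + cols[-1]) / 2))
    if len(points) < 2:
        raise ValueError("need glottis pixels on at least two rows")
    n = len(points)
    mean_r = sum(r for r, _ in points) / n
    mean_c = sum(c for _, c in points) / n
    s_rr = sum((r - mean_r) ** 2 for r, _ in points)
    s_rc = sum((r - mean_r) * (c - mean_c) for r, c in points)
    slope = s_rc / s_rr
    return slope, mean_c - slope * mean_r


# Hypothetical mask of a glottis that slants one column per row.
mask = [[0, 1, 1, 1, 0, 0],
        [0, 0, 1, 1, 1, 0],
        [0, 0, 0, 1, 1, 1]]
slope, intercept = glottal_axis(mask)
print(slope, intercept)
```

The column is regressed on the row index because the glottal axis is usually closer to vertical than horizontal in the image, which keeps the slope finite.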

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The Models and Analysis of Vocal Emissions with Biomedical Applications (MAVEBA) workshop came into being in 1999 from the strongly felt need to share know-how, objectives, and results between areas that until then had seemed quite distinct, such as bioengineering, medicine, and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial topics have grown and spread to other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

    La voix humaine : vibrations, résonances, interactions pneumo-phono-résonantielles (The human voice: vibrations, resonances, pneumo-phono-resonance interactions)

    Humans combine respiratory, phonatory, and articulatory gestures to produce sounds that let them express themselves and communicate with their environment. For nearly half a century, the human voice has been modelled by source-filter theory, which has proven itself in fields as diverse as speech processing, voice analysis-synthesis, and speech recognition. The theory nevertheless shows its limits when attention turns to voice quality, naturalness in synthesis, vocal strain, dysphonia, and the development of functional voice disorders. Our research aims to extend this theoretical framework so that it can account for the full range of vocal gestures in speech, in particular those linked to vocal effort or to a change in voice quality. This poses a major theoretical obstacle: how can pneumo-phono-resonance interactions be integrated into the source-filter theory of human voice production? Through our research, we propose lines of thought on how to take them into account within the existing theoretical framework.

    Automatic glottal segmentation using local-based active contours and application to glottovibrography

    The use of high-speed videoendoscopy (HSV) for the assessment of vocal-fold vibrations dictates the development of efficient techniques for glottal image segmentation. We present a new glottal segmentation method using a local-based active contour framework. The use of local-based features and the exploitation of the vibratory pattern allow the method to deal effectively with image noise and with cases where the glottal area consists of multiple regions. A scheme for precise glottis localization is introduced, which facilitates the segmentation procedure. The method has been tested on a database of 60 HSV recordings. Comparisons with manual verification resulted in less than 1% difference in the average glottal area. These errors mainly come from detection failures in the posterior or anterior parts of the glottal area. Comparisons with automatic threshold-based glottal detection point to the necessity of complete frameworks for automatic detection. The glottovibrogram (GVG), a representation of glottal vibration, is also presented. This easily readable representation depicts the time-varying distance between the vocal-fold edges.
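
A glottovibrogram of the kind described can be sketched as a matrix whose rows follow the glottal axis and whose columns follow time, each entry holding the distance between the vocal-fold edges. The sketch below assumes binary segmentation masks and a glottal axis aligned with the image's vertical direction; both are simplifying assumptions, and the names are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # one segmented frame: 1 = glottis pixel, 0 = tissue


def edge_distance(row: List[int]) -> int:
    """Distance in pixels between the outermost glottal edges on one row."""
    cols = [c for c, v in enumerate(row) if v]
    return cols[-1] - cols[0] + 1 if cols else 0


def glottovibrogram(frames: List[Mask]) -> List[List[int]]:
    """GVG matrix: entry [r][t] is the edge distance on row r at frame t."""
    n_rows = len(frames[0])
    return [[edge_distance(frame[r]) for frame in frames]
            for r in range(n_rows)]


# Hypothetical 3x5 frames; the second row of frame0 holds two separate regions.
frame0 = [[0, 1, 1, 1, 0],
          [0, 1, 0, 1, 0],
          [0, 0, 1, 0, 0]]
frame1 = [[0, 0, 1, 0, 0],
          [0, 0, 1, 0, 0],
          [0, 0, 0, 0, 0]]
frame2 = [[0] * 5 for _ in range(3)]

gvg = glottovibrogram([frame0, frame1, frame2])
for line in gvg:
    print(line)
```

Measuring between the outermost edges, rather than counting pixels, keeps the distance meaningful when the glottal area on a row is split into several regions.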

    Examination tools for the endoscopic evaluation of the laryngeal adductor reflex

    Several reflexive mechanisms in the healthy human larynx protect the deeper respiratory tract from the intrusion of foreign particles, known as aspiration. The laryngeal adductor reflex (LAR), which leads to a rapid closure of the glottis, is one of these mechanisms. Disturbances of the LAR can therefore increase the likelihood of aspiration, a risk factor for potentially fatal pneumonia. Routine screening of the LAR is thus medically sensible in cases where a pathological reflex is suspected. Current LAR evaluation approaches rely on invasive, user-dependent, and/or untargeted methods. Moreover, reflex performance is currently assessed mainly qualitatively. To mitigate these disadvantages, an alternative method has been developed and initially tested at Hannover Medical School. This method, referred to as Microdroplet Impulse Testing of the LAR (MIT-LAR), is based on impacting the laryngeal mucosa with a droplet. By using a high-speed laryngoscope, combined with manual analysis of the recorded high-speed sequence showing the reflexive response, the LAR onset latency could be measured at a high temporal resolution. Although the MIT-LAR system represents technological progress over prior methods, it still offers potential for improvement regarding the reproducibility of LAR stimulation and the objectivity of the optical LAR analysis. Both droplet-based LAR stimulation and optical LAR analysis are addressed in the present interdisciplinary work. A novel droplet applicator enables the formation of a stable stimulation droplet with variable muzzle energy. A histological analysis of the droplet's lesion potential on porcine larynges does not yield any sign of tissue damage. Two stereoscopic high-speed laryngoscopes are designed and set up. In combination with the droplet applicator and an algorithm for approximating the droplet trajectory, they enable prediction of the droplet impact site. The prediction error of both laryngoscopic systems is evaluated in a laboratory setting: a value of (0.9 ± 0.6) mm is measured using the rod-lens-based system, and the fiber-based system yields a value of (1.3 ± 0.8) mm. Finally, a method for the automatic analysis of MIT-LAR sequences is developed and tested on a data set. This leads to the first computer-assisted measurement of the angular velocity of the vocal folds during the adduction phase of the human LAR. For complete and incomplete adduction, values of (891 ± 516) °/s and (421 ± 221) °/s are obtained, respectively. This constitutes an expansion of the state of medical knowledge.
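
How an angular velocity in °/s could be derived from a tracked glottal angle is sketched below using simple frame-to-frame differences. The angle series, the frame rate, and the function names are hypothetical; the thesis's actual tracking and averaging procedure is not described in the abstract and is not reproduced here.

```python
from typing import List


def angular_velocity(angles_deg: List[float], fps: float) -> List[float]:
    """Frame-to-frame rate of change of the glottal angle, in degrees/second."""
    return [(b - a) * fps for a, b in zip(angles_deg, angles_deg[1:])]


def mean_adduction_speed(angles_deg: List[float], fps: float) -> float:
    """Mean closing speed over the frame pairs in which the angle decreases."""
    closing = [-w for w in angular_velocity(angles_deg, fps) if w < 0]
    return sum(closing) / len(closing) if closing else 0.0


# Hypothetical glottal angles (degrees) sampled at a hypothetical 500 frames/s.
angles = [40.0, 40.0, 38.0, 36.0, 34.0]
fps = 500.0

velocities = angular_velocity(angles, fps)
speed = mean_adduction_speed(angles, fps)
print(velocities)
print(speed)
```

Real angle traces are noisy, so some smoothing before differencing would likely be needed; that step is omitted from this sketch.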