7 research outputs found

    Segmentation of the glottal space from laryngeal images using the watershed transform

    The present work describes a new method for the automatic detection of the glottal space in laryngeal images obtained with either high-speed or conventional video cameras attached to a laryngoscope. The detection combines several established digital image processing techniques: the image is segmented with a watershed transform followed by region merging, and the final decision is taken using a simple linear predictor. This scheme successfully segmented the glottal space in all the test images used. The method can be considered a general approach to glottal space segmentation because, in contrast with other methods in the literature, it requires neither initialization nor strict environmental conditions in the images to be processed. The main advantage is therefore that the user does not have to mark the region of interest with a mouse click. Some a priori knowledge about the glottal space is still needed, but it is weak compared to the environmental conditions imposed in earlier works.
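
    The abstract does not say which criterion drives the region merging step, so the following is only a minimal sketch under an assumed rule: 4-adjacent regions are merged when their mean grey levels differ by at most `max_diff`. The function name `merge_regions`, the threshold parameter, and the toy label image are hypothetical and stand in for the output of a real watershed transform.

```python
def merge_regions(labels, image, max_diff):
    """Greedily merge 4-adjacent regions with similar mean grey level.

    labels: 2D list of integer region labels (e.g. from a watershed step).
    image:  2D list of grey levels, same shape as labels.
    max_diff: assumed merge criterion (maximum difference of region means).
    """
    rows, cols = len(labels), len(labels[0])

    # Per-region grey-level sum and pixel count.
    total, count = {}, {}
    for r in range(rows):
        for c in range(cols):
            lab = labels[r][c]
            total[lab] = total.get(lab, 0) + image[r][c]
            count[lab] = count.get(lab, 0) + 1

    # Union-find structure: each label starts as its own root.
    parent = {lab: lab for lab in total}

    def find(lab):
        while parent[lab] != lab:
            parent[lab] = parent[parent[lab]]  # path compression
            lab = parent[lab]
        return lab

    # Pairs of labels that touch horizontally or vertically.
    pairs = set()
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and labels[r][c] != labels[rr][cc]:
                    pairs.add(tuple(sorted((labels[r][c], labels[rr][cc]))))

    # Repeat until a full pass over the adjacency list merges nothing.
    merged = True
    while merged:
        merged = False
        for a, b in sorted(pairs):
            ra, rb = find(a), find(b)
            if ra == rb:
                continue
            if abs(total[ra] / count[ra] - total[rb] / count[rb]) <= max_diff:
                parent[rb] = ra
                total[ra] += total[rb]
                count[ra] += count[rb]
                merged = True

    return [[find(lab) for lab in row] for row in labels]


# Toy example: regions 1 and 2 are both dark, region 3 is bright.
toy_labels = [[1, 1, 2, 2],
              [1, 1, 2, 2],
              [3, 3, 3, 3]]
toy_image = [[10, 12, 11, 13],
             [10, 12, 11, 13],
             [200, 200, 200, 200]]
toy_merged = merge_regions(toy_labels, toy_image, max_diff=5)
```

    In this toy case the two dark regions collapse into one while the bright region stays separate. The linear predictor that takes the final decision in the paper is not modelled here.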

    Preprocesado Avanzado de Imágenes Laríngeas para Mejorar la Segmentación del Área Glotal (Advanced Preprocessing of Laryngeal Images to Improve Glottal Area Segmentation)

    This work describes an advanced image preprocessing method that improves the automatic detection of the glottal space in laryngeal images. The system can be applied to images from high-speed recordings or from stroboscopic (low-speed) recordings, although the greatest benefit is seen with the latter because they are of lower quality. The new preprocessing technique resolves certain segmentation failures produced by an earlier system based on the watershed transform and merging. In summary, it fixes or improves 38% of the glottis delineation errors that appeared in 29 of the 111 segmented images.

    Assessment of vocal folds phonation by means of computer analysis of laryngovideostroboscopic images – a pilot study

    Introduction. Medical imaging techniques enable novel ways of visualising the vocal folds during phonation and the definition of parameters that can aid the otolaryngologist/phoniatrician in a more precise diagnosis of voice disorders. Aim. Application of computer vision algorithms to the qualitative and quantitative analysis of the phonatory vibrations of the vocal folds. Materials and methods. Videostroboscopic examinations of the glottis were carried out for 15 individuals divided into 3 groups of five subjects each: with diagnosed nodules, with glottal insufficiency, and with no voice disorders. Image pre-processing and image segmentation algorithms were applied. Signals of the glottal area for consecutive phonation cycles were derived. Glottovibrograms, which provide a spatio-temporal visualisation of the vibrating vocal folds, were also built. Results. The geometric parameters of the glottal area were determined for each image in the stroboscopic video. The average width profiles of the glottal area in the closure phase of the glottal cycle were computed for each group of examined patients. Conclusions. The pilot study confirmed that the developed image analysis methods can be applied to the precise imaging and quantitative assessment of the phonatory motions of the vocal folds from videostroboscopic recordings.
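
    The glottal area signal and the glottovibrogram mentioned in this abstract can be illustrated with a small sketch. It assumes that each frame has already been segmented into a binary mask (1 for glottis pixels, 0 elsewhere), that the glottal axis runs vertically, and that the width in a row is simply the count of glottis pixels in that row. The function names and the toy frames are hypothetical, not taken from the paper.

```python
def glottal_area(mask):
    """Number of glottis pixels in one binary mask (list of rows of 0/1)."""
    return sum(sum(row) for row in mask)


def width_profile(mask):
    """Glottal width in each image row, assuming a vertical glottal axis."""
    return [sum(row) for row in mask]


def area_signal(frames):
    """Glottal area for each frame of the sequence, in frame order."""
    return [glottal_area(mask) for mask in frames]


def glottovibrogram(frames):
    """Stack per-frame width profiles into a 2D map.

    Rows of the result follow the position along the glottal axis,
    columns follow time (frame index).
    """
    profiles = [width_profile(mask) for mask in frames]
    return [list(column) for column in zip(*profiles)]


# Three toy 3x3 frames: closed, partly open, fully open glottis.
frame_closed = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
frame_partial = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]
frame_open = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
toy_frames = [frame_closed, frame_partial, frame_open]

toy_area = area_signal(toy_frames)
toy_gvg = glottovibrogram(toy_frames)
```

    Averaging the width profiles of the frames in the closure phase, group by group, would give the kind of profile reported in the results, although the exact definition used by the authors is not stated in the abstract.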

    Review of Research on Speech Technology: Main Contributions From Spanish Research Groups

    In the last two decades, research on speech technology in Spain has increased considerably, mainly because of more funding from European, Spanish and local institutions and because of a growing interest in these technologies for developing new services and applications. This paper reviews the main areas of speech technology addressed by research groups in Spain, their main contributions in recent years and their current focus of interest. The description is organised into five main areas: audio processing including speech, speaker characterization, speech and language processing, text-to-speech conversion and spoken language applications. The paper also introduces the Spanish Network of Speech Technologies (RTTH, Red Temática en Tecnologías del Habla) as the research network that includes almost all the researchers working in this area, presenting some figures, its objectives and its main activities in recent years.

    Segmentación de la glotis en imágenes laríngeas usando snakes (Glottis segmentation in laryngeal images using snakes)

    The present work describes a new methodology for the automatic detection of the glottal space in laryngeal images taken from 15 videos recorded with videostroboscopic equipment by the ENT service of the Gregorio Marañón Hospital in Madrid. The system is based on active contour models (snakes). For pre-processing, the algorithm combines traditional techniques (thresholding and median filtering) with more sophisticated ones such as anisotropic filtering, which yields an image suitable for snakes. The threshold is set to 85% of the maximum peak of the image histogram; above this value the pixel information is not relevant. The anisotropic filter makes it possible to distinguish two intensity levels, one for the background and one for the glottis. Initialization is based on the magnitude of the gradient vector flow (GVF) field, which ensures that the initial contour is selected automatically. The performance of the algorithm is evaluated using the Pratt coefficient and compared against a manual segmentation and against another automatic method based on the watershed transform.
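
    The Pratt coefficient used for validation can be sketched as follows. This is the usual textbook form of Pratt's figure of merit, in which every detected contour pixel contributes 1 / (1 + alpha * d^2), where d is its distance to the nearest reference contour pixel, and the sum is normalised by the larger of the two contour sizes. The scaling constant alpha = 1/9 is the conventional choice and an assumption here, since the abstract does not give it; the function name and the toy contours are hypothetical.

```python
def pratt_fom(detected, reference, alpha=1.0 / 9.0):
    """Pratt's figure of merit between two contours.

    detected, reference: lists of (row, col) contour pixels.
    alpha: scaling constant (assumed conventional value 1/9).
    Returns a value in (0, 1]; 1 means the contours coincide.
    """
    if not detected or not reference:
        return 0.0
    total = 0.0
    for r, c in detected:
        # Squared Euclidean distance to the closest reference pixel.
        d2 = min((r - rr) ** 2 + (c - cc) ** 2 for rr, cc in reference)
        total += 1.0 / (1.0 + alpha * d2)
    return total / max(len(detected), len(reference))


# Toy contours: a manual reference and an automatic result shifted by one row.
manual_contour = [(0, 0), (0, 1), (0, 2)]
shifted_contour = [(1, 0), (1, 1), (1, 2)]

fom_identical = pratt_fom(manual_contour, manual_contour)
fom_shifted = pratt_fom(shifted_contour, manual_contour)
```

    With these toy contours a perfect match scores 1.0 and a one-pixel shift scores 0.9, so the measure penalises both displaced and missing or spurious contour pixels.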

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of the clinical diagnosis and classification of vocal pathologies.