51 research outputs found

    Automated tracking of quantitative parameters from single line scanning of vocal folds: A case study of the 'messa di voce' exercise

    This article presents a novel application of the 'single line scanning' of the vocal fold vibrations (kymography) in singing pedagogy, particularly in a specific technical voice exercise: the 'messa di voce'. It aims at giving the singer relevant and valid short-term feedback. A user-friendly automatic analysis program makes possible a precise, immediate quantification of the essential physiological parameters characterizing the changes in glottal impedance, concomitant with the progressive increase and decrease of the lung pressure. The data provided by the program show a strong correlation with the hand-made measurements. Additional measurements such as subglottic pressure and flow glottography by inverse filtering can be meaningfully correlated with the data obtained from the kymographic images
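    As a rough illustration of the single line scanning idea in the abstract above, the sketch below stacks one scan line from each video frame into a kymogram and measures the glottal width on each line. The function names, the dark-pixel threshold, the "open fraction" parameter, and the use of nested lists as grayscale frames are all assumptions made for illustration; the abstract does not describe the analysis program at this level of detail.

```python
def build_kymogram(frames, row):
    """Stack the same scan line from every frame; time runs down the rows."""
    return [list(frame[row]) for frame in frames]


def glottal_width(scan_line, threshold):
    """Count pixels darker than `threshold` (assumption: the open glottis is dark)."""
    return sum(1 for pixel in scan_line if pixel < threshold)


def open_fraction(kymogram, threshold):
    """Fraction of scan lines on which the glottis is open (hypothetical parameter)."""
    if not kymogram:
        return 0.0
    widths = [glottal_width(line, threshold) for line in kymogram]
    return sum(1 for w in widths if w > 0) / len(widths)


# Synthetic 3x5 grayscale frames; the middle row (index 1) is the scan line.
closed = [[200] * 5, [200] * 5, [200] * 5]
half_open = [[200] * 5, [200, 200, 30, 200, 200], [200] * 5]
wide_open = [[200] * 5, [200, 30, 30, 30, 200], [200] * 5]
frames = [closed, half_open, wide_open, half_open]

kymo = build_kymogram(frames, row=1)
widths = [glottal_width(line, threshold=100) for line in kymo]
print(widths)                              # glottal width per frame, in pixels
print(open_fraction(kymo, threshold=100))  # share of frames with an open glottis
```

    Real kymographic images would need calibration and noise handling that this toy example leaves out.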

    Segmentation of the glottal space from laryngeal images using the watershed transform

    The present work describes a new method for the automatic detection of the glottal space from laryngeal images obtained either with high speed or with conventional video cameras attached to a laryngoscope. The detection is based on the combination of several relevant techniques in the field of digital image processing. The image is segmented with a watershed transform followed by a region merging, while the final decision is taken using a simple linear predictor. This scheme has successfully segmented the glottal space in all the test images used. The method presented can be considered a generalist approach for the segmentation of the glottal space because, in contrast with other methods found in literature, this approach does not need either initialization or finding strict environmental conditions extracted from the images to be processed. Therefore, the main advantage is that the user does not have to outline the region of interest with a mouse click. In any case, some a priori knowledge about the glottal space is needed, but this a priori knowledge can be considered weak compared to the environmental conditions fixed in former works

    Detection of the glottal space in laryngeal images using the Watershed transform and JND Merging

    This article describes a new method for detecting the glottal space in laryngeal images obtained from high- or low-speed videos. The detection process owes its effectiveness to the combination of several highly relevant techniques from the field of digital image processing. One of these techniques is the Watershed transform, which, together with several types of Merging and a final linear prediction step, makes automatic detection possible in 99% of the images analyzed. The strength of the method is increased by the absence of any kind of initialization and by not requiring strict conditions on the characteristics of the images to be processed. Naturally, it is important that the algorithm incorporate a priori information about the glottal space, but this knowledge is quite loose compared with the conditions imposed by other works that also attempt the segmentation

    Automatic quantification of vocal cord paralysis - an application of fibre-optic endoscopy video processing

    Full movement of the vocal cords is necessary for life sustaining functions. To enable correct diagnosis of reduced vocal cord motion and thereby potentially enhance treatment outcomes, it is proposed to objectively determine the degree of vocal cord paralysis in contrast to the current clinical practice of subjective evaluation. Our study shows that quantitative assessment can be achieved using optical flow based motion estimation of the opening and closing movements of the vocal cords. The novelty of the proposed method lies in the automatic processing of fibre-optic endoscopy videos to derive an objective measure for the degree of paralysis, without the need for high-end data acquisition systems such as high speed cameras or stroboscopy. Initial studies with three video samples yield promising results and encourage further investigation of vocal cord paralysis using this technique

    Automatic Segmentation of Glottal Space from Video Images Based on Mathematical Morphology and the Hough Transform

    Vocal disorders arise directly from the physical shape of the vocal cords. Videostroboscopic imaging provides doctors with valuable information about the physical shape of the vocal cords and about the way these cords move. Segmentation of the glottal space is necessary in order to characterize morphological disorders of the vocal folds. One of the main problems with existing methods is their low accuracy. To solve this problem, this article presents an automatic method based on Mathematical Morphology edge detection and the Hough transform to extract the glottal space from videostroboscopic images. The method was compared with the histogram and active contours methods, and the findings showed that the proposed method yields better results. DOI: http://dx.doi.org/10.11591/ijece.v2i2.21

    Beyond writing: The development of literacy in the Ancient Near East

    Previous discussions of the origins of writing in the Ancient Near East have not incorporated the neuroscience of literacy, which suggests that when southern Mesopotamians wrote marks on clay in the late-fourth millennium, they inadvertently reorganized their neural activity, a factor in manipulating the writing system to reflect language, yielding literacy through a combination of neurofunctional change and increased script fidelity to language. Such a development appears to take place only with a sufficient demand for writing and reading, such as that posed by a state-level bureaucracy; the use of a material with suitable characteristics; and the production of marks that are conventionalized, handwritten, simple, and non-numerical. From the perspective of Material Engagement Theory, writing and reading represent the interactivity of bodies, materiality, and brains: movements of hands, arms, and eyes; clay and the implements used to mark it and form characters; and vision, motor planning, object recognition, and language. Literacy is a cognitive change that emerges from and depends upon the nexus of interactivity of the components

    Glottal opening and closing events investigated by electroglottography and super-high-speed video recordings

    Previous research has suggested that the peaks in the first derivative (dEGG) of the electroglottographic (EGG) signal are good approximate indicators of the events of glottal opening and closing. These findings were based on high-speed video (HSV) recordings with frame rates 10 times lower than the sampling frequencies of the corresponding EGG data. The present study attempts to corroborate these previous findings, utilizing super-HSV recordings. The HSV and EGG recordings (sampled at 27 and 44 kHz, respectively) of an excised canine larynx phonation were synchronized by an external TTL signal to within 0.037 ms. Data were analyzed by means of glottovibrograms, digital kymograms, the glottal area waveform and the vocal fold contact length (VFCL), a new parameter representing the time-varying degree of 'zippering' closure along the anterior-posterior (A-P) glottal axis. The temporal offsets between glottal events (depicted in the HSV recordings) and dEGG peaks in the opening and closing phase of glottal vibration ranged from 0.02 to 0.61 ms, amounting to 0.24-10.88% of the respective glottal cycle durations. All dEGG double peaks coincided with vibratory A-P phase differences. In two out of the three analyzed video sequences, peaks in the first derivative of the VFCL coincided with dEGG peaks, again co-occurring with A-P phase differences. The findings suggest that dEGG peaks do not always coincide with the events of glottal closure and initial opening. Vocal fold contacting and de-contacting do not occur at infinitesimally small instants of time, but extend over a certain interval, particularly under the influence of A-P phase differences
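    The quantities in this abstract can be illustrated with a short sketch: a first-difference dEGG, the location of its strongest positive and negative peaks, and an offset expressed as a percentage of the glottal cycle. The function names and the synthetic signal are hypothetical, and the sketch assumes an EGG signal whose value rises with vocal fold contact, so the positive dEGG peak marks closing and the negative peak marks opening. It also shows that one frame period at 27 kHz is about 0.037 ms, which equals the synchronization tolerance quoted above; whether the authors derived the tolerance this way is an assumption.

```python
def first_difference(signal, fs_hz):
    """Approximate the time derivative of `signal` sampled at `fs_hz`."""
    return [(b - a) * fs_hz for a, b in zip(signal, signal[1:])]


def degg_peak_indices(egg, fs_hz):
    """Return (closing_index, opening_index) of the strongest dEGG peaks.

    Assumption: EGG rises with contact, so max dEGG ~ closing, min dEGG ~ opening.
    """
    degg = first_difference(egg, fs_hz)
    closing = max(range(len(degg)), key=degg.__getitem__)
    opening = min(range(len(degg)), key=degg.__getitem__)
    return closing, opening


def offset_percent(offset_ms, cycle_ms):
    """Express a temporal offset as a percentage of the glottal cycle duration."""
    return 100.0 * offset_ms / cycle_ms


FS_EGG_HZ = 44_000   # EGG sampling rate from the abstract
FS_HSV_HZ = 27_000   # video frame rate from the abstract

# One synthetic glottal cycle: fast contact rise, slower contact fall.
egg = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.5, 0.2, 0.0, 0.0]
closing_idx, opening_idx = degg_peak_indices(egg, FS_EGG_HZ)
closing_ms = 1000.0 * closing_idx / FS_EGG_HZ
opening_ms = 1000.0 * opening_idx / FS_EGG_HZ

frame_period_ms = 1000.0 / FS_HSV_HZ   # duration of one video frame
print(closing_idx, opening_idx)
print(round(frame_period_ms, 3))
print(offset_percent(0.5, 5.0))        # hypothetical offset and cycle duration
```

    A single global maximum and minimum is enough for one synthetic cycle; continuous recordings would need per-cycle peak picking, and the double peaks mentioned in the abstract would need more careful handling.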

    ANALYSIS OF VOCAL FOLD KINEMATICS USING HIGH SPEED VIDEO

    Vocal folds are twin infoldings of mucous membrane stretched horizontally across the larynx. They vibrate, modulating the constant air flow initiated from the lungs. The pulsating pressure wave blowing through the glottis is thus the source for voiced speech production. The study of vocal fold dynamics during voicing is critical for the treatment of voice pathologies. Since the vocal folds move at 100-350 cycles per second, their visual inspection is currently done by stroboscopy, which merges information from multiple cycles to present an apparent motion. High Speed Digital Laryngeal Imaging (HSDLI), with a temporal resolution of up to 10,000 frames per second, has been established as better suited for assessing vocal fold vibratory function through direct recording. But the widespread use of HSDLI is limited by a lack of consensus on modalities such as the features to be examined. Image processing techniques that circumvent the tedious and time-consuming examination of large volumes of recordings still have room for improvement. Fundamental questions, such as the required frame rate or resolution for the recordings, are still not adequately answered. HSDLI cannot provide absolute physical measurements of anatomical features and vocal fold displacement. This work addresses these challenges through improved signal processing. A vocal fold edge extraction technique with subpixel accuracy, suited even to the hard-to-record pediatric population, is developed first. The algorithm, which is equally applicable to pediatric and adult subjects, is implemented to facilitate user inspection and intervention. Objective features describing the fold dynamics, extracted from the edge displacement waveform, are proposed and analyzed on a diverse dataset of healthy males, females and children. The sampling and quantization noise present in the recordings is analyzed, and methods to mitigate it are investigated. Customized Kalman smoothing and spline interpolation of the displacement waveform are found to improve the stability of feature estimation. The relationship between frame rate, spatial resolution and vibration for efficient capturing of information is derived. Finally, to address the inability to obtain physical measurements, a structured light projection calibrated with respect to the endoscope is prototyped
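    To make the frame-rate question in this abstract concrete, the sketch below computes how many video frames fall within one vibratory cycle for the figures quoted (100 to 350 cycles per second, up to 10,000 frames per second). It is simple arithmetic, not the relationship derived in the work itself, which the abstract does not state; the function names and the `frames_needed` parameter are hypothetical.

```python
def frames_per_cycle(frame_rate_fps, f0_hz):
    """Number of video frames captured during one vocal fold vibration cycle."""
    return frame_rate_fps / f0_hz


def min_frame_rate(f0_hz, frames_needed):
    """Frame rate required to capture `frames_needed` frames per cycle.

    `frames_needed` is a hypothetical target chosen by the user.
    """
    return f0_hz * frames_needed


HSDLI_FPS = 10_000  # upper temporal resolution quoted in the abstract

low = frames_per_cycle(HSDLI_FPS, 100)    # slowest vibration in the quoted range
high = frames_per_cycle(HSDLI_FPS, 350)   # fastest vibration in the quoted range
print(low, round(high, 1))

# Example: a hypothetical target of 20 frames per cycle at 350 Hz.
print(min_frame_rate(350, 20))
```

    Higher fundamental frequencies leave fewer frames per cycle at a fixed frame rate, which is one reason the required frame rate depends on the population being recorded.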