14 research outputs found

    Normal Development of Voice

    This fully revised and extended second edition provides a comprehensive, up-to-date overview of quantitative measurement in the complex field of voice. Important objective parameters of normal voice development are assessed, which is especially relevant when pathological deviations have to be recognized and defined. The book describes different qualities of normal voice development in terms of measurable parameters. It highlights the hormonal changes that have a considerable influence on the physical development of boys and girls, and shows how the voice transition can be predicted statistically. The extent to which hormones affect voice development in the two genders is made clear through the observation of a number of parameters. In this second edition, the focus is extended to include high-speed video images and further discussion. Possible interesting topics for further research are also emphasized. This book will be a valuable resource for laryngologists, phoniatricians, and teachers in their daily work. This is an open access book.
    The technical measurement of individual parameters in an area as complex as music and song has achieved acceptance only in recent years. However important objective parameters of normal voice development may be, they are especially so when pathological deviations have to be recognised and defined. It is nevertheless also possible to a certain extent to describe different qualities of normal voice development in terms of measurable parameters. Hormonal changes have a considerable influence on the physical and mental development of boys and girls. The extent to which this influence affects voice development in the two sexes will be made clear in this work through the observation of a number of parameters. I hope that this will stimulate further investigations of this topic. Possible interesting topics for further research are emphasised in the text. Working with adolescents and documenting their vocal development has given me a lot of pleasure. Colleagues with different medical specialities have supported me in this task. The practical significance of this work has shown itself in the way the results obtained (the graphs and tables) are used today by laryngologists, phoniatricians and music teachers in their daily work, and the determination of hormonal levels in the course of puberty has been introduced as a routine in choirs

    A single latent channel is sufficient for biomedical glottis segmentation

    Glottis segmentation is a crucial step to quantify endoscopic footage in laryngeal high-speed videoendoscopy. Recent advances in deep neural networks for glottis segmentation allow for a fully automatic workflow. However, the integral parts of these deep segmentation networks are not yet well understood, and understanding their inner workings is crucial for acceptance in clinical practice. Here, we show through systematic ablations that a single latent channel as a bottleneck layer is sufficient for glottal area segmentation. We further demonstrate that the latent space is an abstraction of the glottal area segmentation that relies on three spatially defined pixel subtypes, allowing for a transparent interpretation. We also provide evidence that the latent space is highly correlated with the glottal area waveform, can be encoded with four bits, and can be decoded using lean decoders while maintaining a high reconstruction accuracy. Our findings suggest that glottis segmentation is a task that can be highly optimized to yield very efficient and explainable deep neural networks, which is important for application in the clinic. We believe that online deep learning-assisted monitoring will be a game-changer in laryngeal examinations.
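    To give a sense of what "encoded with four bits" can mean for a one-channel latent signal, the following minimal sketch quantizes a trace to 16 uniform levels and reconstructs it. The trace is synthetic, and the function names, the value range, and the uniform quantization scheme are assumptions made for illustration; the abstract does not say how the encoding was performed.

```python
import math

LEVELS = 2 ** 4 - 1  # 4 bits give codes 0..15, i.e. 15 uniform steps


def quantize_4bit(values, lo, hi):
    """Clamp each value to [lo, hi] and map it to an integer code 0..15."""
    span = hi - lo
    return [round((min(max(v, lo), hi) - lo) / span * LEVELS) for v in values]


def dequantize_4bit(codes, lo, hi):
    """Map 4-bit codes back to approximate values in [lo, hi]."""
    return [lo + c / LEVELS * (hi - lo) for c in codes]


# Synthetic stand-in for a latent trace over 40 frames (hypothetical data).
trace = [0.5 + 0.5 * math.sin(2 * math.pi * t / 20) for t in range(40)]

codes = quantize_4bit(trace, 0.0, 1.0)
restored = dequantize_4bit(codes, 0.0, 1.0)
max_error = max(abs(a - b) for a, b in zip(trace, restored))
```

    With a uniform quantizer of this kind, the reconstruction error is bounded by half a step, here 1/30 of the value range.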

    A Hybrid Machine-Learning-Based Method for Analytic Representation of the Vocal Fold Edges during Connected Speech

    Investigating the phonatory processes in connected speech from high-speed videoendoscopy (HSV) demands the accurate detection of the vocal fold edges during vibration. The present paper proposes a new spatio-temporal technique to automatically segment vocal fold edges in HSV data during running speech. The HSV data were recorded from a vocally normal adult during a reading of the “Rainbow Passage.” The introduced technique combines an unsupervised machine-learning (ML) approach with an active contour modeling (ACM) technique, forming a hybrid approach. The hybrid method was implemented to capture the edges of the vocal folds on different HSV kymograms, extracted at various cross-sections of the vocal folds during vibration. The k-means clustering method, an ML approach, was first applied to the kymograms to identify the clustered glottal area, which in turn provided an initial contour for the ACM. The ACM algorithm was then used to precisely detect the glottal edges of the vibrating vocal folds. The developed algorithm was able to accurately track the vocal fold edges across frames with low computational cost and high robustness against image noise. This algorithm offers a fully automated tool for analyzing the vibratory features of vocal folds in connected speech.
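    The first stage described above, clustering a kymogram to obtain an initial contour, can be sketched as follows. This is a simplified illustration under stated assumptions: it clusters pixel intensities only, uses two clusters (dark glottal gap versus brighter tissue), and works on a tiny hand-made kymogram. The function names and data are hypothetical, and the active contour refinement step is not shown.

```python
def two_means(values, iterations=20):
    """1-D k-means with k=2; returns (dark_center, bright_center)."""
    dark_c, bright_c = float(min(values)), float(max(values))
    for _ in range(iterations):
        dark = [v for v in values if abs(v - dark_c) <= abs(v - bright_c)]
        bright = [v for v in values if abs(v - dark_c) > abs(v - bright_c)]
        new_dark = sum(dark) / len(dark) if dark else dark_c
        new_bright = sum(bright) / len(bright) if bright else bright_c
        if (new_dark, new_bright) == (dark_c, bright_c):
            break  # centers stopped moving
        dark_c, bright_c = new_dark, new_bright
    return dark_c, bright_c


def initial_edges(kymogram):
    """For each kymogram row (one time frame), return the first and last
    column assigned to the dark cluster, or None if the row has none."""
    flat = [v for row in kymogram for v in row]
    dark_c, bright_c = two_means(flat)
    edges = []
    for row in kymogram:
        cols = [i for i, v in enumerate(row)
                if abs(v - dark_c) <= abs(v - bright_c)]
        edges.append((cols[0], cols[-1]) if cols else None)
    return edges


# Hypothetical 4-frame kymogram: low intensities mark the open glottis.
toy_kymogram = [
    [200, 195, 190, 198, 205, 200, 196],  # closed phase
    [201, 196, 60, 40, 55, 199, 197],
    [199, 50, 35, 30, 45, 60, 200],
    [202, 197, 58, 42, 190, 201, 198],
]
edges = initial_edges(toy_kymogram)
```

    The per-row left and right columns form a rough contour that a subsequent edge-refinement method could take as its starting point.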

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and classification of vocal pathologies.

    Rethinking glottal midline detection

    A healthy voice is crucial for verbal communication and hence in daily as well as professional life. The basis for a healthy voice is the sound-producing vocal folds in the larynx. A hallmark of healthy vocal fold oscillation is the symmetric motion of the left and right vocal fold. Clinically, videoendoscopy is applied to assess the symmetry of the oscillation, which is evaluated subjectively. High-speed videoendoscopy, an emerging method that allows quantification of the vocal fold oscillation, is more commonly employed in research due to the amount of data and the complex, semi-automatic analysis. In this study, we provide a comprehensive evaluation of methods that detect the glottal midline fully automatically. We used a biophysical model to simulate different vocal fold oscillations, extended the openly available BAGLS dataset using manual annotations, utilized both simulations and annotated endoscopic images to train deep neural networks at different stages of the analysis workflow, and compared these to established computer vision algorithms. We found that classical computer vision algorithms perform well on detecting the glottal midline in glottis segmentation data, but are outperformed by deep neural networks on this task. We further suggest GlottisNet, a multi-task neural architecture featuring the simultaneous prediction of both the opening between the vocal folds and the symmetry axis. By fully automating segmentation and midline detection, it is a major step towards clinical applicability of quantitative, deep learning-assisted laryngeal endoscopy.
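    As an illustration of how a midline can be estimated from a glottis segmentation with classical means, the sketch below computes the principal axis of a binary mask from its second-order central moments. This is one generic approach, shown under the assumption that the midline roughly follows the major axis of the segmented area; it is not necessarily among the algorithms the study evaluated, and the function name and toy mask are hypothetical.

```python
import math


def principal_axis(mask):
    """Return (centroid_x, centroid_y, angle) for the foreground of a binary mask.

    angle is the orientation of the major axis in radians, measured from the
    x axis (columns) towards the y axis (rows).
    """
    pixels = [(x, y) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    # Second-order central moments of the foreground pixel coordinates.
    mu20 = sum((x - cx) ** 2 for x, _ in pixels) / n
    mu02 = sum((y - cy) ** 2 for _, y in pixels) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels) / n
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return cx, cy, angle


# Hypothetical segmentation: a tall, slightly bulging glottal area.
toy_mask = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
]
cx, cy, angle = principal_axis(toy_mask)
```

    For this symmetric toy mask the axis passes through the centroid and runs along the rows direction, that is, vertically in the image.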

    Cervical Auscultation for the Identification of Swallowing Difficulties

    Swallowing difficulties, commonly referred to as dysphagia, affect thousands of Americans every year. They have a multitude of causes, but in general they are known to increase the risk of aspiration when swallowing, in addition to other physiological effects. Cervical auscultation has recently been applied to detect such difficulties non-invasively, and various techniques for analysis and processing of the recorded signals have been proposed. We attempted to further this research in three key areas. First, we characterized swallows with regard to a multitude of time, frequency, and time-frequency features, paying special attention to the differences between swallows from healthy adults and safe dysphagic swallows, as well as between safe and unsafe dysphagic swallows. Second, we attempted to use deep belief networks to classify these states automatically and without the aid of a concurrent videofluoroscopic examination. Finally, we sought to improve some of the signal processing techniques used in this field. We implemented the DBSCAN algorithm to better segment our physiological signals, and we applied the matched complex wavelet transform to cervical auscultation data in order to improve its quality for mathematical analysis.
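    To illustrate how density-based clustering can segment a signal, the sketch below runs a small one-dimensional DBSCAN on the time stamps of samples that are assumed to have exceeded some activity threshold. Dense runs of time stamps become segments and isolated ones are treated as noise. The time stamps, the parameter values, and the choice of clustering time stamps at all are assumptions for illustration; the abstract does not describe how DBSCAN was applied to the recordings.

```python
def dbscan_1d(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if abs(points[j] - points[i]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is a core point, keep expanding
    return labels


def segments(points, labels):
    """Return (start, end) of each cluster, ordered by cluster id."""
    spans = {}
    for p, lab in zip(points, labels):
        if lab == -1:
            continue
        lo, hi = spans.get(lab, (p, p))
        spans[lab] = (min(lo, p), max(hi, p))
    return [spans[k] for k in sorted(spans)]


# Hypothetical time stamps (ms) of above-threshold samples.
active_ms = [100, 120, 150, 180, 1500, 2000, 2030, 2050, 2090]
labels = dbscan_1d(active_ms, eps=50, min_pts=3)
found = segments(active_ms, labels)
```

    In this toy input the isolated sample at 1500 ms is labeled as noise, and the two dense runs are reported as separate segments.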