71 research outputs found

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held on a biennial basis, collect the scientific papers presented at the conference as both oral and poster contributions. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images, in support of clinical diagnosis and the classification of vocal pathologies.

    Estimating tremor in Vocal Fold Biomechanics for Neurological Disease characterisation

    Neurological Diseases (ND) affect larger segments of the aging population every year. Treatment depends on expensive, accurate, and frequent monitoring. It is well known that ND leave correlates in speech and phonation. The present work shows a method to detect alterations in vocal fold tension during phonation. These may appear either as hypertension or as cyclical tremor. Estimates of tremor may be produced by auto-regressive modeling of the vocal fold tension series in sustained phonation. The correlates obtained are a set of cyclicality coefficients, the frequency, and the root mean square amplitude of the tremor. Statistical distributions of these correlates obtained from a set of male and female subjects are presented. Results from five case studies of female voice are also given.
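
The auto-regressive idea in this abstract can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes a plain second-order model fitted by least squares, a hypothetical tension series sampled at a known rate (`sample_rate`), and it reads the tremor frequency from the angle of the model's complex pole pair. The function names are illustrative only.

```python
import math


def fit_ar2(series):
    """Least-squares fit of x[n] = a1*x[n-1] + a2*x[n-2] after mean removal."""
    mean = sum(series) / len(series)
    x = [v - mean for v in series]
    r11 = r12 = r22 = b1 = b2 = 0.0
    for n in range(2, len(x)):
        r11 += x[n - 1] * x[n - 1]
        r12 += x[n - 1] * x[n - 2]
        r22 += x[n - 2] * x[n - 2]
        b1 += x[n] * x[n - 1]
        b2 += x[n] * x[n - 2]
    det = r11 * r22 - r12 * r12
    if det == 0.0:
        raise ValueError("series is too short or degenerate for an AR(2) fit")
    # Solve the 2x2 normal equations in closed form.
    a1 = (b1 * r22 - b2 * r12) / det
    a2 = (r11 * b2 - r12 * b1) / det
    return a1, a2


def tremor_frequency(a1, a2, sample_rate):
    """Frequency in Hz of the oscillatory mode, or None if the poles are real."""
    if a1 * a1 + 4.0 * a2 >= 0.0:
        return None
    radius = math.sqrt(-a2)  # modulus of the complex pole pair
    cos_theta = max(-1.0, min(1.0, a1 / (2.0 * radius)))
    return math.acos(cos_theta) * sample_rate / (2.0 * math.pi)


def rms_amplitude(series):
    """Root mean square of the series around its mean."""
    mean = sum(series) / len(series)
    return math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))


# Hypothetical tension series: constant offset plus a 5 Hz tremor, 100 samples/s.
SAMPLE_RATE = 100.0
tension = [0.3 + 0.05 * math.sin(2 * math.pi * 5.0 * n / SAMPLE_RATE + 0.4)
           for n in range(200)]
a1, a2 = fit_ar2(tension)
print(tremor_frequency(a1, a2, SAMPLE_RATE), rms_amplitude(tension))
```

A second-order model can capture only one oscillatory component; a real tension series would likely need a higher order and some treatment of noise.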

    Objective dysphonia quantification in vocal fold paralysis: comparing nonlinear with classical measures

    Clinical acoustic voice recording analysis is usually performed using classical perturbation measures including jitter, shimmer and noise-to-harmonic ratios. However, restrictive mathematical limitations of these measures prevent analysis for severely dysphonic voices. Previous studies of alternative nonlinear random measures addressed wide varieties of vocal pathologies. Here, we analyze a single vocal pathology cohort, testing the performance of these alternative measures alongside classical measures.

We present pre- and post-operative voice analysis of unilateral vocal fold paralysis (UVFP) patients undergoing standard medialisation thyroplasty surgery, and of healthy controls, using jitter, shimmer and noise-to-harmonic ratio (NHR), and nonlinear recurrence period density entropy (RPDE), detrended fluctuation analysis (DFA) and correlation dimension. Systematizing the preparative editing of the recordings, we found that the novel measures were more stable, and hence more reliable, than the classical measures on healthy controls.

RPDE and jitter are sensitive to improvements pre- to post-operation. Shimmer, NHR and DFA showed no significant change (p > 0.05). All measures detect statistically significant and clinically important differences between controls and patients, both treated and untreated (p < 0.001, AUC > 0.7). Pre- to post-operation, GRBAS ratings show statistically significant and clinically important improvement in overall dysphonia grade (G) (AUC = 0.946, p < 0.001).

Re-calculating AUCs from other study data, we compare these results in terms of clinical importance. We conclude that, when preparative editing is systematized, nonlinear random measures may be useful UVFP treatment effectiveness monitoring tools, and there may be applications for other forms of dysphonia.
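
For readers unfamiliar with the classical measures and the AUC figures quoted in this abstract, here is a minimal sketch. It assumes the common "local" definitions of jitter and shimmer (mean absolute difference between consecutive cycles, divided by the mean) and computes AUC as the probability that a value from one group exceeds a value from the other. The study's exact definitions and software are not stated in the abstract, and the data below are invented for illustration.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean.

    Applied to glottal cycle lengths this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def auc(group_a, group_b):
    """Probability that a value from group_a exceeds one from group_b.

    Ties count as one half. The result equals the area under the ROC curve
    for separating the two groups with a threshold on the measure.
    """
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(group_a) * len(group_b))


# Invented example data: cycle lengths in seconds and per-subject scores.
periods = [0.010, 0.011, 0.010, 0.011]
jitter = local_perturbation(periods)
patients = [0.9, 0.8, 0.4]
controls = [0.5, 0.3, 0.2]
print(jitter, auc(patients, controls))
```

Both perturbation measures require reliably detected cycles, which is why they break down for severely dysphonic voices.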

    Improving automatic detection of obstructive sleep apnea through nonlinear analysis of sustained speech

    We present a novel approach for the detection of severe obstructive sleep apnea (OSA) based on patients' voices, introducing nonlinear measures to describe sustained speech dynamics. Nonlinear features were combined with state-of-the-art speech recognition systems using statistical modeling techniques (Gaussian mixture models, GMMs) over cepstral parameterization (MFCC) for both continuous and sustained speech. Tests were performed on a database including speech records from both severe OSA and control speakers. A 10% relative reduction in classification error was obtained for sustained speech when combining MFCC-GMM and nonlinear features, and 33% when fusing nonlinear features with both sustained and continuous MFCC-GMM. Accuracy reached 88.5%, allowing the system to be used in OSA early detection. Tests showed that nonlinear features and MFCCs are weakly correlated on sustained speech, but uncorrelated on continuous speech. Results also suggest the existence of nonlinear effects in OSA patients' voices, which should be found in continuous speech.
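
The percentages in this abstract are relative reductions in error, not absolute gains in accuracy. The arithmetic can be sketched as follows. The baseline error used here is a made-up figure for illustration only, since the abstract does not report the baseline error rates.

```python
def error_after_relative_reduction(baseline_error, reduction):
    """Error rate left after removing a fraction `reduction` of the baseline."""
    return baseline_error * (1.0 - reduction)


def relative_reduction(baseline_error, new_error):
    """Fraction of the baseline error that was removed."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical baseline: 20% classification error (not a figure from the study).
baseline = 0.20
improved = error_after_relative_reduction(baseline, 0.10)
print(improved)                                # about 0.18, i.e. 82% accuracy
print(relative_reduction(baseline, improved))  # about 0.10
```

Under this hypothetical baseline, a 10% relative reduction raises accuracy by two percentage points, not ten.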

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the particularly felt need of sharing know-how, objectives and results between areas that until then seemed quite distinct such as bioengineering, medicine and singing. MAVEBA deals with all aspects concerning the study of the human voice with applications ranging from the newborn to the adult and elderly. Over the years the initial issues have grown and spread also in other fields of research such as occupational voice disorders, neurology, rehabilitation, image and video analysis. MAVEBA takes place every two years in Firenze, Italy. This edition celebrates twenty-two years of uninterrupted and successful research in the field of voice analysis

    Accurate telemonitoring of Parkinson's disease symptom severity using nonlinear speech signal processing and statistical machine learning

    This study focuses on the development of an objective, automated method to extract clinically useful information from sustained vowel phonations in the context of Parkinson’s disease (PD). The aim is twofold: (a) differentiate PD subjects from healthy controls, and (b) replicate the Unified Parkinson’s Disease Rating Scale (UPDRS) metric, which provides a clinical impression of PD symptom severity. This metric spans the range 0 to 176, where 0 denotes a healthy person and 176 total disability. Currently, UPDRS assessment requires the physical presence of the subject in the clinic, is subjective because it relies on the clinical rater’s expertise, and is logistically costly for national health systems. Hence, the practical frequency of symptom tracking is typically confined to once every several months, hindering recruitment for large-scale clinical trials and under-representing the true time scale of PD fluctuations. We develop a comprehensive framework to analyze speech signals by: (1) extracting novel, distinctive signal features, (2) using robust feature selection techniques to obtain a parsimonious subset of those features, and (3a) differentiating PD subjects from healthy controls, or (3b) determining UPDRS using powerful statistical machine learning tools. Towards this aim, we also investigate 10 existing fundamental frequency (F_0) estimation algorithms to determine the most useful algorithm for this application, and propose a novel ensemble F_0 estimation algorithm which leads to a 10% improvement in accuracy over the best individual approach. Moreover, we propose novel feature selection schemes which are shown to be very competitive against widely used schemes which are more complex.
We demonstrate that we can successfully differentiate PD subjects from healthy controls with 98.5% overall accuracy, and also provide rapid, objective, and remote replication of UPDRS assessment with clinically useful accuracy (approximately 2 UPDRS points from the clinicians’ estimates), using only simple, self-administered, and non-invasive speech tests. The findings of this study strongly support the use of speech signal analysis as an objective basis for practical clinical decision support tools in the context of PD assessment.
EThOS - Electronic Theses Online Service, United Kingdom
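
The general idea of combining several F_0 estimators can be sketched with a per-frame median. This is a deliberately simple stand-in, not the ensemble algorithm proposed in the thesis, whose details are not given in the abstract. The function name and the example tracks are hypothetical.

```python
from statistics import median


def ensemble_f0(tracks):
    """Combine F_0 tracks from several estimators by taking the per-frame median.

    `tracks` is a list of equal-length lists, one per estimator, holding the
    F_0 estimate in Hz for each analysis frame.
    """
    if not tracks:
        raise ValueError("need at least one F_0 track")
    n_frames = len(tracks[0])
    if any(len(track) != n_frames for track in tracks):
        raise ValueError("all tracks must cover the same frames")
    return [median(frame) for frame in zip(*tracks)]


# Hypothetical tracks: the third estimator makes octave errors in both frames.
tracks = [
    [120.0, 119.0],
    [121.0, 122.0],
    [240.0, 60.0],
]
print(ensemble_f0(tracks))
```

The median discards isolated octave errors from a single estimator, which is one reason a combination can outperform each individual algorithm.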

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the particularly felt need of sharing know-how, objectives and results between areas that until then seemed quite distinct such as bioengineering, medicine and singing. MAVEBA deals with all aspects concerning the study of the human voice with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread also in other aspects of research such as occupational voice disorders, neurology, rehabilitation, image and video analysis. MAVEBA takes place every two years in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis

    Automatic separation of diseases based on the correlation structure of time-shifted acoustic features

    Certain types of disease can affect the complex mechanisms of speech production in different ways, resulting in pathological speech. Biomarkers extracted from speech can be reliable indicators of the different disease types. The aim of this paper is to separate the speech samples of healthy speakers from those of speakers suffering from different types of disease. The disease types examined are depression, Parkinson's disease, morphological alterations of the vocal organs, functional dysphonia, and recurrent paresis. The classifier input consisted of values derived from the correlation matrices of the time-shifted values of the formant frequencies (F1, F2, F3), the mel filter band energies, the mel-frequency cepstral coefficients (MFCCs), the fundamental frequency (F0), and the intensity. Support vector machine and k-nearest neighbour classification methods were used to compare the results. In the six-class classification case the best classification accuracy was 54.8%, and in the four-class case it was 77.6%. Based on these results, it can be stated that a speech-based system can be built that helps clinical staff establish an early diagnosis.
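
A correlation matrix of time-shifted acoustic features, the kind of classifier input this abstract describes, can be sketched as follows. This is a minimal illustration under assumptions: each feature is a per-frame track of equal length, the shift is a whole number of frames, and the Pearson coefficient is used. The paper's exact feature derivation is not given in the abstract, and the names below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation_matrix(tracks, lag):
    """Entry [i][j] correlates track i at frame t with track j at frame t + lag."""
    n = len(tracks[0])
    if lag < 0 or lag >= n - 1:
        raise ValueError("lag must leave at least two overlapping frames")
    return [[pearson(ti[:n - lag], tj[lag:]) for tj in tracks] for ti in tracks]


# Hypothetical feature tracks (for example F0 and intensity per frame).
tracks = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
]
matrix = lagged_correlation_matrix(tracks, lag=1)
print(matrix)
```

Matrices computed at several lags could then be summarised into a fixed-length vector for a classifier such as a support vector machine.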

    Optimization and automation of relative fundamental frequency for objective assessment of vocal hyperfunction

    The project objective is to improve clinical assessment and diagnosis of the voice disorder, vocal hyperfunction (VH). VH is a condition characterized by excessive laryngeal and paralaryngeal tension, and is assumed to be the underlying cause of the majority of voice disorders. Current clinical assessment of VH is subjective and demonstrates poor inter-rater reliability. Recent work indicates that a new acoustic measure, relative fundamental frequency (RFF) is sensitive to the maladaptive functional behaviors associated with VH and can potentially be used to objectively characterize VH. Here, we explored and enhanced the potential for RFF as a measure of VH in three ways. First, the current protocol for RFF estimation was optimized to simplify the recording procedure and reduce estimation time. Second, RFF was compared with the current state-of-the-art measures of VH – listener perception of vocal effort and the aerodynamic ratio of sound pressure level to subglottal pressure level. Third, an automated algorithm that utilized the optimized recording protocol was developed and validated against manual estimation methods and listener perception. This work enables large-scale studies on RFF to determine the specific physiological elements that contribute to the measure’s ability to capture VH and may potentially provide a non-invasive and readily implemented solution for this long-standing clinical issue
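
The abstract does not give the formula for relative fundamental frequency. A minimal sketch, assuming the commonly used normalization in which each voicing cycle's fundamental frequency is expressed in semitones relative to a steady-state reference cycle, might look like this. The function name and the example values are hypothetical.

```python
import math


def relative_f0_semitones(cycle_f0s, reference_f0):
    """Express per-cycle F0 values (Hz) in semitones relative to a reference F0."""
    if reference_f0 <= 0.0 or any(f <= 0.0 for f in cycle_f0s):
        raise ValueError("frequencies must be positive")
    return [12.0 * math.log2(f / reference_f0) for f in cycle_f0s]


# Hypothetical values chosen so the result is easy to check by hand:
# one octave above, equal to, and one octave below the reference.
print(relative_f0_semitones([220.0, 110.0, 55.0], 110.0))
```

Working in semitones makes the values comparable across speakers with different habitual pitch.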