2,718 research outputs found

    Characterization of Healthy and Pathological Voice Through Measures Based on Nonlinear Dynamics

    In this paper, we propose to quantify the quality of the recorded voice through objective nonlinear measures. Quantification of speech signal quality has been traditionally carried out with linear techniques since the classical model of voice production is a linear approximation. Nevertheless, nonlinear behaviors in the voice production process have been shown. This paper studies the usefulness of six nonlinear chaotic measures based on nonlinear dynamics theory in the discrimination between two levels of voice quality: healthy and pathological. The studied measures are first- and second-order Renyi entropies, the correlation entropy and the correlation dimension. These measures were obtained from the speech signal in the phase-space domain. The values of the first minimum of mutual information function and Shannon entropy were also studied. Two databases were used to assess the usefulness of the measures: a multiquality database composed of four levels of voice quality (healthy voice and three levels of pathological voice); and a commercial database (MEEI Voice Disorders) composed of two levels of voice quality (healthy and pathological voices). A classifier based on standard neural networks was implemented in order to evaluate the measures proposed. Global success rates of 82.47% (multiquality database) and 99.69% (commercial database) were obtained.

    Objective dysphonia quantification in vocal fold paralysis: comparing nonlinear with classical measures

    Clinical acoustic voice recording analysis is usually performed using classical perturbation measures including jitter, shimmer and noise-to-harmonic ratios. However, restrictive mathematical limitations of these measures prevent the analysis of severely dysphonic voices. Previous studies of alternative nonlinear random measures addressed wide varieties of vocal pathologies. Here, we analyze a single vocal pathology cohort, testing the performance of these alternative measures alongside classical measures.

We present voice analysis of healthy controls and of unilateral vocal fold paralysis (UVFP) patients, the latter recorded pre- and post-operatively while undergoing standard medialisation thyroplasty surgery. The analysis uses jitter, shimmer and noise-to-harmonic ratio (NHR), together with the nonlinear measures recurrence period density entropy (RPDE), detrended fluctuation analysis (DFA) and correlation dimension. Systematizing the preparative editing of the recordings, we found that the novel measures were more stable, and hence more reliable, than the classical measures on healthy controls.

RPDE and jitter are sensitive to improvements pre- to post-operation. Shimmer, NHR and DFA showed no significant change (p > 0.05). All measures detect statistically significant and clinically important differences between controls and patients, both treated and untreated (p < 0.001, AUC > 0.7). Pre- to post-operation, GRBAS ratings show statistically significant and clinically important improvement in overall dysphonia grade (G) (AUC = 0.946, p < 0.001).

Re-calculating AUCs from other study data, we compare these results in terms of clinical importance. We conclude that, when preparative editing is systematized, nonlinear random measures may be useful UVFP treatment effectiveness monitoring tools, and there may be applications for other forms of dysphonia.
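
As a rough illustration of the classical perturbation measures named in this abstract, the sketch below computes local jitter and local shimmer with the common "relative mean absolute difference" definition. That definition, the function names, and the example values are assumptions made for illustration; the abstract does not state which variants the study used.

```python
def relative_mean_abs_diff(values):
    """Mean absolute difference of consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two glottal cycles")
    mean_value = sum(values) / len(values)
    if mean_value <= 0:
        raise ValueError("values must have a positive mean")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / mean_value


def local_jitter(periods_s):
    """Cycle-to-cycle variation of the glottal period (dimensionless ratio)."""
    return relative_mean_abs_diff(periods_s)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (dimensionless ratio)."""
    return relative_mean_abs_diff(peak_amplitudes)


# Hypothetical per-cycle measurements from a sustained vowel.
periods = [0.0100, 0.0102, 0.0099, 0.0101]   # seconds
amplitudes = [0.80, 0.78, 0.82, 0.79]        # arbitrary linear units

print(f"jitter  = {100 * local_jitter(periods):.2f} %")
print(f"shimmer = {100 * local_shimmer(amplitudes):.2f} %")
```

Both measures depend on reliable detection of individual cycles, which is one reason they can fail on severely dysphonic voices.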

    Analysis of complexity and modulation spectra parameterizations to characterize voice roughness

    Disordered voices are frequently assessed by speech pathologists using acoustic perceptual evaluations. This might lead to problems due to the subjective nature of the process and due to the influence of external factors which compromise the quality of the assessment. In order to increase the reliability of the evaluations, the design of new indicator parameters obtained from voice signal processing is desirable. With that in mind, this paper presents an automatic evaluation system which emulates perceptual assessments of the roughness level in human voice. Two parameterization methods are used: complexity, which has already been used successfully in previous works, and modulation spectra. For the latter, a new group of parameters is proposed: Low Modulation Ratio (LMR), Contrast (MSW) and Homogeneity (MSH). The tested methodology also employs PCA and LDA to reduce the dimensionality of the feature space, and GMM classifiers to evaluate the ability of the proposed features to distinguish the different roughness levels. An efficiency of 82% and a Cohen's Kappa Index of 0.73 are obtained using the modulation spectra parameters, while the complexity parameters achieved 73% and 0.58, respectively. The obtained results indicate the usefulness of the proposed modulation spectra features for the automatic evaluation of voice roughness, which could yield new parameters useful for clinicians.
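
The agreement statistic reported in this abstract can be sketched as follows. This is a minimal, unweighted Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the example labels are hypothetical, and the abstract does not say whether the authors used a weighted variant.

```python
from collections import Counter


def cohens_kappa(reference, predicted):
    """Unweighted Cohen's kappa between two equally long label sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(reference)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters always use the same single label
    return (observed - expected) / (1.0 - expected)


# Hypothetical roughness levels: perceptual ratings vs. automatic classification.
perceptual = [0, 0, 0, 1]
automatic = [0, 0, 1, 1]
print(cohens_kappa(perceptual, automatic))
```

Unlike the raw success rate, kappa discounts agreement that would occur by chance, which is why the two figures in the abstract differ.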

    Analysis and Detection of Pathological Voice using Glottal Source Features

    Automatic detection of voice pathology enables objective assessment and earlier intervention for the diagnosis. This study provides a systematic analysis of glottal source features and investigates their effectiveness in voice pathology detection. Glottal source features are extracted using glottal flows estimated with the quasi-closed phase (QCP) glottal inverse filtering method, using approximate glottal source signals computed with the zero frequency filtering (ZFF) method, and using acoustic voice signals directly. In addition, we propose to derive mel-frequency cepstral coefficients (MFCCs) from the glottal source waveforms computed by QCP and ZFF to effectively capture the variations in glottal source spectra of pathological voice. Experiments were carried out using two databases, the Hospital Universitario Principe de Asturias (HUPA) database and the Saarbrucken Voice Disorders (SVD) database. Analysis of features revealed that the glottal source contains information that discriminates normal and pathological voice. Pathology detection experiments were carried out using a support vector machine (SVM). From the detection experiments it was observed that the performance achieved with the studied glottal source features is comparable to or better than that of conventional MFCCs and perceptual linear prediction (PLP) features. The best detection performance was achieved when the glottal source features were combined with the conventional MFCCs and PLP features, which indicates the complementary nature of the features.

    Novel speech signal processing algorithms for high-accuracy classification of Parkinson's disease

    There has been considerable recent research into the connection between Parkinson's disease (PD) and speech impairment. Recently, a wide range of speech signal processing algorithms (dysphonia measures) aiming to predict PD symptom severity using speech signals have been introduced. In this paper, we test how accurately these novel algorithms can be used to discriminate PD subjects from healthy controls. In total, we compute 132 dysphonia measures from sustained vowels. Then, we select four parsimonious subsets of these dysphonia measures using four feature selection algorithms, and map these feature subsets to a binary classification response using two statistical classifiers: random forests and support vector machines. We use an existing database consisting of 263 samples from 43 subjects, and demonstrate that these new dysphonia measures can outperform state-of-the-art results, reaching almost 99% overall classification accuracy using only ten dysphonia features. We find that some of the recently proposed dysphonia measures complement existing algorithms in maximizing the ability of the classifiers to discriminate healthy controls from PD subjects. We see these results as an important step toward noninvasive diagnostic decision support in PD

    Towards vocal-behaviour and vocal-health assessment using distributions of acoustic parameters

    Voice disorders of varying severity affect those professional categories that use the voice in a sustained way and for prolonged periods of time, the so-called occupational voice users. In-field voice monitoring is needed to investigate voice behaviour and vocal health status during everyday activities and to highlight work-related risk factors. The overall aim of this thesis is to contribute to the identification of tools, procedures and requirements related to acoustic voice analysis as an objective measure to prevent voice disorders, but also to assess them and furnish proof of outcomes during voice therapy. The first part of this thesis includes studies on vocal-load related parameters. Experiments were performed both in-field and in the laboratory. A longitudinal study of teachers’ voice use during working hours over one school year was performed in high school classrooms using a voice analyzer equipped with a contact sensor; further measurements took place in the semi-anechoic and reverberant rooms of the National Institute of Metrological Research (I.N.Ri.M.) in Torino (Italy) to investigate the effects of very low and excessive reverberation on speech intensity, using both microphones in air and contact sensors. Within this framework, the contributions of the sound pressure level (SPL) uncertainty estimation using different devices were also assessed with dedicated experiments. Teachers adjusted their voice significantly with noise and reverberation, both at the beginning and at the end of the school year. Moreover, teachers who worked in the worst acoustic conditions showed higher SPLs and a worse vocal health status at the end of the school year. The minimum value of speech SPL was found for teachers in classrooms with a reverberation time of about 0.8 s.
Participants involved in the in-laboratory experiments significantly increased their speech intensity by about 2.0 dB in the semi-anechoic room compared with the reverberant room, when describing a map. These results relate to the speech monitoring performed with the vocal analyzer, whose estimated uncertainty for SPL differences was about 1 dB. The second part of this thesis addresses vocal health and voice quality assessment using different speech materials and devices. Experiments were performed in clinics, in collaboration with the Department of Surgical Sciences of Università di Torino (Italy) and the Department of Clinical Science, Intervention and Technology of Karolinska Institutet in Stockholm (Sweden). Individual distributions of Cepstral Peak Prominence Smoothed (CPPS) from volunteer patients and control subjects were investigated in sustained vowels, reading, free speech and excerpted vowels from continuous speech, which were acquired with microphones in air and contact sensors. The main influence quantities of the estimated cepstral parameters were also identified, which are the fundamental frequency of the vocalization and the broadband noise superimposed on the signal. In addition, the reliability of CPPS estimation with respect to the frequency content of the vocal spectrum was evaluated, which is mainly dependent on the bandwidth of the measuring chain used to acquire the vocal signal. Regarding the speech materials acquired with the microphone in air, the 5th percentile was the best statistic of the CPPS distributions for discriminating healthy and unhealthy voices in sustained vowels, while the 95th percentile was the best in both reading and free speech tasks. The discrimination thresholds were 15 dB (95% Confidence Interval, CI, of 0.7 dB) and 18 dB (95% CI of 0.6 dB), respectively, where lower values indicate a high probability of an unhealthy voice.
Preliminary outcomes on excerpted vowels from continuous speech indicated that a CPPS mean value lower than 14 dB designates pathological voices. CPPS distributions were also effective as proof of outcomes after interventions, e.g. voice therapy and phonosurgery. Concerning the speech materials acquired with the electret contact sensor, a reasonable discrimination power was only obtained in the case of sustained vowels, where a standard deviation of the CPPS distribution higher than 1.1 dB (95% CI of 0.2 dB) indicates a high probability of an unhealthy voice. Further results indicated that a reliable estimation of CPPS parameters is obtained provided that the frequency content of the spectrum is not lower than 5 kHz: this outcome provides a guideline on the bandwidth of the measuring chain used to acquire the vocal signal.
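
The percentile-based decision rule described in this abstract for microphone-in-air recordings can be sketched as below. The thresholds (15 dB on the 5th percentile for sustained vowels, 18 dB on the 95th percentile for reading and free speech) come from the abstract; the function names, the task labels, the linear-interpolation percentile, and the example values are assumptions for illustration, and the thesis may compute percentiles differently.

```python
def percentile(values, q):
    """Percentile with linear interpolation between closest ranks, q in [0, 100]."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= q <= 100:
        raise ValueError("q must lie in [0, 100]")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


# Thresholds quoted in the abstract (microphone in air), in dB.
SUSTAINED_VOWEL_P5_THRESHOLD_DB = 15.0
CONTINUOUS_SPEECH_P95_THRESHOLD_DB = 18.0


def below_cpps_threshold(cpps_db, task):
    """True when the task-specific CPPS statistic falls below its threshold."""
    if task == "sustained_vowel":
        return percentile(cpps_db, 5) < SUSTAINED_VOWEL_P5_THRESHOLD_DB
    if task in ("reading", "free_speech"):
        return percentile(cpps_db, 95) < CONTINUOUS_SPEECH_P95_THRESHOLD_DB
    raise ValueError(f"unknown task: {task!r}")


# Hypothetical frame-wise CPPS values (dB) from one sustained vowel.
frames = [16.2, 17.0, 15.8, 16.5, 14.1, 16.9, 17.3, 16.0]
print(below_cpps_threshold(frames, "sustained_vowel"))
```

A value below the threshold corresponds to what the abstract calls a high probability of an unhealthy voice; the quoted confidence intervals are not modelled here.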

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The MAVEBA Workshop proceedings, held on a biannual basis, collect the scientific papers presented both as oral and poster contributions, during the conference. The main subjects are: development of theoretical and mechanical models as an aid to the study of main phonatory dysfunctions, as well as the biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and classification of vocal pathologies. The Workshop has the sponsorship of: Ente Cassa Risparmio di Firenze, COST Action 2103, Biomedical Signal Processing and Control Journal (Elsevier Eds.), IEEE Biomedical Engineering Soc. Special Issues of International Journals have been, and will be, published, collecting selected papers from the conference

    Is the timed-up and go test feasible in mobile devices? A systematic review

    The number of older adults is increasing worldwide, and it is expected that by 2050 over 2 billion individuals will be more than 60 years old. Older adults are exposed to numerous pathological problems such as Parkinson’s disease, amyotrophic lateral sclerosis, post-stroke, and orthopedic disturbances. Several physiotherapy methods that involve measurement of movements, such as the Timed-Up and Go test, can be done to support efficient and effective evaluation of pathological symptoms and promotion of health and well-being. In this systematic review, the authors aim to determine how the inertial sensors embedded in mobile devices are employed for the measurement of the different parameters involved in the Timed-Up and Go test. The main contribution of this paper consists of the identification of the different studies that utilize the sensors available in mobile devices for the measurement of the results of the Timed-Up and Go test. The results show that motion sensors embedded in mobile devices can be used for these types of studies and that the most commonly used sensors are the magnetometer, accelerometer, and gyroscope available in off-the-shelf smartphones. The features analyzed in this paper are categorized as quantitative, quantitative + statistic, dynamic balance, gait properties, state transitions, and raw statistics. These features utilize the accelerometer and gyroscope sensors and facilitate recognition of daily activities, accidents such as falling, some diseases, as well as the measurement of the subject's performance during the test execution.