886 research outputs found

    Analysis of Vocal Disorders in a Feature Space

    This paper provides a way to classify vocal disorders for clinical applications by means of geometric signal separation in a feature space. Typical quantities from chaos theory (such as entropy, correlation dimension and first Lyapunov exponent) and some conventional ones (such as autocorrelation and spectral factor) are analysed and evaluated in order to provide entries for the feature vectors. A way of quantifying the amount of disorder is proposed by means of a healthy index that measures the distance of a voice sample from the centres of mass of both the healthy and the sick clusters in the feature space. A successful application of geometric signal separation is reported, concerning the distinction between normal and disordered phonation. Comment: 12 pages, 3 figures, accepted for publication in Medical Engineering & Physics
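    The abstract does not give the formula for the healthy index, so the following is only a minimal sketch of one way such an index could be built from the two centroid distances. The normalisation `d_sick / (d_healthy + d_sick)` and the function names are assumptions made for illustration, not the authors' definition.

```python
import math


def centroid(points):
    """Centre of mass of a list of equal-length feature vectors."""
    n = len(points)
    dims = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dims)]


def healthy_index(sample, healthy, sick):
    """Hypothetical index in [0, 1] (an assumption, not the paper's formula).

    Returns 1.0 when the sample sits on the healthy centroid and 0.0 when
    it sits on the sick centroid. Undefined if both distances are zero.
    """
    d_healthy = math.dist(sample, centroid(healthy))
    d_sick = math.dist(sample, centroid(sick))
    return d_sick / (d_healthy + d_sick)


# Illustrative two-dimensional feature vectors (made-up values).
healthy_cluster = [[0.0, 0.0], [2.0, 0.0]]
sick_cluster = [[5.0, 0.0], [7.0, 0.0]]
print(healthy_index([3.5, 0.0], healthy_cluster, sick_cluster))
```

    In practice the features would likely need to be scaled before Euclidean distances are compared, since quantities such as entropy and correlation dimension live on different ranges.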

    Automatic detection of hypernasal speech in children with cleft lip and palate from Spanish vowels and words using classical measures and nonlinear analysis

    This paper presents a system for the automatic detection of hypernasal speech signals based on the combination of two characterization approaches applied to the five Spanish vowels and two selected words. The first approach is based on classical features such as pitch period perturbations, noise measures, and Mel-Frequency Cepstral Coefficients (MFCC). The second approach is based on Non-Linear Dynamics (NLD) analysis. The most relevant features are selected and sorted using two techniques: Principal Component Analysis (PCA) and Sequential Forward Floating Selection (SFFS). The decision about whether a voice recording is hypernasal or healthy is made using a Soft-Margin Support Vector Machine (SM-SVM). Experiments on recordings of the five Spanish vowels and the words are performed considering three different sets of features: (1) the classical approach, (2) the NLD analysis, and (3) the combination of the classical and NLD measures. In general, the accuracies are higher and more stable when the classical and NLD features are combined, indicating that the NLD analysis is complementary to the classical approach

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies

    Intelligibility Evaluation of Pathological Speech through Multigranularity Feature Extraction and Optimization

    Pathological speech usually refers to speech distortion resulting from illness or other biological insults. The assessment of pathological speech plays an important role in assisting experts, but automatic evaluation of speech intelligibility is difficult because the signal is usually nonstationary and subject to abrupt changes. In this paper, we propose a new approach to feature extraction and reduction, and we describe a multigranularity combined feature scheme which is optimized by a hierarchical visual method. A novel method of generating the feature set based on the S-transform and chaotic analysis is proposed. The feature set comprises BAFS (basic acoustic features, 430 dimensions), the local spectral characteristics MSCC (Mel S-transform cepstrum coefficients, 84 dimensions), and chaotic features (12 dimensions). Finally, radar charts and the F-score are used to optimize the features through hierarchical visual fusion. The feature set is reduced from 526 to 96 dimensions on the NKI-CCRT corpus and to 104 dimensions on the SVD corpus. The experimental results show that the new features, classified with a support vector machine (SVM), perform best, with a recognition rate of 84.4% on the NKI-CCRT corpus and 78.7% on the SVD corpus. The proposed method is thus shown to be effective and reliable for pathological speech intelligibility evaluation
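    The abstract does not say which F-score it uses. The sketch below assumes the Fisher-type discriminative F-score often used for feature ranking before SVM classification: the squared distances of the two class means from the overall mean, divided by the sum of the within-class sample variances. The function names and the toy data are hypothetical.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """Discriminative F-score of one feature (assumed definition).

    pos, neg: values of the feature in the two classes, at least two
    values each, with nonzero total within-class variance.
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1)
    return between / within


def rank_features(x_pos, x_neg):
    """Feature indices sorted from most to least discriminative."""
    n_features = len(x_pos[0])
    scores = [
        f_score([row[j] for row in x_pos], [row[j] for row in x_neg])
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Toy data: feature 0 separates the classes, feature 1 does not.
x_pos = [[1.0, 5.0], [2.0, 6.0], [3.0, 4.0]]
x_neg = [[11.0, 5.0], [12.0, 4.0], [13.0, 6.0]]
print(rank_features(x_pos, x_neg))
```

    Keeping only the top-ranked indices is one simple way to shrink a feature set; the radar-chart step described in the abstract is not covered by this sketch.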

    Analysis and Detection of Pathological Voice using Glottal Source Features

    Automatic detection of voice pathology enables objective assessment and earlier intervention for the diagnosis. This study provides a systematic analysis of glottal source features and investigates their effectiveness in voice pathology detection. Glottal source features are extracted using glottal flows estimated with the quasi-closed phase (QCP) glottal inverse filtering method, using approximate glottal source signals computed with the zero frequency filtering (ZFF) method, and using acoustic voice signals directly. In addition, we propose to derive mel-frequency cepstral coefficients (MFCCs) from the glottal source waveforms computed by QCP and ZFF to effectively capture the variations in glottal source spectra of pathological voice. Experiments were carried out using two databases, the Hospital Universitario Principe de Asturias (HUPA) database and the Saarbrucken Voice Disorders (SVD) database. Analysis of features revealed that the glottal source contains information that discriminates normal and pathological voice. Pathology detection experiments were carried out using a support vector machine (SVM). The detection experiments showed that the performance achieved with the studied glottal source features is comparable to or better than that of conventional MFCCs and perceptual linear prediction (PLP) features. The best detection performance was achieved when the glottal source features were combined with the conventional MFCCs and PLP features, which indicates the complementary nature of the features

    Analysis of complexity and modulation spectra parameterizations to characterize voice roughness

    Disordered voices are frequently assessed by speech pathologists using acoustic perceptual evaluations. This can lead to problems because the process is subjective and because external factors can compromise the quality of the assessment. To increase the reliability of the evaluations, the design of new indicator parameters obtained from voice signal processing is desirable. With that in mind, this paper presents an automatic evaluation system which emulates perceptual assessments of the roughness level in the human voice. Two parameterization methods are used: complexity, which has already been used successfully in previous works, and modulation spectra. For the latter, a new group of parameters is proposed: Low Modulation Ratio (LMR), Contrast (MSW) and Homogeneity (MSH). The tested methodology also employs PCA and LDA to reduce the dimensionality of the feature space, and GMM classifiers to evaluate the ability of the proposed features to distinguish the different roughness levels. An efficiency of 82% and a Cohen's Kappa Index of 0.73 are obtained using the modulation spectra parameters, while the complexity parameters achieve 73% and 0.58 respectively. The results indicate the usefulness of the proposed modulation spectra features for the automatic evaluation of voice roughness, which can lead to new parameters useful for clinicians
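    The Cohen's Kappa Index reported above corrects raw agreement for the agreement expected by chance. As a minimal sketch, it can be computed from two label sequences as follows. The function name and the example labels are illustrative and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement and p_e the agreement expected by
    chance from the marginal label frequencies. Undefined when p_e == 1.
    """
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Perceptual roughness levels versus system output (made-up labels).
perceptual = [0, 0, 1, 1]
automatic = [0, 0, 1, 0]
print(cohen_kappa(perceptual, automatic))  # 0.5
```

    Here three of four labels agree (p_o = 0.75) and chance agreement is 0.5, which gives a kappa of 0.5. This is why kappa values read lower than the corresponding accuracy figures.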

    Amplitude modulation of vowel glottal pulses: application to sleep inertia

    The human voice carries non-linguistic information about emotion, fatigue, stress, truth, psychological illnesses, etc. The evidence for this is now well established. In real-life situations, in laboratory conditions and from a cross-cultural point of view, the speaker's psycho-physiological disorders induce vocal modifications. Many acoustic parameters are measured. They belong to the dynamic and spectral planes. Phase space is also involved. Amplitude modulation is one of them. Unlike prosody and vocal quality features, it has not been widely studied. In this paper, a method for estimating the amplitude modulation of vowel glottal pulses is proposed. After pulse detection, a sinusoidal fit is applied, leading to an estimate of the amplitude modulation frequency. This method has already been used in experiments on the effects of sleep inertia on the voice. A pilot is suddenly awakened to undertake aeronautical psychomotor tasks. Results show the existence of an amplitude modulation. Their validity is based on determination coefficient measurements taking into account the number of pitch periods. Additionally, shimmer measurements show an increase after awakening. It can thus be concluded that sleep inertia has an effect on vowels uttered by the pilot
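    The abstract does not state which shimmer variant was measured. The sketch below assumes the common "local shimmer" definition: the mean absolute difference between the peak amplitudes of consecutive glottal pulses, divided by the mean amplitude. The function name and the amplitude values are hypothetical.

```python
def local_shimmer(amplitudes):
    """Local shimmer as a ratio (multiply by 100 for percent).

    amplitudes: peak amplitudes of consecutive glottal pulses,
    at least two values with a nonzero mean.
    """
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_amp = sum(amplitudes) / len(amplitudes)
    return mean_diff / mean_amp


# Made-up pulse amplitudes before and after awakening.
before = [1.00, 1.01, 0.99, 1.00]
after = [1.00, 1.10, 0.92, 1.05]
print(local_shimmer(before), local_shimmer(after))
```

    Pulse detection and the sinusoidal fit described in the abstract would come before this step and are not shown.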

    New device for real-time voice analysis of patients with Parkinson's disease

    Parkinson's disease (PD) is a neurodegenerative disorder that affects the coordination of muscles and limbs, including those responsible for speech production. The lack of control of the limbs and muscles involved in the speech production process can cause intelligibility problems, which has a negative impact on the social interaction of the patients. It has been demonstrated that constant speech therapy can improve the communication abilities of the patients; however, the recovery progress is measured subjectively by speech therapists and neurologists. Flexible tools able to assess and guide the speech therapy of the patients are therefore required. This paper presents the design and deployment of a new device for the real-time assessment of speech signals of people with PD. The design and deployment comprise development on three platforms: first, a graphical user interface is developed in Matlab; second, a first prototype is implemented on a Texas Instruments TMS320C6713 digital signal processor (DSP); and third, the final device is developed on a mini-computer. The device is equipped with an audio codec, storage capacity and a processing unit. The system is complemented with an LCD monitor to display the processed information in real time and with a keyboard enabling the end user to interact with the device. Different acoustic and nonlinear dynamics measures that have been used in the state of the art for the assessment of the speech of people with PD are implemented on the three platforms. In accordance with the state of the art, the designed platforms show an increase in the variation of the fundamental period of speech (commonly called pitch) of people with PD. Additionally, the decrease of the vowel space area is validated for patients with PD. These results indicate that the designed device is useful for the assessment and monitoring of the speech therapy of people with PD
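    The abstract does not describe how the vowel space area is computed. A common convention, assumed here, is the area of the polygon spanned by the corner vowels in the (F1, F2) formant plane, which the shoelace formula gives directly. The formant values below are illustrative placeholders, not measurements from the paper.

```python
def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula.

    vertices: list of (x, y) pairs given in order around the polygon.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical (F1, F2) formants in Hz for the corner vowels /a/, /i/, /u/.
corner_vowels = [(800.0, 1300.0), (300.0, 2300.0), (350.0, 800.0)]
print(polygon_area(corner_vowels))  # triangular vowel space area in Hz^2
```

    A smaller area for the same speaker over time, or relative to controls, would correspond to the reduction reported for patients with PD.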

    Detection of emotions in Parkinson's disease using higher order spectral features from brain's electrical activity

    Non-motor symptoms in Parkinson's disease (PD) involving cognition and emotion have been receiving progressively more attention in recent times. Electroencephalogram (EEG) signals, being an activity of the central nervous system, can reflect the true underlying emotional state of a person. This paper presents a computational framework for classifying PD patients compared to healthy controls (HC) using emotional information from the brain's electrical activity