158 research outputs found

    Low band spectral tilt analysis for pathological voice discrimination

    This paper presents a new method for discriminating between subjects with healthy voices and subjects with diseases of the vocal folds. The method applies spectral analysis to speech signals of the sustained vowel /a/. The slope between a first band, defined over the first two harmonics of the signal, and a second band, defined in the region of the first formant of /a/, contains information that makes it possible to correctly classify the pathological voice database of the University of Sao Paulo. The method can be applied to the direct analysis of spectra or implemented in high-level classifiers as a complement to other parameters.

    Automatic detection of hypernasal speech in children with cleft lip and palate from Spanish vowels and words using classical measures and nonlinear analysis

    This paper presents a system for the automatic detection of hypernasal speech signals based on the combination of two different characterization approaches applied to the five Spanish vowels and two selected words. The first approach is based on classical features such as pitch period perturbations, noise measures, and Mel-Frequency Cepstral Coefficients (MFCC). The second approach is based on Non-Linear Dynamics (NLD) analysis. The most relevant features are selected and sorted using two techniques: Principal Components Analysis (PCA) and Sequential Forward Floating Selection (SFFS). The decision about whether a voice record is hypernasal or healthy is made using a Soft Margin Support Vector Machine (SM-SVM). Experiments on recordings of the five Spanish vowels and the selected words are performed considering three different sets of features: (1) the classical approach, (2) the NLD analysis, and (3) the combination of the classical and NLD measures. In general, the accuracies are higher and more stable when the classical and NLD features are combined, indicating that the NLD analysis is complementary to the classical approach.

    Intelligibility Evaluation of Pathological Speech through Multigranularity Feature Extraction and Optimization

    Pathological speech usually refers to speech distortion resulting from illness or other biological insults. The assessment of pathological speech plays an important role in assisting experts, but automatic evaluation of speech intelligibility is difficult because such speech is usually nonstationary and subject to abrupt changes. In this paper, we propose a new approach to feature extraction and reduction, and we describe a multigranularity combined feature scheme that is optimized by a hierarchical visual method. A novel method of generating the feature set based on the S-transform and chaotic analysis is proposed. The set comprises basic acoustic features (BAFS, 430 dimensions), local spectral characteristics in the form of Mel S-transform cepstrum coefficients (MSCC, 84 dimensions), and chaotic features (12 dimensions). Finally, a radar chart and the F-score are used to optimize the features through hierarchical visual fusion. The feature set could be reduced from 526 to 96 dimensions on the NKI-CCRT corpus and to 104 dimensions on the SVD corpus. The experimental results indicate that the new features, classified with a support vector machine (SVM), give the best performance, with a recognition rate of 84.4% on the NKI-CCRT corpus and 78.7% on the SVD corpus. The proposed method is thus shown to be effective and reliable for pathological speech intelligibility evaluation.

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The proceedings of the MAVEBA Workshop, held on a biannual basis, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies. The Workshop is sponsored by Ente Cassa Risparmio di Firenze, COST Action 2103, the Biomedical Signal Processing and Control journal (Elsevier), and the IEEE Biomedical Engineering Soc. Special issues of international journals collecting selected papers from the conference have been, and will be, published.

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, held on a biannual basis, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies.

    Cured database of sustained speech parameters for chronic laryngitis pathology

    This paper reports the construction and organization of a database of speech parameters extracted from a speech sound database. The database is freely available on the internet, and the paper also aims to make it known to the research community. It includes the parameters extracted from sustained vowels produced by a group of Chronic Laryngitis patients and by a group of control subjects with similar characteristics in terms of gender and age. The set of parameters consists of Jitter, Shimmer, Harmonic to Noise Ratio (HNR), Noise to Harmonic Ratio (NHR) and Autocorrelation, extracted from the sustained vowels /a/, /i/ and /u/ at low, neutral and high tones.
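As a rough illustration of two of the parameters named above, the following sketch computes jitter and shimmer under the commonly used "local" definition: the mean absolute difference between consecutive cycles, divided by the mean value, expressed as a percentage. This is an assumption for illustration only; the abstract does not state which jitter and shimmer variants the database uses. The function name `local_perturbation` and the sample cycle values are hypothetical.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive cycles,
    relative to the mean value, in percent.

    Applied to glottal cycle periods this gives local jitter;
    applied to cycle peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical measurements for four consecutive glottal cycles.
periods_ms = [5.00, 5.10, 4.90, 5.00]   # cycle durations in milliseconds
amplitudes = [1.00, 0.90, 1.10, 1.00]   # cycle peak amplitudes (arbitrary units)

jitter_percent = local_perturbation(periods_ms)
shimmer_percent = local_perturbation(amplitudes)
print(f"jitter:  {jitter_percent:.2f} %")
print(f"shimmer: {shimmer_percent:.2f} %")
```

In practice the cycle periods and amplitudes would come from a pitch-marking step on the recorded vowel, which is outside the scope of this sketch.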

    Acoustic analysis of chronic laryngitis - statistical analysis of sustained speech parameters

    This paper describes the statistical analysis of a set of features extracted from sustained vowels spoken by patients with chronic laryngitis and by control subjects. The aim is to identify which features can be useful in an intelligent classification system that discriminates between pathological and healthy voices. The features analysed are Jitter, Shimmer, Harmonic to Noise Ratio (HNR), Noise to Harmonic Ratio (NHR) and Autocorrelation, extracted from the sustained vowels /a/, /i/ and /u/ at low, neutral and high tones. The results showed that, apart from absolute Jitter, there is no statistically significant difference between male and female voices with respect to classification as pathological or healthy. Any of the analysed parameters is likely to show a statistical difference between the control and Chronic Laryngitis groups. This is important because it indicates that these features can be used in an intelligent system to distinguish healthy voices from Chronic Laryngitis voices.

    Automatic acoustic analysis of waveform perturbations

    Clustering of voice pathologies based on sustained voice parameters

    Signal processing techniques can be used to extract information that contributes to the detection of laryngeal disorders. The goal of this paper is to perform a statistical analysis, using boxplots, of 832 voice signals from individuals with different laryngeal pathologies in the Saarbrücken Voice Database, in order to create relevant groups and make automatic identification of these dysfunctions feasible. Jitter, Shimmer, HNR, NHR and Autocorrelation features were compared between several groups of voice pathologies/conditions, resulting in three identified clusters.

    Acoustic measurement of overall voice quality in sustained vowels and continuous speech

    Measurement of dysphonia severity involves auditory-perceptual evaluations and acoustic analyses of sound waves. A meta-analysis of proportional associations between these two methods showed that many popular perturbation metrics, noise-to-harmonics ratios and other ratios do not yield reasonable results. However, this meta-analysis demonstrated that the validity of specific autocorrelation- and cepstrum-based measures was much more convincing, and identified ‘smoothed cepstral peak prominence’ as the most promising metric of dysphonia severity. Original research confirmed this inferiority of perturbation measures and superiority of cepstral indices in dysphonia measurement of laryngeal-vocal and tracheoesophageal voice samples. However, to be truly representative of daily voice use patterns, measurement of overall voice quality is ideally founded on the analysis of both sustained vowels and continuous speech. A customized method for including both sample types and calculating the multivariate Acoustic Voice Quality Index (AVQI) was constructed for this purpose. An original study of the AVQI revealed acceptable results in terms of initial concurrent validity, diagnostic precision, internal and external cross-validity and responsiveness to change. It was thus concluded that the AVQI can track changes in dysphonia severity across the voice therapy process. There are many freely and commercially available computer programs and systems for acoustic metrics of dysphonia severity. We investigated agreements and differences between two commonly available programs (Praat and Multi-Dimensional Voice Program) and systems. The results indicated that clinicians should not compare frequency perturbation data across systems and programs, or amplitude perturbation data across systems. Finally, acoustic information can also be utilized as a biofeedback modality during voice exercises. Based on a systematic literature review, it was cautiously concluded that acoustic biofeedback can be a valuable tool in the treatment of phonatory disorders. When applied with caution, acoustic algorithms (particularly cepstrum-based measures and the AVQI) merit a special role in the assessment and/or treatment of dysphonia severity.