130 research outputs found

    Acoustic analysis of chronic laryngitis - statistical analysis of sustained speech parameters

    This paper describes the statistical analysis of a set of features extracted from sustained vowels spoken by patients with chronic laryngitis and by control subjects. The aim is to identify which features can be useful in an intelligent classification system to discriminate between pathologic and healthy voices. The set of features analysed consists of Jitter, Shimmer, Harmonic to Noise Ratio (HNR), Noise to Harmonic Ratio (NHR) and Autocorrelation, extracted from the sustained vowels /a/, /i/ and /u/ at low, neutral and high tones. The results showed that, except for absolute Jitter, no statistically significant difference exists between male and female voices with respect to classification as pathologic or healthy. Any of the analysed parameters is likely to show a statistical difference between the control and Chronic Laryngitis groups. This is important information, indicating that these features can be used in an intelligent system to distinguish healthy from Chronic Laryngitis voices.
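As an illustration of the perturbation features named in this abstract, the sketch below computes jitter and shimmer from a list of glottal period lengths and peak amplitudes. It uses the commonly cited "local" definitions (mean absolute difference between consecutive cycles); the abstract does not say which exact formulas the authors used, so the function names and definitions here are assumptions.

```python
import math


def absolute_jitter(periods):
    """Mean absolute difference between consecutive glottal periods (seconds)."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return sum(diffs) / len(diffs)


def relative_jitter(periods):
    """Absolute jitter as a percentage of the mean period."""
    mean_period = sum(periods) / len(periods)
    return 100.0 * absolute_jitter(periods) / mean_period


def relative_shimmer(amplitudes):
    """Mean absolute difference between consecutive peak amplitudes,
    as a percentage of the mean amplitude."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    mean_amp = sum(amplitudes) / len(amplitudes)
    return 100.0 * (sum(diffs) / len(diffs)) / mean_amp


def absolute_shimmer_db(amplitudes):
    """Mean absolute amplitude ratio between consecutive cycles, in dB."""
    ratios = [abs(20.0 * math.log10(b / a))
              for a, b in zip(amplitudes, amplitudes[1:])]
    return sum(ratios) / len(ratios)


# Hypothetical values for illustration only (not taken from the database).
periods = [0.010, 0.011, 0.010, 0.011]
amplitudes = [1.0, 0.9, 1.0, 0.9]
print(absolute_jitter(periods), relative_jitter(periods))
print(relative_shimmer(amplitudes), absolute_shimmer_db(amplitudes))
```

Extracting the period and amplitude sequences from a recording is a separate step that is not shown here.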

    Cured database of sustained speech parameters for chronic laryngitis pathology

    This paper reports the construction and organization of a database of speech parameters extracted from a speech sound database. The database is freely available on the internet, and the paper is also intended to publicise it to the research community. The database includes the parameters extracted from sustained vowels produced by a group of Chronic Laryngitis patients and a group of control subjects with similar characteristics concerning gender and age. The set of parameters of this database consists of Jitter, Shimmer, Harmonic to Noise Ratio (HNR), Noise to Harmonic Ratio (NHR) and Autocorrelation, extracted from the sustained vowels /a/, /i/ and /u/ at low, neutral and high tones.

    Long short term memory on chronic laryngitis classification

    Classification with machine learning has been applied for years, and one area in which it can be used is the analysis of speech acoustics for the study of pathologies. One such pathology is chronic laryngitis. This article presents the results of a classification of chronic laryngitis using Long Short Term Memory (LSTM) as a classifier. The parameters relative jitter, relative shimmer and autocorrelation were used as input to the LSTM. A dataset of about 1500 instances was used to train, validate and test across 4 experiments with LSTM and one with a feedforward Artificial Neural Network (ANN). The results of the LSTM surpassed those of the feedforward ANN, reaching about 100% accuracy, sensitivity and specificity on the test set, which suggests a promising future for this classification tool in the diagnosis of voice pathologies.
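The three figures of merit quoted in this abstract (accuracy, sensitivity, specificity) follow directly from a binary confusion matrix. The sketch below shows that arithmetic, assuming label 1 means "pathological" and label 0 means "control"; the function name and the example labels are hypothetical and do not reproduce the paper's data.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for 0/1 labels.

    Assumes 1 = pathological (positive class) and 0 = control,
    and that both classes occur in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)   # share of pathological voices detected
    specificity = tn / (tn + fp)   # share of control voices recognised
    return accuracy, sensitivity, specificity


# Hypothetical predictions for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(y_true, y_pred))
```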

    Parameters for vocal acoustic analysis - cured database

    This paper describes the construction and organization of a database of speech parameters extracted from a speech database. This article intends to inform the community about the existence of this database for future research. The database includes parameters extracted from sounds produced by patients distributed among 19 diseases and by control subjects. The set of parameters of this database consists of jitter, shimmer, Harmonic to Noise Ratio (HNR), Noise to Harmonic Ratio (NHR), autocorrelation and Mel Frequency Cepstral Coefficients (MFCC), extracted from the sustained vowels /a/, /i/ and /u/ at high, low and normal tones, and from a short German sentence. The cured database has a total of 707 pathological subjects (distributed across the various diseases) and 194 control subjects, for a total of 901 subjects.

    Clustering of voice pathologies based on sustained voice parameters

    Signal processing techniques can be used to extract information that contributes to the detection of laryngeal disorders. The goal of this paper is to perform a statistical analysis, using boxplots, of 832 voice signals of individuals with different laryngeal pathologies from the Saarbrücken Voice Database in order to create relevant groups, making an automatic identification of these dysfunctions feasible. Jitter, Shimmer, HNR, NHR and Autocorrelation features were compared between several groups of voice pathologies/conditions, resulting in three identified clusters.

    Outliers treatment to improve the recognition of voice pathologies

    In some of the processes used in data analysis, such as the recognition of pathologies and pathological subjects, the presence of anomalous instances in the dataset is an unfavorable situation that can lead to misleading results. This article presents a function that identifies anomalies in a dataset using the boxplot and standard deviation methods. A filling technique was also used to treat these anomalies, in which the value of each anomalous point was replaced by a limit value determined by the boxplot or standard deviation method. To improve the outlier methods, normalization processes based on z-score, logarithmic and square root methodologies were tested. These outlier treatments were applied to the dataset used in the recognition of vocal pathologies (dysphonia, chronic laryngitis and vocal cord paralysis vs control), performed by MLP and LSTM neural networks. After the experiments, both the standard deviation and the boxplot methods with z-score normalization proved very useful for pre-processing the dataset for voice pathology recognition. Accuracy improved by between 3 and 13 percentage points.
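The filling technique described in this abstract can be sketched as clipping each value to a lower and upper limit. The code below is a minimal illustration, not the authors' function: the 1.5 x IQR whisker factor is the conventional boxplot rule, while the multiplier `k`, the quartile method, and all function names are assumptions.

```python
import statistics


def boxplot_limits(values, whisker=1.5):
    """Lower/upper fences from the quartiles (conventional 1.5 * IQR rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr


def std_limits(values, k=3.0):
    """Lower/upper limits at mean +/- k sample standard deviations.
    The value of k is an assumption; the abstract does not state it."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return mu - k * sd, mu + k * sd


def fill_outliers(values, limits):
    """Replace each value outside the limits by the nearest limit."""
    low, high = limits
    return [min(max(v, low), high) for v in values]


def z_score(values):
    """Z-score normalization, one of the pre-processing options mentioned."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical feature column with one anomalous point.
column = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(fill_outliers(column, boxplot_limits(column)))
print(fill_outliers(column, std_limits(column, k=2.0)))
```

Clipping keeps the number of instances unchanged, which matters when the dataset is small; dropping the anomalous rows would be the alternative.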

    Deep-learning in identification of vocal pathologies

    The work addresses a classification problem with four classes of vocal pathologies using one Deep Neural Network. Three groups of features extracted from the speech of subjects with Dysphonia, Vocal Fold Paralysis, Laryngitis Chronica and controls were tested. The best group of features is related to the source: relative jitter, relative shimmer, and HNR. A Deep Neural Network architecture with two levels was tested. The first level consists of 7 estimators and the second level of a decision maker. At the second level of the Deep Neural Network, an accuracy of 39.5% is reached for a diagnosis among the 4 classes under analysis.

    Cepstral and Perceptual Investigations in Female Teachers With Functionally Healthy Voice

    Purpose. The present study aimed at measuring the smoothed and non-smoothed cepstral peak prominence (CPPS and CPP) in teachers who considered themselves to have a normal voice, although some of them had laryngeal pathology. The changes of CPP, CPPS, sound pressure level (SPL) and perceptual ratings with different voice tasks were investigated, and the influence of vocal pathology on these measures was studied. Method. Eighty-four Finnish female primary school teachers volunteered as participants. Laryngoscopically, 52.4% of these had laryngeal changes (39.3% mild, 13.1% disordered). Sound recordings were made of a comfortable sustained vowel, comfortable speech, and speech produced at an increased loudness level as used during teaching. CPP, CPPS and SPL values were extracted using Praat software for all three voice samples. Sound samples were also perceptually evaluated by five voice experts for overall voice quality (10 point scale from poor to excellent) and vocal firmness (10 point scale from breathy to pressed, with normal in the middle). Results. The CPP, CPPS and SPL values were significantly higher for vowels than for comfortable speech and for loud speech compared to comfortable speech (P 0.05). Conclusion. Neither the acoustic measures (CPP, CPPS, and SPL) nor the perceptual evaluations could clearly distinguish teachers with laryngeal changes from laryngeally healthy teachers. Given that the subjects had no vocal complaints, the data could be considered representative of teachers with functionally healthy voice.

    Clustering pathologic voice with kohonen SOM and hierarchical clustering

    The main purpose of clustering voice pathologies is to form large groups of subjects with similar pathologies to be used with Deep-Learning. This paper focuses on applying Kohonen's Self-Organizing Maps and Hierarchical Clustering to investigate how these methods behave when clustering voice samples by means of the parameters absolute jitter, relative jitter, absolute shimmer, relative shimmer, HNR, NHR and Autocorrelation. For this, a comparison is made between the speech samples of the Control group and of the Hyper-functional Dysphonia and Vocal Folds Paralysis pathology groups. As a result, the dataset was divided into two clusters, with no distinction between the pre-defined groups of pathologies. The result is aligned with previous results using statistical analysis. This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UIDB/05757/2020.