Search CORE

1,689 research outputs found

Screening of Obstructive Sleep Apnea with Empirical Mode Decomposition of Pulse Oximetry

Author: Di Persia Leandro E.
Larrateguy Luis D.
Milone Diego H.
Schlotthauer Gastón
Publication venue: 'Elsevier BV'
Publication date: 29/05/2014

Detection of desaturations in the pulse oximetry signal is of great importance for the diagnosis of sleep apneas. By counting desaturations, an index can be built to help in the diagnosis of severe cases of obstructive sleep apnea-hypopnea syndrome. It is important to have automatic detection methods that allow screening for this syndrome, reducing the need for expensive polysomnography-based studies. In this paper, a novel recognition method based on the empirical mode decomposition of the pulse oximetry signal is proposed. The desaturations produce a very specific wave pattern that is extracted in the modes of the decomposition. Using this information, a detector based on properly selected thresholds and a set of simple rules is built. The oxygen desaturation index constructed from these detections produces a detector for obstructive sleep apnea-hypopnea syndrome with high sensitivity (0.838) and specificity (0.855) and yields better results than standard desaturation detection approaches. Comment: Accepted in Medical Engineering and Physics
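As a rough illustration of how a desaturation count becomes an index and how a screening detector is scored, the sketch below counts drops in an SpO2 series, converts the count to events per hour, and computes sensitivity and specificity against reference labels. It is not the paper's EMD-based detector: the 3-point drop rule, the 1-point recovery rule, the running-maximum baseline, and all function names are assumptions chosen for illustration.

```python
from typing import Sequence, Tuple


def count_desaturations(spo2: Sequence[float], drop: float = 3.0,
                        recovery: float = 1.0) -> int:
    """Count events where SpO2 falls at least `drop` points below a baseline.

    Assumed rule (not the paper's method): the baseline is the highest value
    seen since the last event ended, and an event ends once the signal is
    back within `recovery` points of that baseline.
    """
    if not spo2:
        return 0
    events = 0
    in_event = False
    baseline = spo2[0]
    for value in spo2:
        if not in_event:
            baseline = max(baseline, value)
            if baseline - value >= drop:
                events += 1
                in_event = True
        elif baseline - value <= recovery:
            in_event = False
            baseline = value
    return events


def desaturation_index(spo2: Sequence[float], sampling_hz: float = 1.0) -> float:
    """Desaturation events per hour of recording."""
    if not spo2:
        raise ValueError("empty recording")
    return count_desaturations(spo2) * 3600.0 * sampling_hz / len(spo2)


def sensitivity_specificity(predicted: Sequence[bool],
                            actual: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity); both classes must be present in `actual`."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    return tp / (tp + fn), tn / (tn + fp)
```

A subject would be flagged when `desaturation_index` exceeds a chosen cut-off; the cut-off itself is a tuning parameter that this sketch does not fix.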

arXiv.org e-Print Archive

CONICET Digital

Voice pathology detection using interlaced derivative pattern on glottal source excitation

Author: Al-nasheri Ahmed
Ali Zulfiqar
Alsulaiman Mansour
Bencherif Mohamed A.
Farahat Mohamed
Malki Khalid H.
Mesallam Tamer A.
Muhammad Ghulam
Publication venue: 'Elsevier BV'
Publication date: 31/01/2017

Ulster University's Research Portal

Linear Classifier with Reject Option for the Detection of Vocal Fold Paralysis and Vocal Fold Edema

Author
Publication venue: Springer
Publication date: 22/09/2009

Springer - Publisher Connector

CNN AND LSTM FOR THE CLASSIFICATION OF PARKINSON'S DISEASE BASED ON THE GTCC AND MFCC

Author: BELHOUSSINE DRISSI Taoufiq
BOUALOULOU Nouhaila
NSIRI Benayad
Publication venue: Lublin University of Technology
Publication date: 30/06/2023

Parkinson's disease is a recognizable clinical syndrome with a variety of causes and clinical presentations; it represents a rapidly growing neurodegenerative disorder. Since about 90 percent of Parkinson's disease sufferers have some form of early speech impairment, recent studies on telediagnosis of Parkinson's disease have focused on the recognition of voice impairments from vowel phonations or the subjects' discourse. In this paper, we present a new approach for Parkinson's disease detection from speech sounds that is based on CNN and LSTM and uses two categories of features, Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Cepstral Coefficients (GTCC), obtained from noise-removed speech signals with comparative EMD-DWT and DWT-EMD analysis. The proposed model is divided into three stages. In the first stage, noise is removed from the signals using the EMD-DWT and DWT-EMD methods. In the second stage, the GTCC and MFCC are extracted from the enhanced audio signals. The classification process is carried out in the third stage by feeding these features into the LSTM and CNN models, which are designed to capture sequential information from the extracted features. The experiments are performed using the PC-GITA and Sakar datasets and the 10-fold cross-validation method. The highest classification accuracy for the Sakar dataset reached 100% for both EMD-DWT-GTCC-CNN and DWT-EMD-GTCC-CNN, and for the PC-GITA dataset the accuracy reached 100% for EMD-DWT-GTCC-CNN and 96.55% for DWT-EMD-GTCC-CNN. The results of this study indicate that GTCC features are more appropriate and accurate for the assessment of PD than MFCC.
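The 10-fold cross-validation protocol mentioned in this abstract can be sketched as follows. This is a minimal, assumption-laden illustration: the function names are hypothetical, the folds are contiguous with no shuffling or class stratification, and the abstract does not say how the authors partitioned their recordings.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices 0..n_samples-1 into k (train, test) pairs.

    Folds are contiguous and differ in size by at most one sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def mean_accuracy(fold_accuracies: Sequence[float]) -> float:
    """Average the per-fold accuracies into one cross-validated score."""
    return sum(fold_accuracies) / len(fold_accuracies)
```

Each sample lands in exactly one test fold, so every recording is used for evaluation once and for training in the remaining folds.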

Lublin University of Technology Journals


A novel framework for high-quality voice source analysis and synthesis

Author: Turajlic Emir
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2006

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The analysis, parameterization and modeling of voice source estimates obtained via inverse filtering of recorded speech are some of the most challenging areas of speech processing, owing to the fact that humans produce a wide range of voice source realizations and that the voice source estimates commonly contain artifacts due to the non-linear time-varying source-filter coupling. Currently, the most widely adopted representation of the voice source signal is the Liljencrants-Fant (LF) model, which was developed in late 1985. Due to its overly simplistic interpretation of voice source dynamics, the LF model cannot represent the fine temporal structure of glottal flow derivative realizations, nor can it carry sufficient spectral richness to facilitate truly natural-sounding speech synthesis. In this thesis we have introduced Characteristic Glottal Pulse Waveform Parameterization and Modeling (CGPWPM), which constitutes an entirely novel framework for voice source analysis, parameterization and reconstruction. In a comparative evaluation of CGPWPM and the LF model, we have demonstrated that the proposed method is able to preserve higher levels of speaker-dependent information from the voice source estimates and realize more natural-sounding speech synthesis. In general, we have shown that CGPWPM-based speech synthesis rates highly on the scale of absolute perceptual acceptability and that speech signals are faithfully reconstructed on a consistent basis across speakers and genders. We have applied CGPWPM to voice quality profiling and to a text-independent voice quality conversion method. The proposed voice conversion method is able to achieve the desired perceptual effects, and the modified speech remained as natural-sounding and intelligible as natural speech.
In this thesis, we have also developed an optimal wavelet thresholding strategy for voice source signals which is able to suppress aspiration noise and still retain both the slow and the rapid variations in the voice source estimate.

Brunel University Research Archive

An Investigation of Multidimensional Voice Program Parameters in Three Different Databases for Voice Pathology Detection and Classification

Author: Al-nasheri Ahmed
Ali Zulfiqar
Alsulaiman Mansour
Bencherif Mohamed A
Farahat Mohamed
Malki Khalid H
Mesallam Tamer A
Muhammad Ghulam
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

Background and Objective: Automatic voice-pathology detection and classification systems may help clinicians to detect the existence of any voice pathologies and the type of pathology from which patients suffer in the early stages. The main aim of this paper is to investigate Multidimensional Voice Program (MDVP) parameters to automatically detect and classify the voice pathologies in multiple databases, and then to find out which parameters performed well in these two processes. Materials and Methods: Samples of the sustained vowel /a/ of normal and pathological voices were extracted from three different databases, which have three voice pathologies in common. The selected databases in this study represent three distinct languages: (1) the Arabic voice pathology database; (2) the Massachusetts Eye and Ear Infirmary database (English database); and (3) the Saarbruecken Voice Database (German database). A computerized speech lab program was used to extract MDVP parameters as features, and an acoustical analysis was performed. The Fisher discrimination ratio was applied to rank the parameters. A t test was performed to highlight any significant differences in the means of the normal and pathological samples. Results: The experimental results demonstrate a clear difference in the performance of the MDVP parameters using these databases. The highly ranked parameters also differed from one database to another. The best accuracies were obtained by using the three highest ranked MDVP parameters arranged according to the Fisher discrimination ratio: these accuracies were 99.68%, 88.21%, and 72.53% for the Saarbruecken Voice Database, the Massachusetts Eye and Ear Infirmary database, and the Arabic voice pathology database, respectively.
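The ranking step described in this abstract can be illustrated with a short sketch. The abstract does not give the formula, so the code assumes the common two-class form of the Fisher discrimination ratio, the squared difference of class means divided by the sum of class variances; the function names and the parameter names in the usage note are hypothetical.

```python
from statistics import mean, variance
from typing import Dict, List, Sequence, Tuple


def fisher_ratio(normal: Sequence[float], pathological: Sequence[float]) -> float:
    """Two-class Fisher discrimination ratio (assumed form):
    (mean_normal - mean_pathological)^2 / (var_normal + var_pathological).
    Uses sample variances, so each class needs at least two values.
    """
    spread = variance(normal) + variance(pathological)
    if spread == 0:
        raise ValueError("both classes have zero variance")
    return (mean(normal) - mean(pathological)) ** 2 / spread


def rank_features(normal: Dict[str, Sequence[float]],
                  pathological: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Rank parameters by Fisher ratio, most discriminative first."""
    scores = {name: fisher_ratio(normal[name], pathological[name]) for name in normal}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

For example, `rank_features({"param_a": [1.0, 2.0, 3.0]}, {"param_a": [4.0, 5.0, 6.0]})` scores one illustrative parameter; keeping the first three entries of the ranked list mirrors the "three highest ranked parameters" selection mentioned in the abstract.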

University of Essex Research Repository

Ulster University's Research Portal

A novel hybrid method for vocal fold pathology diagnosis based on Russian language

Author: V. Majidnezhad
Publication venue: 'International Ocean Discovery Program (IODP)'
Publication date: 01/07/2014

In this paper, an initial feature vector for vocal fold pathology diagnosis is first proposed. Then, a genetic algorithm is proposed for optimizing the initial feature vector. Experiments are carried out to evaluate and compare the classification accuracies obtained with different classifiers (an ensemble of decision trees, discriminant analysis and K-nearest neighbours) and different feature vectors (the initial and the optimized ones). Finally, a hybrid of the ensemble of decision trees and the genetic algorithm is proposed for vocal fold pathology diagnosis based on the Russian language. The experimental results show better performance (higher classification accuracy and lower response time) for the proposed method in comparison with the others. While a pure decision tree yields a classification accuracy of 85.4% for vocal fold pathology diagnosis based on the Russian language, the proposed method yields an improvement of 8.5 percentage points (an accuracy of 93.9%).

Directory of Open Access Journals

Glottal Source biometrical signature for voice pathology detection

Author: Agustín Álvarez-Marquina
Juan Ignacio Godino-Llorente
Luis Miguel Mazaira-Fernández
Pedro Gómez-Vilda
Rafael Martínez-Olalla
Roberto Fernández-Baillo
Victoria Rodellar-Biarge
Víctor Nieto Lluis
Publication venue: 'Elsevier BV'
Publication date

Crossref

Assessment of vocal cord nodules: A case study in speech processing by using Hilbert-Huang Transform

Author: C M Filosi
C Surace
K Worden
M Civera
M Silvestrini
N M Pugno
Publication venue: 'IOP Publishing'
Publication date: 02/06/2017

Vocal cord nodules are a pathological condition in which unnatural masses grow on the patient's vocal folds. Among other effects, changes in the vocal cords' overall mass and stiffness alter their vibratory behaviour, thus changing the vocal emission they generate. This causes dysphonia, i.e. abnormalities in the patient's voice, which can be analysed and inspected via audio signals. However, the evaluation of voice condition through speech processing is not a trivial task, as standard methods based on the Fourier Transform fail to fit the non-stationary nature of vocal signals. In this study, four audio tracks, provided by a volunteer patient whose vocal fold nodules have been surgically removed, were analysed using a relatively new technique: the Hilbert-Huang Transform (HHT) via Empirical Mode Decomposition (EMD); specifically, by using the CEEMDAN (Complete Ensemble EMD with Adaptive Noise) algorithm. This method has been applied here to speech signals, which were recorded before removal surgery and during convalescence, to investigate specific trends. Possibilities offered by the HHT are presented, but some limitations of decomposing the signals into so-called intrinsic mode functions (IMFs) are also highlighted. The results of these preliminary studies are intended to be a basis for the development of new viable alternatives to the software currently used for the analysis and evaluation of pathological voice.

Crossref

Queen Mary Research Online

White Rose Research Online