20 research outputs found

    Novel speech signal processing algorithms for high-accuracy classification of Parkinson's disease

    There has been considerable recent research into the connection between Parkinson's disease (PD) and speech impairment. Recently, a wide range of speech signal processing algorithms (dysphonia measures) aiming to predict PD symptom severity using speech signals have been introduced. In this paper, we test how accurately these novel algorithms can be used to discriminate PD subjects from healthy controls. In total, we compute 132 dysphonia measures from sustained vowels. Then, we select four parsimonious subsets of these dysphonia measures using four feature selection algorithms, and map these feature subsets to a binary classification response using two statistical classifiers: random forests and support vector machines. We use an existing database consisting of 263 samples from 43 subjects, and demonstrate that these new dysphonia measures can outperform state-of-the-art results, reaching almost 99% overall classification accuracy using only ten dysphonia features. We find that some of the recently proposed dysphonia measures complement existing algorithms in maximizing the ability of the classifiers to discriminate healthy controls from PD subjects. We see these results as an important step toward noninvasive diagnostic decision support in PD

    High accuracy discrimination of Parkinson's disease participants from healthy controls using smartphones

    The aim of this study is to accurately distinguish Parkinson's disease (PD) participants from healthy controls using self-administered tests of gait and postural sway. Using consumer-grade smartphones with in-built accelerometers, we objectively measure and quantify key movement severity symptoms of Parkinson's disease. Specifically, we record tri-axial accelerations, and extract a range of different features based on the time and frequency-domain properties of the acceleration time series. The features quantify key characteristics of the acceleration time series, and enhance the underlying differences in the gait and postural sway accelerations between PD participants and controls. Using a random forest classifier, we demonstrate an average sensitivity of 98.5% and average specificity of 97.5% in discriminating PD participants from controls
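    The two figures reported above are per-class rates: sensitivity is the fraction of PD participants correctly flagged, and specificity is the fraction of controls correctly cleared. As a minimal sketch (the function name and the toy labels are illustrative assumptions, not taken from the study), they can be computed from predicted and true labels like this:

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary labelling.

    `positive` marks the class treated as "PD"; every other label
    is treated as "control". Hypothetical helper for illustration.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Toy example: 4 PD participants (1) and 4 controls (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(sensitivity_specificity(y_true, y_pred))  # (0.75, 0.75)
```

    Reporting both rates, rather than overall accuracy alone, shows separately how often patients are missed and how often controls are falsely flagged.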

    Voice Assessments for Detecting Patients with Parkinson’s Diseases in Different Stages

    Recently, a wide range of speech signal processing algorithms (dysphonia measures) aiming to detect patients with Parkinson’s disease (PD) have been introduced. We computed 19 dysphonia measures from sustained vowels in 375 voice samples collected from healthy subjects and people with PD. All features are analysed, and the most relevant ones are selected by principal component analysis (PCA) to classify the subjects into 4 classes according to their UPDRS (Unified Parkinson’s Disease Rating Scale) score. We used k-fold cross-validation with k=4, so that each round trains on 75% of the data and tests on the remaining 25%, together with support vector machines (SVM) with different kernel types. The best result obtained was 92.5%, using PCA and the linear SVM
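    The 75%/25% split follows directly from k=4: each of the four folds serves once as the test set while the other three are used for training. A minimal sketch of that partitioning is shown below; the helper name is hypothetical, and contiguous unshuffled folds are an assumption, since the abstract does not say how samples were assigned to folds.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds.

    Fold sizes differ by at most one when n_samples is not divisible by k.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# 375 samples, as in the abstract, split into 4 folds.
for train_idx, test_idx in k_fold_indices(375, 4):
    print(len(train_idx), len(test_idx))
```

    In practice the samples would usually be shuffled or stratified by class first, and recordings from the same speaker kept in the same fold, so that test performance is not inflated by speaker overlap.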

    Effective Detection of Parkinson’s Disease at Different Stages using Measurements of Dysphonia

    This paper addresses the problem of multiclass classification of Parkinson’s disease from characteristic features of a person’s voice. We computed 22 dysphonia measures from 375 voice samples of healthy subjects and people with PD. We used the particle swarm optimization (PSO) feature selection method, with random forest and linear discriminant analysis (LDA) and 4-fold cross-validation, to classify the subjects into 4 classes according to the severity of symptoms, reaching a classification accuracy of 95.2%. Promisingly, the proposed diagnosis system might serve as a powerful tool for diagnosing PD, and could also be extended to other voice pathologies

    Automatic detection of speech disorder in dysarthria using extended speech feature extraction and neural networks classification

    This paper presents automatic detection of dysarthria, a motor speech disorder, using extended speech features called Centroid Formants. Centroid Formants are weighted averages of the formants extracted from a speech signal: the first four formants of the signal are extracted and their weighted values are averaged. The weights are determined by the peak energies of the frequency resonance bands (the formants). In our proposed methodology, these Centroid Formants are used to automatically detect dysarthric speech using a neural network classification technique. The experimental results recorded after testing this algorithm are presented. The experimental data consists of 200 speech samples from 10 dysarthric speakers and 200 speech samples from 10 age-matched healthy speakers. The experimental results show high performance using neural network classification. A possible direction for future research is the use of these extended features in speaker identification and recognition of disordered speech
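    The abstract describes a Centroid Formant as an energy-weighted average of formant frequencies. One plausible reading of that description is sketched below; the function name, the single-centroid form, and the example values are assumptions for illustration, not the authors' exact definition.

```python
def centroid_formant(formants_hz, peak_energies):
    """Energy-weighted average of formant frequencies.

    formants_hz   -- formant frequencies in Hz (e.g. F1..F4)
    peak_energies -- non-negative peak energy of each formant band,
                     used as the weight for that formant
    """
    if len(formants_hz) != len(peak_energies):
        raise ValueError("need one energy value per formant")
    total_energy = sum(peak_energies)
    if total_energy <= 0:
        raise ValueError("total energy must be positive")
    weighted_sum = sum(f * e for f, e in zip(formants_hz, peak_energies))
    return weighted_sum / total_energy


# Made-up values for the first four formants of one frame.
formants = [500.0, 1500.0, 2500.0, 3500.0]
energies = [4.0, 3.0, 2.0, 1.0]
print(centroid_formant(formants, energies))  # 1500.0
```

    Because the weights are the band energies, the centroid is pulled toward the formants that carry the most energy in a given frame.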

    Prediction of Parkinson's disease using convolutional neural networks

    Parkinson's disease (PD) is a neurodegenerative disorder of the nervous system, of unknown cause and with a chronic, progressive and irreversible course. It is currently assumed that the pathophysiological changes that make the symptoms of the disease apparent are not visible until at least four years after its onset. For this reason, alternative methods are being sought that allow the disease to be detected early. Since speech impairment is one of the symptoms of the disease, it may yield a biomarker for early diagnosis and monitoring of the disease. This work proposes a study based on deep learning of spectrograms obtained from voice signals recorded with mobile phones. The objective is to contribute to the diagnosis of PD, while also contributing to knowledge of the voice characteristics affected by the disease. To this end, a database will be created containing spectrograms of the audio segments that best characterize the voice of PD patients. Convolutional neural network models with different architectures will be developed to distinguish PD patients from healthy subjects, using validation appropriate to the characteristics of these data. Databases and Data Mining. Red de Universidades con Carreras en Informátic

    Behavioral Indicators on a Mobile Sensing Platform Predict Clinically Validated Psychiatric Symptoms of Mood and Anxiety Disorders

    Background: There is a critical need for real-time tracking of behavioral indicators of mental disorders. Mobile sensing platforms that objectively and noninvasively collect, store, and analyze behavioral indicators have not yet been clinically validated or scalable. Objective: The aim of our study was to report on models of clinical symptoms for post-traumatic stress disorder (PTSD) and depression derived from a scalable mobile sensing platform. Methods: A total of 73 participants (67% [49/73] male, 48% [35/73] non-Hispanic white, 33% [24/73] veteran status) who reported at least one symptom of PTSD or depression completed a 12-week field trial. Behavioral indicators were collected through the noninvasive mobile sensing platform on participants’ mobile phones. Clinical symptoms were measured through validated clinical interviews with a licensed clinical social worker. A combined hypothesis- and data-driven approach was used to derive key features for modeling symptoms, including the sum of outgoing calls, count of unique numbers texted, absolute distance traveled, dynamic variation of the voice, speaking rate, and voice quality. Participants also reported ease of use and data sharing concerns. Results: Behavioral indicators predicted clinically assessed symptoms of depression and PTSD (cross-validated area under the curve [AUC] for depressed mood=.74, fatigue=.56, interest in activities=.75, and social connectedness=.83). Participants reported comfort sharing individual data with physicians (Mean 3.08, SD 1.22), mental health providers (Mean 3.25, SD 1.39), and medical researchers (Mean 3.03, SD 1.36). Conclusions: Behavioral indicators passively collected through a mobile sensing platform predicted symptoms of depression and PTSD. The use of mobile sensing platforms can provide clinically validated behavioral indicators in real time; however, further validation of these models and this platform in large clinical samples is needed. United States Defense Advanced Research Projects Agency (contract N66001-11-C-4094)

    Voice Analysis and Classification System Based on Perturbation Parameters and Cepstral Presentation in Psychoacoustic Scales

    The paper describes an approach to designing a system for the analysis and classification of a voice signal based on perturbation parameters and cepstral representation. Two variants of the cepstral representation of the voice signal are considered: one based on mel-frequency cepstral coefficients (MFCC) and one based on bark-frequency cepstral coefficients (BFCC). The work used the generally accepted approach to calculating the MFCC, based on time-frequency analysis by the discrete Fourier transform (DFT) with summation of energy in subbands. This method approximates the frequency resolution of human hearing, but has a fixed temporal resolution. As an alternative, a variant of the cepstral representation based on the BFCC is proposed. To calculate the BFCC, a warped DFT-modulated filter bank was used, which approximates both the frequency and the temporal resolution of hearing. The aim of the work was to compare the effectiveness of MFCC-based and BFCC-based features for designing voice signal analysis and classification systems. The experimental results showed that acoustic features based on the MFCC yield a voice classification system with an average recall of 80.6%, while features based on the BFCC yield 83.7%. When the MFCC feature set was supplemented with perturbation parameters of the voice, the average recall increased to 94.1%; with the same addition to the BFCC feature set, it increased to 96.7%

    Voice signal analysis and classification system based on perturbation parameters and cepstral representation in psychoacoustic scales

    The paper describes an approach to designing a system for the analysis and classification of a voice signal based on perturbation parameters and cepstral representation. Two variants of the cepstral representation of the voice signal are considered: one based on mel-frequency cepstral coefficients (MFCC) and one based on bark-frequency cepstral coefficients (BFCC). The work used the generally accepted approach to calculating the MFCC, based on time-frequency analysis by the discrete Fourier transform (DFT) with summation of energy in subbands. This method approximates the frequency resolution of human hearing, but has a fixed temporal resolution. As an alternative, a variant of the cepstral representation based on the BFCC is proposed. To calculate the BFCC, a warped DFT-modulated filter bank was used, which approximates both the frequency and the temporal resolution of hearing. The aim of the work was to compare the effectiveness of MFCC-based and BFCC-based features for designing voice signal analysis and classification systems. The experimental results showed that acoustic features based on the MFCC yield a voice classification system with an average recall of 80.6%, while features based on the BFCC yield 83.7%. When the MFCC feature set was supplemented with perturbation parameters of the voice, the average recall increased to 94.1%; with the same addition to the BFCC feature set, it increased to 96.7%

    It Sounds Like You Have a Cold! Testing Voice Features for the Interspeech 2017 Computational Paralinguistics Cold Challenge

    This paper describes an evaluation of four different voice feature sets for detecting symptoms of the common cold in speech as part of the Interspeech 2017 Computational Paralinguistics Challenge. The challenge corpus consists of 630 speakers in three partitions, of which approximately one third had a “severe” cold at the time of recording. Success on the task is measured in terms of unweighted average recall of cold/not-cold classification from short extracts of the recordings. In this paper we review previous voice features used for studying changes in health and devise four basic types of features for evaluation: voice quality features, vowel spectra features, modulation spectra features, and spectral distribution features. The evaluation shows that each feature set provides some useful information to the task, with features from the modulation spectrogram being most effective. Feature-level fusion of the feature sets shows small performance improvements on the development test set. We discuss the results in terms of the most suitable features for detecting symptoms of cold and address issues arising from the design of the challenge
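    Unweighted average recall (UAR), the success measure named above, is the mean of the per-class recalls, so each class counts equally regardless of how many recordings it has. A minimal sketch follows; the function name and the toy labels are illustrative assumptions, not challenge data.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        n_class = sum(1 for t in y_true if t == c)
        n_hit = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(n_hit / n_class)
    return sum(recalls) / len(recalls)


# Imbalanced toy set: a classifier that always answers "healthy"
# is right 80% of the time but only reaches chance-level UAR.
y_true = ["cold"] * 2 + ["healthy"] * 8
y_pred = ["healthy"] * 10
print(unweighted_average_recall(y_true, y_pred))  # 0.5
```

    This is why UAR suits a corpus in which only about one third of speakers have a cold: a majority-class guesser gains nothing from the imbalance.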