16 research outputs found

Length analysis of speech to be recorded in the recognition of Parkinson's disease

Author: Jenei Attila Zoltán
Sztahó Dávid
Publication date: 01/01/2022

According to present clinical knowledge, Parkinson's disease is an incurable neurodegenerative disease. It is diagnosed mostly by exclusion tests. Numerous studies have confirmed that speech is a promising indicator of the presence of the disease. However, only a few studies discuss the appropriate length of the speech sample or the contribution of parts of the full-length recordings to the classification. Hence, we partitioned each original recording into four shorter samples. We trained linear and radial basis function (RBF) kernel Support Vector Machine (SVM) models separately for the original recordings, for each partitioned group, and for all partitioned samples together. We found no significant difference between the results of the RBF kernel models. However, we obtained significantly better results with a portion of the entire speech using linear kernel models. In conclusion, even a shorter piece of a longer speech recording may be adequate for classification.
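The partitioning step described in this abstract can be illustrated with a minimal sketch. It assumes a recording is already available as a flat sequence of samples and that the four parts are contiguous and of near-equal length; the abstract does not say how the authors cut their recordings, and the function name `split_into_parts` is hypothetical.

```python
from typing import List, Sequence


def split_into_parts(samples: Sequence[float], n_parts: int = 4) -> List[List[float]]:
    """Split a recording into n_parts contiguous segments of near-equal length.

    Assumption: equal-length contiguous segments; the source abstract only
    says each recording was partitioned into four shorter samples.
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    if len(samples) < n_parts:
        raise ValueError("recording is shorter than the number of parts")
    base, extra = divmod(len(samples), n_parts)
    parts: List[List[float]] = []
    start = 0
    for i in range(n_parts):
        # The first `extra` segments absorb one leftover sample each.
        end = start + base + (1 if i < extra else 0)
        parts.append(list(samples[start:end]))
        start = end
    return parts


# Stand-in for audio samples; a real recording would hold many thousands.
recording = [0.1 * i for i in range(10)]
segments = split_into_parts(recording)
print([len(s) for s in segments])
```

Each segment could then be passed through the same feature extraction and classifier as the full recording, which is the comparison the abstract reports.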

University of Szeged

CNN AND LSTM FOR THE CLASSIFICATION OF PARKINSON'S DISEASE BASED ON THE GTCC AND MFCC

Author: BELHOUSSINE DRISSI Taoufiq
BOUALOULOU Nouhaila
NSIRI Benayad
Publication venue: Lublin University of Technology
Publication date: 30/06/2023

Parkinson's disease is a recognizable clinical syndrome with a variety of causes and clinical presentations; it is a rapidly growing neurodegenerative disorder. Since about 90 percent of Parkinson's disease sufferers have some form of early speech impairment, recent studies on telediagnosis of Parkinson's disease have focused on recognizing voice impairments from vowel phonations or the subjects' discourse. In this paper, we present a new approach for detecting Parkinson's disease from speech sounds that is based on CNN and LSTM and uses two categories of features, Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Cepstral Coefficients (GTCC), obtained from noise-removed speech signals, with a comparative analysis of EMD-DWT and DWT-EMD. The proposed model is divided into three stages. In the first stage, noise is removed from the signals using the EMD-DWT and DWT-EMD methods. In the second stage, the GTCC and MFCC are extracted from the enhanced audio signals. In the third stage, classification is carried out by feeding these features into the LSTM and CNN models, which are designed to capture sequential information from the extracted features. The experiments are performed using the PC-GITA and Sakar datasets and 10-fold cross-validation. The highest classification accuracy for the Sakar dataset reached 100% for both EMD-DWT-GTCC-CNN and DWT-EMD-GTCC-CNN; for the PC-GITA dataset, the accuracy reached 100% for EMD-DWT-GTCC-CNN and 96.55% for DWT-EMD-GTCC-CNN. The results of this study indicate that GTCC features are more appropriate and accurate for the assessment of PD than MFCC.
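The 10-fold cross-validation mentioned in this abstract can be sketched as follows. This is only an illustration of how the folds are formed: the helper name `k_fold_indices` is hypothetical, the folds are contiguous and unshuffled for simplicity, and nothing here reproduces the authors' denoising, feature extraction, or CNN/LSTM models.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds of near-equal size.

    Assumption: no shuffling or stratification; the abstract does not say
    how the folds were drawn.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds hold one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Every sample lands in exactly one test fold across the k rounds.
for train_idx, test_idx in k_fold_indices(25, k=10):
    print(len(train_idx), len(test_idx))
```

In each round a model would be trained on the `train` indices and scored on the `test` indices, and the reported accuracy is typically the mean over the ten rounds.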

Lublin University of Technology Journals

A Review of the Assessment Methods of Voice Disorders in the Context of Parkinson's Disease

Author: Abdelilah A.
Benba A.
Hammouch A.
Publication venue: Journal of Telecommunication, Electronic and Computer Engineering (JTEC)
Publication date: 01/12/2016

In recent years, significant progress has been made in research dedicated to the treatment of disabilities. This is particularly true for neurological diseases, which generally affect the system that controls the execution of learned motor patterns. In addition to its importance for communication with the outside world and interaction with others, the voice is a reflection of our personality, moods and emotions. It provides information on health status, shape, intentions, age and even the social environment. It is a working tool for many, and an important element of life for all. Patients with Parkinson’s disease (PD) are numerous, and they suffer from hypokinetic dysarthria, which manifests in all aspects of speech production: respiration, phonation, articulation, nasalization and prosody. This paper reviews the methods for assessing speech disorders in the context of PD and discusses their limitations.

Universiti Teknikal Malaysia Melaka: UTeM Open Journal System

IT diagnostics of Parkinson's disease based on voice markers and decreased motor activity

Author: Vishniakou U. A.
Yiwei X.
Publication venue: BNTU

The objective of the article is to propose a method for complex recognition of Parkinson's disease using machine learning, based on markers of voice analysis and changes in patient movements on known datasets. The time-frequency function (the wavelet function) and the Meyer cepstral coefficient function are used. The KNN algorithm and a two-layer neural network algorithm were used for training and testing on publicly available datasets on speech changes and slowing of movement in Parkinson's disease. A Bayesian optimizer was also used to improve the hyperparameters of the KNN algorithm. The constructed models achieved an accuracy of 94.7% on a dataset on speech changes in patients with Parkinson's disease and 96.2% on a dataset on slowing of patients' movements. The recognition results are close to the world level. The proposed technique is intended for use in the subsystem of IT diagnostics of nervous diseases.
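For readers unfamiliar with the KNN algorithm named in this abstract, a minimal sketch of k-nearest-neighbors classification follows. It assumes Euclidean distance and majority voting, which are common defaults but are not stated in the abstract; the function name `knn_predict`, the two-dimensional feature vectors, and the labels are made up for illustration and are not the authors' data.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def knn_predict(
    train_x: List[Sequence[float]],
    train_y: List[Hashable],
    query: Sequence[float],
    k: int = 3,
) -> Hashable:
    """Classify `query` by majority vote among its k nearest training points.

    Assumption: Euclidean distance; ties go to the label seen first among
    the nearest neighbors.
    """
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort training indices by distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy feature vectors standing in for extracted voice features (hypothetical).
features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["healthy", "healthy", "healthy", "pd", "pd", "pd"]
print(knn_predict(features, labels, (0.5, 0.5)))
print(knn_predict(features, labels, (5.5, 5.5)))
```

The value of `k` is the main hyperparameter of this classifier, which is why the abstract mentions tuning it with an optimizer.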

Repository of Belarusian National Technical University (BNTU)

Recognition of signs of Parkinson's disease based on the analysis of voice markers and motor activity

Author: Вишняков В. А.
Ся Ивэй
Publication venue: United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Publication date: 01/01/2023

The problem of IT diagnostics of signs of Parkinson's disease is addressed by analyzing changes in the voice and slowing of movement in patients. The relevance of the problem stems from the need for early diagnosis of the disease. A method for complex recognition of Parkinson's disease using machine learning is proposed, based on the analysis of voice markers and changes in patients' movements on known datasets. Methods. A time-frequency function (the wavelet function), the Meyer cepstral coefficient function, the k-nearest neighbors (KNN) algorithm, and a two-layer neural network algorithm are used for training and testing on publicly available datasets on speech changes and slowing of movement in Parkinson's disease, along with a Bayesian optimizer for improving the hyperparameters of the KNN algorithm. Results. The KNN algorithm was used to recognize patients' speech; a test accuracy of 94.7% was achieved in diagnosing Parkinson's disease from voice changes. The Bayesian neural network algorithm was applied to recognize the slowing of patients' movements and gave a test accuracy of 96.2%. Conclusion. The obtained results of recognizing signs of Parkinson's disease are close to the world level. On the same dataset on patients' speech changes, one of the best results of foreign studies is 95.8%, and on the dataset on slowing of patients' movements, 98.8%. The authors' proposed technique is intended for use in the subsystem of IT diagnostics of neurological diseases of a smart city.

Belarusian State University of Informatics and Radioelectronics Repository

Recognition of signs of Parkinson's disease based on the analysis of voice markers and motor activity

Author: U. A. Vishniakou
Xia Yiwei
В. А. Вишняков
Ся Ивэй
Publication venue: UIIP NASB
Publication date: 29/09/2023

Objectives. The problem of IT diagnostics of signs of Parkinson's disease is addressed by analyzing changes in the voice and slowing of movement in patients. The urgency of the task stems from the need for early diagnosis of the disease. A method for complex recognition of Parkinson's disease using machine learning is proposed, based on markers of voice analysis and changes in the patient's movements on known datasets. Methods. The time-frequency function (the wavelet function), the Meyer cepstral coefficient function, the k-nearest neighbors (KNN) algorithm, and a two-layer neural network algorithm are used for training and testing on publicly available datasets on speech changes and slowing of movement in Parkinson's disease. A Bayesian optimizer is also used to improve the hyperparameters of the KNN algorithm. Results. The KNN algorithm was used to recognize patients' speech; a test accuracy of 94.7% was achieved in diagnosing Parkinson's disease from voice changes. The Bayesian neural network algorithm was applied to recognize the slowing of patients' movements and gave a test accuracy of 96.2% for the diagnosis of Parkinson's disease. Conclusion. The obtained results of recognizing signs of Parkinson's disease are close to the world level. On the same dataset on patients' speech changes, one of the best results of foreign studies is 95.8%. On the same dataset on slowing of movement, one of the best results of foreign researchers is 98.8%. The authors' proposed technique is intended for use in the subsystem of IT diagnostics of neurological diseases of a smart city.
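The idea of tuning the KNN hyperparameter can be sketched as below. Note the substitution: the abstract reports a Bayesian optimizer, whereas this sketch uses a plain exhaustive search over a few candidate values of k scored by leave-one-out accuracy, chosen only because it is short and dependency-free. All names (`knn_predict`, `loo_accuracy`, `best_k`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def loo_accuracy(xs, ys, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, xs[i], k) == ys[i]
    return hits / len(xs)


def best_k(xs, ys, candidates):
    """Exhaustive search (a stand-in for Bayesian optimization, not the same method)."""
    # max() keeps the first candidate that reaches the top score.
    return max(candidates, key=lambda k: loo_accuracy(xs, ys, k))


# Toy two-cluster data standing in for voice features (hypothetical).
xs = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
ys = ["healthy", "healthy", "healthy", "pd", "pd", "pd"]
for k in (1, 3, 5):
    print(k, loo_accuracy(xs, ys, k))
print("chosen k:", best_k(xs, ys, [1, 3, 5]))
```

A Bayesian optimizer would explore the same kind of objective (validation accuracy as a function of the hyperparameters) but would choose which settings to try next from a surrogate model rather than trying every candidate.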

Informatics (E-Journal) / Информатика

A Review on Human-Computer Interaction and Intelligent Robots

Author: Bao Yanwei
Ren Fuji
Publication venue: World Scientific Publishing House Ltd
Publication date: 26/10/2020

In the field of artificial intelligence, human–computer interaction (HCI) technology and its related intelligent robot technologies are essential and interesting research topics. From the perspective of both software algorithms and hardware systems, these technologies aim to build a natural HCI environment. The purpose of this research is to provide an overview of HCI and intelligent robots. This research highlights the existing technologies of listening, speaking, reading, writing, and other senses, which are widely used in human interaction. Based on these technologies, this research introduces some intelligent robot systems and platforms. This paper also forecasts some vital challenges in researching HCI and intelligent robots. The authors hope that this work will help researchers in the field to acquire the information and technologies necessary to conduct more advanced research.

Tokushima University Institutional Repository

Your Voice Gave You Away: the Privacy Risks of Voice-Inferred Information

Author: Ritter Emma
Publication venue: Duke University School of Law
Publication date: 18/11/2021

Our voices can reveal intimate details about our lives. Yet, many privacy discussions have focused on the threats from speaker recognition and speech recognition. This Note argues that this focus overlooks another privacy risk: voice-inferred information. This term describes non-obvious information drawn from voice data through a combination of machine learning, artificial intelligence, data mining, and natural language processing. Companies have latched onto voice-inferred information. Early adopters have applied the technology in situations as varied as lending risk analysis and hiring. Consumers may balk at such strategies, but the current United States privacy regime leaves voice insights unprotected. By applying a notice and consent privacy model via sector-specific statutes, the hodgepodge of U.S. federal privacy laws allows voice-inferred information to slip through the regulatory cracks. This Note reviews the current legal landscape and identifies existing gaps. It then suggests two solutions that balance voice privacy with technological innovation: purpose-based consent and independent data review boards. The first bolsters voice protection within the traditional notice and consent framework, while the second imagines a new protective scheme. Together, these solutions complement each other to afford the human voice the protection it deserves.

bepress Legal Repository

Duke Law Scholarship Repository

Privacy-Protecting Techniques for Behavioral Data: A Survey

Author: Arias-Cabarcos Patricia
Hanisch Simon
Parra-Arnau Javier
Strufe Thorsten
Publication venue: arxiv
Publication date: 12/11/2021

Our behavior (the way we talk, walk, or think) is unique and can be used as a biometric trait. It also correlates with sensitive attributes like emotions. Hence, techniques to protect individuals' privacy against unwanted inferences are required. To consolidate knowledge in this area, we systematically reviewed applicable anonymization techniques. We taxonomize and compare existing solutions regarding privacy goals, conceptual operation, advantages, and limitations. Our analysis shows that some behavioral traits (e.g., voice) have received much attention, while others (e.g., eye-gaze, brainwaves) are mostly neglected. We also find that the evaluation methodology of behavioral anonymization techniques can be further improved.

KITopen

Adaptation of Speaker and Speech Recognition Methods for the Automatic Screening of Speech Disorders using Machine Learning

Author: Egas López José Vicente

This PhD thesis presents methods for exploiting the non-verbal communication of individuals suffering from specific diseases or health conditions, with the aim of screening them automatically. More specifically, we employed one of the pillars of non-verbal communication, paralanguage, to explore techniques that could be utilized to model the speech of subjects. Paralanguage is a non-lexical component of communication that relies on intonation, pitch, speed of talking, and other cues, which can be processed and analyzed automatically. This is called Computational Paralinguistics, which can be defined as the study of modeling non-verbal latent patterns within the speech of a speaker by means of computational algorithms; these patterns go beyond the linguistic approach. By means of machine learning, we present models from distinct scenarios of both paralinguistics and pathological speech which are capable of automatically estimating the health status of a subject for a given disease such as Alzheimer's, Parkinson's, and clinical depression, among others.

SZTE Doktori Értekezések Repozitórium (SZTE Repository of Dissertations)