11 research outputs found

    Acoustic analysis of gender-related patterns in Parkinson's disease

    This bachelor's thesis investigates gender-related differences in the effects of hypokinetic dysarthria in Parkinson's disease by analysing one speech task: reading a standardized passage. Parkinson's disease manifests in all subsystems involved in speech production (respiration, phonation, articulation and prosody). The aims of the thesis are to become familiar with the symptoms of this disorder and with the speech parameters it affects. The thesis describes the preprocessing and parametrization of the speech signal and the subsequent statistical analysis of selected parameters. The system for processing the speech recordings is implemented in MATLAB.

    Degree of Parkinson's Disease Severity Estimation Based on Speech Signal Processing

    This paper deals with Parkinson's disease (PD) severity estimation according to the Unified Parkinson's Disease Rating Scale, motor subscale (UPDRS III), which quantifies the hallmark symptoms of PD, using acoustic analysis of speech signals. The experimental dataset comprised 42 speech tasks acquired from 50 PD patients (UPDRS III ranged from 6 to 92). It was divided into subsets: words, sentences, reading text, monologue and diadochokinetic tasks. We parametrized the whole corpus and each of these groups separately using a wide range of conventional and novel speech features. We used the guided regularized random forest algorithm to select the features carrying maximum clinical information and performed random forest regression to estimate PD severity. Significant correlations between the true UPDRS III scores and the scores predicted by the proposed methodology show that information extracted from a variety of speech tasks can be used to estimate the degree of PD severity.

    Neuromechanical Modelling of Articulatory Movements from Surface Electromyography and Speech Formants

    Speech articulation is produced by the movements of muscles in the larynx, pharynx, mouth and face. Speech therefore shows acoustic features, such as formants, which are directly related to the neuromotor actions of these muscles. The first two formants are strongly related to jaw and tongue muscular activity. Speech can be used as a simple and ubiquitous signal, easy to record and process, either locally or on e-Health platforms. This may open a wide set of applications in the functional grading and monitoring of neurodegenerative diseases. A relevant question, in this sense, is how closely speech correlates and neuromotor actions are related. This preliminary study seeks answers to this question using surface electromyographic recordings on the masseter and the acoustic kinematics related to the first formant. The study shows that relevant correlations can be found between the surface electromyographic activity (dynamic muscle behavior) and the positions and first derivatives of the first formant (kinematic variables related to the vertical velocity and acceleration of the joint jaw-tongue biomechanical system). As an application example, it is shown that the probability density function associated with these kinematic variables is more sensitive than classical features such as Vowel Space Area (VSA) or Formant Centralization Ratio (FCR) in characterizing neuromotor degeneration in Parkinson's Disease. This work is being funded by Grants TEC2016-77791-C4-4-R from the Ministry of Economic Affairs and Competitiveness of Spain, Teka-Park 55 02 CENIE-0348_CIE_6_E POCTEP (InterReg Programme) and 16-30805A, SIX Research Center (CZ.1.05/2.1.00/03.0072), and LO1401 from the Czech Republic Government.
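    The two classical features named in this abstract, VSA and FCR, can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the commonly used triangular VSA over the corner vowels /i/, /a/, /u/ and the usual FCR definition, (F2u + F2a + F1i + F1u) / (F2i + F1a). The function names and the formant values in the usage lines are hypothetical, chosen only to show the direction of the effect.

```python
def triangular_vsa(f1_i, f2_i, f1_a, f2_a, f1_u, f2_u):
    """Area (Hz^2) of the /i/-/a/-/u/ triangle in the F1-F2 plane (shoelace formula)."""
    return 0.5 * abs(
        f1_i * (f2_a - f2_u) + f1_a * (f2_u - f2_i) + f1_u * (f2_i - f2_a)
    )


def formant_centralization_ratio(f1_i, f2_i, f1_a, f2_a, f1_u, f2_u):
    """FCR: grows as the corner vowels move toward the centre of the vowel space."""
    return (f2_u + f2_a + f1_i + f1_u) / (f2_i + f1_a)


# Hypothetical formant values in Hz (illustration only, not measured data).
clear_speech = dict(f1_i=300, f2_i=2300, f1_a=750, f2_a=1300, f1_u=350, f2_u=900)
centralized = dict(f1_i=350, f2_i=2000, f1_a=650, f2_a=1350, f1_u=400, f2_u=1100)

print(triangular_vsa(**clear_speech), formant_centralization_ratio(**clear_speech))
print(triangular_vsa(**centralized), formant_centralization_ratio(**centralized))
```

    With these illustrative values the centralized vowel set yields a smaller VSA and a larger FCR, which is the pattern the two measures are designed to capture.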

    Computerized analysis of hypomimia and hypokinetic dysarthria for improved diagnosis of Parkinson's disease

    Background and Objective: An aging society requires easy-to-use approaches for the diagnosis and monitoring of neurodegenerative disorders, such as Parkinson's disease (PD), so that clinicians can effectively adjust treatment and improve patients' quality of life. Current methods of PD diagnosis and monitoring usually require patients to come to a hospital, where they undergo several neurological and neuropsychological examinations. These examinations are usually time-consuming, expensive, and performed just a few times per year. Hence, this study explores the possibility of fusing computerized analysis of hypomimia and hypokinetic dysarthria (two motor symptoms manifested in the majority of PD patients) with the goal of proposing a new methodology of PD diagnosis that could be easily integrated into mHealth systems. Methods: We enrolled 73 PD patients and 46 age- and gender-matched healthy controls, who performed several speech/voice tasks while being recorded by a microphone and a camera. Acoustic signals were parametrized in the fields of phonation, articulation and prosody. Video recordings of the face were analyzed in terms of facial landmark movement. Both modalities were subsequently modeled by the XGBoost algorithm. Results: The acoustic analysis enabled diagnosis of PD with 77% balanced accuracy, while the facial analysis reached 81% balanced accuracy. The fusion of both modalities increased the balanced accuracy to 83% (88% sensitivity and 78% specificity). The most informative speech exercise in the multimodal system turned out to be a tongue twister. Additionally, we identified muscle movements that are characteristic of hypomimia. Conclusions: The introduced methodology, which is based on a wide range of speech exercises and on both audio and video modalities, allows for the detection of PD with an accuracy of up to 83%. The tongue-twister speech exercise proved to be the most valuable from the clinical point of view. 
Additionally, the clinical interpretation of the created models is illustrated. The presented computer-supported methodology could serve as an extra tool for neurologists in PD detection, and the proposed mHealth solution could make life easier for both patients and doctors. Peer reviewed
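    The reported fusion result can be checked arithmetically: balanced accuracy is the mean of sensitivity and specificity, so 88% and 78% give 83%. The short sketch below reproduces that check. The function names are illustrative assumptions and are not taken from the paper.

```python
def rates_from_counts(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of patients correctly flagged
    specificity = tn / (tn + fp)  # share of controls correctly cleared
    return sensitivity, specificity


def balanced_accuracy(sensitivity, specificity):
    """Mean of the two class-wise recalls; robust to unequal group sizes."""
    return (sensitivity + specificity) / 2.0


# Values stated in the abstract for the fused audio + video model.
print(round(balanced_accuracy(0.88, 0.78), 2))
```

    Balanced accuracy is a sensible headline metric here because the two groups differ in size (73 patients versus 46 controls), so plain accuracy would be weighted toward the larger group.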

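    Hondatze kognitibo arinaren detekzio goiztiarrerako hizketa ezagutza automatikoan oinarrituriko ekarpenak [Contributions based on automatic speech recognition for the early detection of mild cognitive impairment]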

    302 pp. In patients with Alzheimer's disease, not only speech but also emotional response is impaired. Emotions are cognitive processes related to the architecture of the human mind; they are involved in decision-making, memory management and attention and, at the same time, in communication, to which all of these are closely tied. Emotional response and emotional management are therefore further elements of communication that are disturbed in the early stages of the disease and, like disfluency, emotional response can serve as an indicator for measuring cognitive decline. Consequently, disfluency and emotional response can be captured by analysing voice samples produced through a number of tasks. If language-independent parameters can be collected and the speech disorders characterised from them, the contribution can be helpful to the specialists who make the diagnosis. Since the raw material is voice samples, recordings were made through different tasks carried out in both clinical and home environments, and databases were compiled, taking into account the ethical criteria of the health centres. By studying these databases, the aim is to measure, quantify, validate and classify the progression of cognitive loss. The intention is to contribute to detecting the different stages of the disease, and to that end techniques and methodologies were developed for the automatic analysis of language-independent parameters. A non-linear multi-approach based on automatic speech analysis was carried out, which can provide a quantitative measure of the complexity of the time series used in speech analysis.

    Avaliação da transformada de Hilbert-Huang na detecção de desvios vocais [Evaluation of the Hilbert-Huang transform in the detection of vocal deviations]

    A voice disorder is identified as any difficulty or alteration in vocal emission that hinders the natural production of voice, so that the verbal and/or emotional message is not conveyed. Such disorders can negatively affect an individual's quality of life and may limit communication at work as well as in other social settings. Diagnosing a vocal alteration is a process that must combine several assessment and analysis techniques. Time-frequency signal analysis techniques are suited to studying biomedical signals such as voice, because these signals carry relevant content in both the time and the frequency domain. One method for analysing non-linear and non-stationary signals is the Hilbert-Huang transform. This study aimed to evaluate the applicability of the Hilbert-Huang transform to the detection of vocal deviations. Two case studies were considered for applying the Hilbert-Huang transform: 1) acoustic analysis of healthy and deviated voice signals (roughness, breathiness and strain); and 2) acoustic analysis of the degree of disorder. The database was provided by the Laboratório Integrado de Estudos da Voz of the Departamento de Fonoaudiologia, Universidade Federal da Paraíba, from which 116 voice signals were selected, from male and female speakers older than 18 and younger than 65. The processing steps were: feature extraction by means of the Hilbert-Huang transform; evaluation of discriminative potential through statistical analysis using hypothesis tests; and classification of the signals using a Multilayer Perceptron (MLP) classifier trained with the scaled conjugate gradient supervised learning algorithm. 
The performance of the classification system was evaluated, and the individual feature with the greatest discriminative potential was found to be the instantaneous amplitude weighted by the instantaneous frequency, extracted from the fifth intrinsic mode function, which discriminated normal from deviated voices with an accuracy of 92.67% ± 4.52%. When the features were combined, an accuracy of 100% was reached in discriminating normal voices from breathy voices and normal voices from rough voices. Given these results, the use of the Hilbert-Huang transform is suggested to professionals and researchers in the voice field, since it can support analysis, classification and clinical decision-making in the diagnosis and treatment of patients with voice disorders.
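    The best single feature in this abstract combines instantaneous amplitude and instantaneous frequency. The sketch below illustrates only the Hilbert step of the Hilbert-Huang transform on one already-extracted component. It does not perform empirical mode decomposition. The weighting used here (the mean of amplitude times frequency) is an assumed reading of the feature, not the thesis's stated formula, and all function names are hypothetical. A naive DFT is used to keep the example free of dependencies, so it is only practical for short signals.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal via a naive DFT: keep DC/Nyquist, double positive bins."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    half = n // 2
    weights = [0.0] * n           # negative-frequency bins stay at zero
    weights[0] = 1.0              # DC bin kept once
    if n % 2 == 0:
        weights[half] = 1.0       # Nyquist bin kept once (even length only)
        upper = half
    else:
        upper = half + 1
    for k in range(1, upper):
        weights[k] = 2.0          # positive-frequency bins doubled
    return [
        sum(weights[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]


def amplitude_weighted_frequency(x, fs):
    """Mean of instantaneous amplitude times instantaneous frequency (assumed form)."""
    z = analytic_signal(x)
    amplitude = [abs(v) for v in z]
    # Phase increment between neighbouring samples, converted to Hz.
    frequency = [
        cmath.phase(z[t + 1] * z[t].conjugate()) * fs / (2 * math.pi)
        for t in range(len(z) - 1)
    ]
    return sum(a * f for a, f in zip(amplitude, frequency)) / len(frequency)


# Synthetic stand-in for an intrinsic mode function: a unit-amplitude 5 Hz cosine.
fs = 64.0
component = [math.cos(2 * math.pi * 5 * t / fs) for t in range(64)]
print(amplitude_weighted_frequency(component, fs))
```

    For a pure unit-amplitude tone the value reduces to the tone's frequency, which gives a quick sanity check before applying the function to real components.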

    Unveiling the impact of neuromotor disorders on speech: a structured approach combining biomechanical fundamentals and statistical machine learning

    Speech has been shown to convey clinically useful information in the study of Neurodegenerative Disorders (NDs), such as Parkinson’s Disease (PD). Traditionally the use of speech as an exploratory tool in People with Parkinson’s (PwP) has focused on the estimation of acoustic characteristics and their study at face value, analysing the physio-acoustical markers and using them as features for the differentiation between Healthy Controls (HC) and PwP. The present work takes a step further, given the intricate interoperation between neuromotor activity, responsible for both planning and driving the system, and the production of the acoustic speech signal; by the study of speech, this relationship may be properly exploited and analysed, providing a non-invasive method for the diagnosis, analysis, and observation of NDs. This work aims to introduce a working model that is capable of linking both domains and serves as a projection tool to provide insights about a speaker’s neuromotor state. This is based on a review of the neurophysiological background of the structure and function of the nervous system, and a review of the main nervous system dysfunctions involved in PD and other related neuromotor disorders. The role of the respiratory, phonatory, and articulatory systems is reviewed in the production of voice and speech under normal and pathological circumstances. This setting might allow for speech to be considered a useful trait within the precision medicine framework, as it provides a personal biometric marker that is innate and easy to elicit, can be recorded remotely with inexpensive equipment, is non-invasive, cost-effective, and easy to process. 
The problem can be divided into two main categories: firstly, a binary detection task distinguishing between healthy controls and individuals with NDs based on the projection model and phonatory estimates; secondly, a progression and tracking task providing a set of quantitative indices that enable clinically interpretable scores. This study aims to define a set of features and models that help to characterise hypokinetic dysarthria (HD). These incorporate neuroscientific know-how semantically and quantitatively for use in clinical decision support tools that provide mechanistic insight into the processes involved in speech production, adding neuromotor considerations to the algorithmic element that improve interpretability and consequently lead to better clinical decisions and diagnosis. An overview is provided of the acoustic signal processing algorithms used in speech articulation and phonation system inversion for neuromotor disorder assessment. An algorithmic methodology for model inversion and exploration is proposed for the functional characterization and system inversion of each subsystem involved, under the neuro-biomechanical foundations set out earlier. A description of vocal fold biomechanics using the glottal source, together with formant dynamics, provides the basis for a specific mapping to articulation kinematics. The statistical methods used in performance evaluation are based on three-way comparisons and on transversal and longitudinal assessment by classical hypothesis testing. Three related experimental studies are presented to illustrate empirically the potential of phonation and articulation analysis: the characterization of PD from glottal biomechanics based on the amplitude distributions of the glottal flow; the use of vocal fold body stiffness in assessing the efficiency of transcranial magnetic stimulation; and the description of PD dysarthria through an articulation projection model. 
The results from the biomechanical analysis of phonation showed that the glottal source amplitude distributions of PD participants and healthy controls, compared using three-way comparisons and hierarchical clustering, were clearly distinguishable from those of normative young participants, with SVM classifiers producing the best accuracy scores of 94.8% (males) and 92.2% (females). Nevertheless, PD participants were barely separable from age-matched controls, possibly pointing to confounding factors due to age. The use of vocal fold stiffness in assessing the efficiency of transcranial magnetic stimulation gave mixed results: some PD participants showed clear improvements in phonation stability after stimulation, whereas others did not. Some sham controls also experienced minor improvements of unknown origin, possibly reflecting a placebo effect. The overall results on the efficiency of stimulation showed a global accuracy score of 67% over the 18 cases studied. The results from articulation projection modelling showed that personalised models can be formulated for PD and control participants to transform acoustic formant dynamics into articulation kinematics. This might open the possibility of characterising PD dysarthria from speech audio recordings. 
The most remarkable findings of the study include the determination of the glottal source amplitude distribution behaviour of normative and PD participants; the impact of age effects in phonation as a confounding factor in neuromotor disorder characterization; the importance of ensuring that the classification of speech dysarthria is based on principles that can be explained and interpreted; the need to take the effects of medication into account when framing new classification experiments; the potential of using EEG-band decomposition to analyse vocal fold stiffness correlates, as well as the possibility of using these descriptions in longitudinal monitoring of treatment efficiency; the feasibility of establishing a relationship between acoustic and kinematic variables by projection model inversion; and the potential of these descriptions for estimating neuromotor activities in the midbrain related to phonation and articulation. The most important outcome of the thesis is that the methodology used throughout the project follows a bottom-up approach based on speech model inversion at the acoustical, biomechanical, and neuromotor levels, which makes it possible to estimate glottal signals, biomechanical correlates, and neuromotor activity from speech alone, establishing a common neuromechanical characterisation framework of its own.

    Robust and Complex Approach of Pathological Speech Signal Analysis

    No full text
    This article presents a study of state-of-the-art approaches in the field of pathological speech signal analysis, with a special focus on parametrization techniques. It describes 92 speech features, some of which are already widely used in this field and some of which have not yet been tried (they come from other areas of speech signal processing, such as speech recognition or coding). As an original contribution, this work introduces 36 completely new pathological voice measures based on modulation spectra, inferior colliculus coefficients, bicepstrum, sample and approximate entropy, and empirical mode decomposition. The significance of these features was tested on 3 (English, Spanish and Czech) pathological voice databases with respect to classification accuracy, sensitivity and specificity. To the best of our knowledge, the introduced approach, based on complex feature extraction and robust testing, outperformed all works already published in this field. The results (accuracy, sensitivity and specificity equal to 100.0 ± 0.0%) are debatable in the case of the Massachusetts Eye and Ear Infirmary (MEEI) database because of its limitation related to the length of sustained vowels; however, in the case of the Príncipe de Asturias (PdA) Hospital in Alcalá de Henares of Madrid database, we improved classification accuracy (82.1 ± 3.3%) and specificity (83.8 ± 5.1%) when considering a single-classifier approach. Hopefully, large improvements may also be achieved in the case of the Czech Parkinsonian Speech Database (PARCZ), which are discussed in this work as well. All the features introduced in this work were identified by the Mann-Whitney U test as significant (p < 0.05) when processing at least one of the mentioned databases. 
Among the proposed features, the largest discriminative power belongs to the cepstral peak prominence extracted from the first intrinsic mode function (p = 6.9443 × 10^-32), which means that, among all newly designed features, those that quantify hoarseness or breathiness in particular are good candidates for pathological speech identification. The article also mentions some ideas for future work in the field of pathological speech signal analysis that can be valuable especially from the clinical point of view.
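    The Mann-Whitney U test used for feature screening in this abstract can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the U statistic is computed by direct pair counting, and the two-sided p-value uses the large-sample normal approximation without tie or continuity correction, which is an assumption that suits larger groups better than the tiny ones shown. The feature values are made up for illustration only.

```python
import math


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs with a > b count 1, ties count 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def two_sided_p_normal(u, n1, n2):
    """Two-sided p-value from the normal approximation (no tie correction)."""
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2.0))


# Made-up feature values for two groups (illustration only, not study data).
pathological = [1.2, 1.5, 1.9, 2.4, 2.8]
healthy = [0.4, 0.7, 0.9, 1.1, 1.3]

u = mann_whitney_u(pathological, healthy)
p = two_sided_p_normal(u, len(pathological), len(healthy))
print(u, p, p < 0.05)
```

    A feature would pass the screening described above when its p-value falls below 0.05 on at least one database. For real data with small groups or many ties, an exact or tie-corrected test is the safer choice.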