116 research outputs found

    Neurological Disease Detection and Monitoring from Voice Production

    The dramatic impact of neurodegenerative pathologies on quality of life is a growing concern. It is well known that many neurological diseases leave a fingerprint in voice and speech production. Many techniques have been designed for the detection, diagnosis, and monitoring of neurological disease, but most are costly or difficult to extend to primary care services. This paper shows how some neurological diseases can be traced at the level of phonation. The detection procedure would be based on a simple voice test. The availability of advanced tools and methodologies to monitor the organic pathology of voice would facilitate the deployment of these tests. The paper hypothesizes that some of the underlying mechanisms affecting voice production produce measurable correlates in vocal fold biomechanics. The methodological foundations of a voice analysis system that can estimate correlates of neurological disease are described in general terms. Some case studies are presented to illustrate the potential of the methodology for monitoring neurological diseases by voice

    Estimating tremor in Vocal Fold Biomechanics for Neurological Disease characterisation

    Neurological Diseases (ND) affect larger segments of the aging population every year. Treatment depends on expensive, accurate, and frequent monitoring. It is well known that ND leave correlates in speech and phonation. The present work shows a method to detect alterations in vocal fold tension during phonation. These may appear either as hypertension or as cyclical tremor. Tremor can be estimated by auto-regressive modeling of the vocal fold tension series in sustained phonation. The correlates obtained are a set of cyclicality coefficients, the frequency, and the root mean square amplitude of the tremor. Statistical distributions of these correlates obtained from a set of male and female subjects are presented. Results from five case studies of female voice are also given
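
The auto-regressive tremor estimate mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give the model order or the fitting procedure, so the second-order model, the least-squares fit, and the names `estimate_tremor`, `tension`, and `fs` are all assumptions made for illustration. The sketch reads a tremor frequency from the angle of the AR(2) poles and reports the root mean square amplitude of the mean-removed series.

```python
import math


def estimate_tremor(tension, fs):
    """Hypothetical helper: AR(2) tremor estimate from a tension series.

    tension: vocal fold tension samples (any consistent unit)
    fs: sampling rate of the tension series in Hz
    """
    n = len(tension)
    mean = sum(tension) / n
    x = [v - mean for v in tension]  # remove the static tension level

    # Least-squares normal equations for x[k] = a1*x[k-1] + a2*x[k-2] + e[k]
    c11 = sum(x[k - 1] * x[k - 1] for k in range(2, n))
    c12 = sum(x[k - 1] * x[k - 2] for k in range(2, n))
    c22 = sum(x[k - 2] * x[k - 2] for k in range(2, n))
    b1 = sum(x[k] * x[k - 1] for k in range(2, n))
    b2 = sum(x[k] * x[k - 2] for k in range(2, n))
    det = c11 * c22 - c12 * c12
    if det == 0:
        raise ValueError("series is too short or degenerate for an AR(2) fit")
    a1 = (b1 * c22 - b2 * c12) / det
    a2 = (c11 * b2 - c12 * b1) / det

    # Complex-conjugate poles (a1^2 + 4*a2 < 0) indicate an oscillation;
    # the pole angle gives its frequency.
    frequency_hz = None
    if a1 * a1 + 4.0 * a2 < 0:
        theta = math.acos(a1 / (2.0 * math.sqrt(-a2)))
        frequency_hz = theta * fs / (2.0 * math.pi)

    rms_amplitude = math.sqrt(sum(v * v for v in x) / n)
    return {"a1": a1, "a2": a2,
            "frequency_hz": frequency_hz,
            "rms_amplitude": rms_amplitude}
```

For a series with no oscillatory component the poles are real and `frequency_hz` is returned as `None`. A real tension series would likely need a higher model order and some pre-filtering, neither of which is shown here.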

    Unveiling the impact of neuromotor disorders on speech: a structured approach combining biomechanical fundamentals and statistical machine learning

    Speech has been shown to convey clinically useful information in the study of Neurodegenerative Disorders (NDs), such as Parkinson’s Disease (PD). Traditionally the use of speech as an exploratory tool in People with Parkinson’s (PwP) has focused on the estimation of acoustic characteristics and their study at face value, analysing the physio-acoustical markers and using them as features for the differentiation between Healthy Controls (HC) and PwP. The present work takes a step further, given the intricate interoperation between neuromotor activity, responsible for both planning and driving the system, and the production of the acoustic speech signal; by the study of speech, this relationship may be properly exploited and analysed, providing a non-invasive method for the diagnosis, analysis, and observation of NDs. This work aims to introduce a working model that is capable of linking both domains and serves as a projection tool to provide insights about a speaker’s neuromotor state. This is based on a review of the neurophysiological background of the structure and function of the nervous system, and a review of the main nervous system dysfunctions involved in PD and other related neuromotor disorders. The role of the respiratory, phonatory, and articulatory systems is reviewed in the production of voice and speech under normal and pathological circumstances. This setting might allow for speech to be considered a useful trait within the precision medicine framework, as it provides a personal biometric marker that is innate and easy to elicit, can be recorded remotely with inexpensive equipment, is non-invasive, cost-effective, and easy to process. 
The problem can be divided into two main tasks: first, a binary detection task distinguishing between healthy controls and individuals with NDs based on the projection model and phonatory estimates; second, a progression and tracking task providing a set of quantitative indices that enable clinically interpretable scores. This study aims to define a set of features and models that help to characterise hypokinetic dysarthria (HD). These incorporate neuroscientific know-how semantically and quantitatively for use in clinical decision support tools that provide mechanistic insight into the processes involved in speech production. Building neuromotor considerations into the algorithms improves interpretability and, consequently, clinical decisions and diagnosis. An overview is provided of the acoustic signal processing algorithms used in speech articulation and phonation system inversion for neuromotor disorder assessment. An algorithmic methodology for model inversion and exploration is proposed for the functional characterization and system inversion of each subsystem involved, under the neuro-biomechanical foundations set out earlier. A description of vocal fold biomechanics using the glottal source, together with formant dynamics, provides the basis for a specific mapping to articulation kinematics. The statistical methods used in performance evaluation are based on three-way comparisons and on transversal and longitudinal assessment by classical hypothesis testing. Three related experimental studies empirically illustrate the potential of phonation and articulation analysis: the characterization of PD from glottal biomechanics based on the amplitude distributions of the glottal flow; the use of vocal fold body stiffness in assessing the efficiency of transcranial magnetic stimulation; and the description of PD dysarthria through an articulation projection model.
The biomechanical analysis of phonation showed that, using three-way comparisons and hierarchical clustering, glottal source amplitude distributions from PD participants and healthy controls were essentially distinguishable from those of normative young participants; the best accuracy scores, produced by SVM classifiers, were 94.8% (males) and 92.2% (females). Nevertheless, PD participants were barely separable from age-matched controls, possibly pointing to confounding factors due to age. Using vocal fold stiffness to assess the efficiency of transcranial magnetic stimulation gave mixed results: some PD participants showed clear improvements in phonation stability after stimulation, whereas others did not. Some sham controls also experienced minor improvements of unknown origin, possibly reflecting a placebo effect. The overall results on the efficiency of stimulation showed a global accuracy score of 67% over the 18 cases studied. The results from articulation projection modelling showed that personalised models can be formulated for PD and control participants to transform acoustic formant dynamics into articulation kinematics. This might open the possibility of characterising PD dysarthria from speech audio recordings.
The most remarkable findings of the study include the determination of the glottal source amplitude distribution behaviour of normative and PD participants; the impact of age effects in phonation as a confounding factor in neuromotor disorder characterization; the importance of ensuring that the classification of speech dysarthria is based on principles that can be explained and interpreted; the need to take the effects of medication into account when framing new classification experiments; the potential of using EEG-band decomposition to analyse vocal fold stiffness correlates, as well as the possibility of using these descriptions in longitudinal monitoring of treatment efficiency; the feasibility of establishing a relationship between acoustic and kinematic variables by projection model inversion; and the potential of these descriptions for estimating midbrain neuromotor activity related to phonation and articulation. The most important outcome of the thesis is that the methodology used throughout the project follows a bottom-up approach based on speech model inversion at the acoustical, biomechanical, and neuromotor levels, which makes it possible to estimate glottal signals, biomechanical correlates, and neuromotor activity from speech alone, establishing a common neuromechanical characterisation framework on its own

    PERIORAL BIOMECHANICS, KINEMATICS, AND ELECTROPHYSIOLOGY IN PARKINSON'S DISEASE

    This investigation quantitatively characterized the orofacial biomechanics, labial kinematics, and associated electromyography (EMG) patterns in individuals with Parkinson's disease (PD) as a function of anti-PD medication state. Passive perioral stiffness, a clinical correlate of rigidity, was sampled using a face-referenced OroSTIFF system in 10 individuals diagnosed with mild PD and 10 age- and sex-matched elderly controls. Labial movement amplitudes and velocities were evaluated using a 4-dimensional computerized motion capture system. Associated perioral EMG patterns were sampled to examine the characteristics of perioral muscles and compensatory muscular activation patterns during repetitive syllable productions. This study identified several trends that reflect differences in the perioral system between PD and control subjects: 1. The presence of high tonic EMG patterns after administration of dopaminergic treatment indicated an up-regulation of the central mechanism, which may serve to regulate orofacial postural control. 2. Multilevel regression modeling showed greater perioral stiffness in PD subjects, confirming the clinical correlate of rigidity in these patients. 3. Similar to the clinical symptoms in the upper and lower limbs, a reduction of range of motion (hypokinesia) and velocity (bradykinesia) was evident in the PD orofacial system. Administration of dopaminergic treatment improved hypokinesia and bradykinesia. 4. A significant correlation was found between perioral stiffness and the range of labial movement, indicating that these two symptoms may result in part from a common neural substrate. 5. As speech rate increased, PD speakers down-scaled movement amplitude and velocity compared to the control subjects, reflecting a compensatory mechanism to maintain target speech rates. 6. 
EMG from orbicularis oris inferior (OOIm) and depressor labii inferioris (DLIm) muscles revealed a limited range of muscle activation level in PD speakers, reflecting the underlying changes in motor unit firing behavior due to basal ganglia dysfunction. The results of this investigation provided a quantitative description of the perioral stiffness, labial kinematics, and EMG patterns in PD speakers. These findings indicate that perioral stiffness may provide clinicians a quantitative biomechanical correlate to medication response, movement aberrations, and EMG compensatory patterns in PD. The utilization of these objective assessments will be helpful in diagnosing, assessing, and monitoring the progression of PD to examine the efficacy of pharmacological, neurosurgical, and behavioral interventions

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as both oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from a strongly felt need to share know-how, objectives, and results between areas that until then had seemed quite distinct, such as bioengineering, medicine, and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the newborn to the adult and elderly. Over the years the initial topics have grown and spread to other fields of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years in Firenze, Italy. This edition celebrates twenty-two years of uninterrupted and successful research in the field of voice analysis

    AUTOMATIC IDENTIFICATION OF DYSPHONIAS USING MACHINE LEARNING ALGORITHMS

    Dysphonia is a prevalent symptom of some respiratory diseases that affects voice quality, even for prolonged periods. To diagnose it, speech-language pathologists use different acoustic parameters to evaluate patients objectively and determine the type of dysphonia that affects them, such as hyperfunctional or hypofunctional dysphonia; this matters because each type requires a different treatment. In the field of artificial intelligence, this problem has been addressed by using acoustic parameters as input data to train machine learning and deep learning models. However, these models usually aim only to identify whether a patient is ill, making binary classifications between healthy voices and voices with dysphonia, but not between types of dysphonia. In this paper, harmonic-to-noise ratio, cepstral peak prominence-smoothed, zero crossing rate, and the means of the Mel frequency cepstral coefficients (2-19) are used for multiclass classification of voices with euphony, hyperfunction, and hypofunction by means of six machine learning algorithms: Random Forest, K Nearest Neighbors, Logistic Regression, Decision Trees, Support Vector Machines, and Naive Bayes. The .632 bootstrap was used to evaluate which of them performs best at identifying the three voice classes. The best confidence interval in terms of accuracy, ranging from 87% to 92%, was obtained by the K Nearest Neighbors model. The results can be implemented in the development of a complementary application for clinical diagnosis or patient monitoring under the supervision of a specialist
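
The evaluation scheme named in this abstract, a .632 bootstrap around a K Nearest Neighbors classifier, can be sketched as follows. This is an illustration under stated assumptions, not the paper's pipeline: the function names are hypothetical, the feature vectors are arbitrary numeric tuples rather than the acoustic parameters listed above, and the sketch uses a simplified variant of the estimator (resubstitution error on the full data set, out-of-bag error averaged over rounds) that returns a point estimate rather than a confidence interval.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, point, k=3):
    """Predict the label of `point` by majority vote among its k nearest neighbours."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], point)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def error_rate(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test points that the k-NN classifier labels incorrectly."""
    wrong = sum(
        knn_predict(train_X, train_y, p, k) != label
        for p, label in zip(test_X, test_y)
    )
    return wrong / len(test_X)


def bootstrap_632_accuracy(X, y, rounds=50, k=3, seed=0):
    """Simplified .632 bootstrap accuracy estimate (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(X)
    # Optimistic error: train and test on the same full data set.
    resub_err = error_rate(X, y, X, y, k)
    oob_errs = []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]  # out-of-bag samples
        if not oob:
            continue
        oob_errs.append(
            error_rate(
                [X[i] for i in idx], [y[i] for i in idx],
                [X[i] for i in oob], [y[i] for i in oob], k,
            )
        )
    # Pessimistic error: average over rounds of the out-of-bag error.
    oob_err = sum(oob_errs) / len(oob_errs)
    return 1.0 - (0.368 * resub_err + 0.632 * oob_err)
```

The 0.368 and 0.632 weights blend the optimistic resubstitution error with the pessimistic out-of-bag error. A confidence interval like the one reported in the abstract would need the distribution of per-round scores, which this sketch does not compute.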

    Comparing Lab-based and Telephone-based Speech Recordings Towards Parkinson's Assessment: Insights from Acoustic Analysis
