
    Direct subthalamic nucleus stimulation influences speech and voice quality in Parkinson's disease patients

    BACKGROUND: Deep brain stimulation (DBS) of the subthalamic nucleus (STN) considerably ameliorates cardinal motor symptoms in Parkinson's disease (PD). Reported STN-DBS effects on secondary dysarthric (speech) and dysphonic (voice) symptoms, which originate from vocal tract motor dysfunctions, are however inconsistent, with rather deleterious outcomes based on post-surgical assessments. OBJECTIVE: To parametrically and intra-operatively investigate the effects of DBS on perceptual and acoustic speech and voice quality in PD patients. METHODS: We assessed instantaneous intra-operative speech and voice quality changes in PD patients (n = 38) elicited by direct STN stimulation while varying central stimulation features (depth, laterality, and intensity), separately for each hemisphere. RESULTS: First, perceptual assessments across several raters revealed that certain speech and voice symptoms could be improved with STN-DBS, but this improvement seems largely restricted to right STN-DBS. Second, computer-based acoustic analyses of speech and voice features revealed that both left and right STN-DBS could improve dysarthric speech symptoms, but only right STN-DBS could considerably improve dysphonic symptoms, with left STN-DBS affecting only voice intensity features. Third, several subareas defined by stimulation depth and laterality could be identified in the motoric STN proper and close to the associative STN with optimal (and partly suboptimal) stimulation outcomes. Fourth, low-to-medium stimulation intensities showed the most favorable and balanced effects compared to high intensities. CONCLUSIONS: STN-DBS can considerably improve both speech and voice quality when the stimulation regimen is carefully arranged along central stimulation features.

    Differential specificity of acoustic measures to listener perception of voice quality

    The purpose of this project was to differentially examine the specificity of two acoustic measures, relative fundamental frequency (RFF) and the cepstral/spectral index of dysphonia (CSID), to listener perceptions of voice quality across four dimensions: breathiness, roughness, strain/vocal effort, and overall severity. An auditory perceptual experiment was conducted to estimate listener perception of these dimensions. Pearson's correlation coefficients between each acoustic measure (RFF and CSID) and the perceptual ratings of voice quality were calculated to assess how these measures relate to the current "gold standard" of listener perception. The hypothesis was that RFF would have a strong negative correlation with listener perception of strain/vocal effort, and that CSID would have a strong positive correlation with listener perception of overall severity and breathiness. Unexpectedly, listeners' ratings of the four voice qualities were highly correlated with one another. These poorly differentiated perceptual ratings undermine both the validity and the reliability of the results. Consequently, the correlations between RFF, CSID, and distinct qualities of listener perception are uninterpretable. Methodological considerations and future directions are reported.
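The inter-dimension check described in this abstract can be illustrated with a small sketch. It computes Pearson's r between every pair of perceptual rating dimensions; high off-diagonal values would indicate the poorly differentiated ratings the abstract reports. The rating values and the names `pearson_r` and `correlation_matrix` are hypothetical and are not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson r for a dict that maps a name to a list of values."""
    names = list(columns)
    return {(a, b): pearson_r(columns[a], columns[b])
            for a in names for b in names}


# Hypothetical listener ratings for five voice samples (illustrative only).
ratings = {
    "breathiness": [10, 25, 40, 55, 80],
    "roughness": [12, 22, 45, 50, 78],
    "strain": [15, 20, 42, 58, 75],
}

if __name__ == "__main__":
    for (a, b), r in correlation_matrix(ratings).items():
        if a < b:  # print each unordered pair once
            print(f"{a} vs {b}: r = {r:.3f}")
```

If every pair of dimensions correlates strongly, a correlation between an acoustic measure and one dimension cannot be attributed to that dimension specifically, which is the interpretive problem the abstract describes.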

    Do We Get What We Need from Clinical Acoustic Voice Measurements?

    Instrumental acoustic measurements of the human voice have enormous potential to objectively describe pathology and, thereby, to assist clinical treatment decisions. Despite the increasing application and accessibility of technical knowledge and equipment, recent research has highlighted a lack of understanding of physiologic, speech/language-, and culture-related influencing factors. This article presents a critical review of the current state of the art in the clinical application of instrumental acoustic voice quality measurements and points out future directions for improving its applications and dissemination in less privileged populations. The main barriers to this research relate to (a) standardization and reporting of acoustic analysis techniques; (b) understanding of the relation between perceptual and instrumental acoustic results; (c) the necessity to account for natural speech-related covariables, such as differences in speaking voice sound pressure level (SPL) and fundamental frequency (f0); (d) the need for a much larger database to understand normal variability within and between voice-disordered and vocally healthy individuals related to age, training, and physiologic factors; and (e) affordable equipment, including mobile communication devices, accessible in various settings. This calls for further research into technical developments and optimal assessment procedures for pathology-specific patient groups.

    Optimization and automation of relative fundamental frequency for objective assessment of vocal hyperfunction

    The project objective is to improve clinical assessment and diagnosis of the voice disorder vocal hyperfunction (VH). VH is a condition characterized by excessive laryngeal and paralaryngeal tension, and is assumed to be the underlying cause of the majority of voice disorders. Current clinical assessment of VH is subjective and demonstrates poor inter-rater reliability. Recent work indicates that a new acoustic measure, relative fundamental frequency (RFF), is sensitive to the maladaptive functional behaviors associated with VH and can potentially be used to objectively characterize VH. Here, we explored and enhanced the potential of RFF as a measure of VH in three ways. First, the current protocol for RFF estimation was optimized to simplify the recording procedure and reduce estimation time. Second, RFF was compared with the current state-of-the-art measures of VH: listener perception of vocal effort and the aerodynamic ratio of sound pressure level to subglottal pressure level. Third, an automated algorithm that used the optimized recording protocol was developed and validated against manual estimation methods and listener perception. This work enables large-scale studies on RFF to determine the specific physiological elements that contribute to the measure's ability to capture VH and may provide a non-invasive and readily implemented solution for this long-standing clinical issue.
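The abstract does not spell out how RFF is computed. As a rough sketch, and assuming the description common in the RFF literature (the f0 of individual voicing cycles next to a voiceless consonant, expressed in semitones relative to a steady-state reference cycle), the core conversion might look like the following. The function names, the choice of reference index, and the f0 values are illustrative assumptions, not the project's algorithm.

```python
import math


def to_semitones(f0_hz, ref_hz):
    """Express f0_hz in semitones relative to ref_hz (12 semitones per octave)."""
    if f0_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_hz / ref_hz)


def relative_f0(cycle_f0s_hz, ref_index):
    """Semitone value of each cycle relative to the cycle at ref_index.

    Assumption: the caller has already extracted per-cycle f0 values and
    knows which cycle lies in the steady-state part of the vowel.
    """
    ref = cycle_f0s_hz[ref_index]
    return [to_semitones(f, ref) for f in cycle_f0s_hz]


if __name__ == "__main__":
    # Hypothetical f0 values (Hz) for voicing-offset cycles; the first cycle
    # is treated as the steady-state reference in this example.
    offset_cycles = [200.0, 199.0, 198.5, 197.0, 195.0]
    for i, st in enumerate(relative_f0(offset_cycles, ref_index=0), start=1):
        print(f"cycle {i}: {st:+.2f} ST")
```

Normalizing to semitones makes values comparable across speakers with different habitual pitch, which is one reason a relative measure is attractive for clinical use.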

    Hemodynamics Study Based on Near-Infrared Optical Assessment


    Multiple voice disorders in the same individual: Investigating handcrafted features, multi-label classification algorithms, and base-learners

    Non-invasive acoustic analyses of voice disorders have been at the forefront of current biomedical research. Usual strategies, essentially based on machine learning (ML) algorithms, commonly classify a subject as being either healthy or pathologically affected. Nevertheless, the latter state is not always the result of a single laryngeal issue, i.e., multiple disorders might exist, demanding multi-label classification procedures for effective diagnoses. Consequently, the objective of this paper is to investigate five multi-label classification methods based on problem transformation, i.e., Label Powerset, Binary Relevance, Nested Stacking, Classifier Chains, and Dependent Binary Relevance, with Random Forest (RF) and Support Vector Machine (SVM) as base-learners, in addition to a Deep Neural Network (DNN) from an algorithm adaptation method, to detect multiple voice disorders, i.e., Dysphonia, Laryngitis, Reinke's Edema, Vox Senilis, and Central Laryngeal Motion Disorder. Using three handcrafted features as input, i.e., signal energy (SE), zero-crossing rates (ZCRs), and signal entropy (SH), which allow for interpretable descriptors in terms of speech analysis, production, and perception, we observed that the DNN-based approach with SE-based feature vectors achieved the best F1-score among the tested methods, i.e., 0.943, averaged over all the balancing scenarios, on the Saarbrücken Voice Database (SVD) with a 20% balancing rate using the Synthetic Minority Over-sampling Technique (SMOTE). Finally, our finding that laryngitis produced the most false negatives may explain why its detection is a serious issue in speech technology. The results we report provide an original contribution, allowing for the consistent detection of multiple speech pathologies and advancing the state of the art in the field of handcrafted acoustic-based non-invasive diagnosis of voice disorders.
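Two of the problem-transformation methods named in this abstract, Binary Relevance and Label Powerset, differ only in how they recast multi-label targets for an ordinary classifier. The sketch below shows that recasting step alone, with no classifier attached. The sample label sets are hypothetical, and representing a healthy subject as an empty set is an assumption made for illustration.

```python
def binary_relevance_targets(label_sets, labels):
    """Binary Relevance: one independent 0/1 target vector per label."""
    return {lab: [1 if lab in s else 0 for s in label_sets] for lab in labels}


def label_powerset_targets(label_sets):
    """Label Powerset: each distinct label combination becomes one class id."""
    class_ids = {}
    targets = []
    for s in label_sets:
        key = frozenset(s)
        if key not in class_ids:
            class_ids[key] = len(class_ids)  # ids assigned in order of first appearance
        targets.append(class_ids[key])
    return targets, class_ids


# Hypothetical diagnoses for five recordings (illustrative only).
samples = [
    {"dysphonia"},
    {"dysphonia", "laryngitis"},
    set(),  # assumed encoding of a healthy subject
    {"laryngitis"},
    {"dysphonia", "laryngitis"},
]

if __name__ == "__main__":
    print(binary_relevance_targets(samples, ["dysphonia", "laryngitis"]))
    print(label_powerset_targets(samples)[0])
```

Binary Relevance trains one base-learner per label and ignores co-occurrence, while Label Powerset trains a single multi-class learner whose number of classes grows with the number of observed label combinations.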

    Using dysphonic voice to characterize speaker's biometry

    Phonation distortion leaves relevant marks in a speaker's biometric profile. Dysphonic voice production may be used for biometrical speaker characterization. In the present paper, phonation features derived from the glottal source (GS) parameterization, after vocal tract inversion, are proposed for dysphonic voice characterization in Speaker Verification tasks. The glottal source derived parameters are matched in a forensic evaluation framework defining a distance-based metric specification. The phonation segments used in the study are derived from fillers, long vowels, and other phonation segments produced in spontaneous telephone conversations. Phonated segments from a telephonic database of 100 male Spanish native speakers are combined in a 10-fold cross-validation task to produce the set of quality measurements outlined in the paper. Shimmer, mucosal wave correlate, vocal fold cover biomechanical parameter unbalance, and a subset of the GS cepstral profile produce accuracy rates as high as 99.57% for a wide threshold interval (62.08-75.04%). An Equal Error Rate of 0.64% can be attained. The proposed metric framework is shown to behave more fairly than classical likelihood ratios in supporting the hypothesis of the defense vs. that of the prosecution, thus offering a more reliable evaluation scoring. Possible applications are Speaker Verification and Dysphonic Voice Grading.
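The Equal Error Rate quoted in this abstract is the operating point at which the false acceptance rate (FAR) equals the false rejection rate (FRR). The sketch below approximates it by sweeping the observed scores as candidate thresholds. It assumes similarity scores where higher means "more likely the same speaker"; the paper uses a distance-based metric, for which the comparison direction would be reversed. The scores and function names are hypothetical.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR) when trials with score >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate EER over the observed scores.

    Returns (eer, threshold), where the threshold minimizes |FAR - FRR|
    and eer is the mean of FAR and FRR at that threshold.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical same-speaker and different-speaker trial scores.
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.6, 0.3, 0.2, 0.1]
    eer, thr = equal_error_rate(genuine_scores, impostor_scores)
    print(f"EER ~ {eer:.2%} at threshold {thr}")
```

With few trials the FAR and FRR curves are step functions and may never cross exactly, which is why the sketch reports the midpoint at the closest threshold rather than an interpolated value.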

    A systematic review and narrative analysis of digital speech biomarkers in Motor Neuron Disease

    Motor Neuron Disease (MND) is a progressive and largely fatal neurodegenerative disorder with a lifetime risk of approximately 1 in 300. At diagnosis, up to 25% of people with MND (pwMND) exhibit bulbar dysfunction. Currently, pwMND are assessed using clinical examination and diagnostic tools including the ALS Functional Rating Scale Revised (ALS-FRS(R)), a clinician-administered questionnaire with a single item on speech intelligibility. Here we report on the use of digital technologies to assess speech features as a marker of disease diagnosis and progression in pwMND. Google Scholar, PubMed, Medline and EMBASE were systematically searched. Forty studies were evaluated, including 3670 participants, 1878 of whom had a diagnosis of MND. To record speech samples, 24 studies used microphones, 5 used smartphones, 6 used apps, 2 used tape recorders and 1 used the Multi-Dimensional Voice Programme (MDVP). Data extraction and analysis methods varied but included traditional statistical analysis, CSpeech, MATLAB and machine learning (ML) algorithms. Speech features assessed also varied and included jitter, shimmer, fundamental frequency, intelligible speaking rate, pause duration and syllable repetition. Findings from this systematic review indicate that digital speech biomarkers can distinguish pwMND from healthy controls and can help identify bulbar involvement in pwMND. Preliminary evidence suggests digitally assessed acoustic features can identify more nuanced changes in those affected by voice dysfunction. No single digital speech biomarker is consistently able to diagnose or prognosticate MND. Further longitudinal studies involving larger samples are required to validate the use of these technologies as diagnostic tools or prognostic biomarkers.
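Jitter and shimmer, two of the speech features listed in this abstract, are cycle-to-cycle perturbation measures of glottal period and amplitude respectively. A minimal sketch of the common "local" variants follows, assuming the periods and peak amplitudes have already been extracted from a sustained vowel. The input values are hypothetical, and the reviewed studies may have used other variants.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def jitter_local(periods_s):
    """Local jitter: relative cycle-to-cycle variation of the glottal period."""
    return local_perturbation(periods_s)


def shimmer_local(peak_amplitudes):
    """Local shimmer: relative cycle-to-cycle variation of the peak amplitude."""
    return local_perturbation(peak_amplitudes)


if __name__ == "__main__":
    # Hypothetical per-cycle measurements (illustrative only).
    periods = [0.0100, 0.0101, 0.0099, 0.0100, 0.0102]  # seconds
    amplitudes = [0.80, 0.78, 0.82, 0.79, 0.81]  # arbitrary linear units
    print(f"jitter (local):  {jitter_local(periods):.2%}")
    print(f"shimmer (local): {shimmer_local(amplitudes):.2%}")
```

Both measures depend on accurate cycle boundary detection, so they are usually reported for sustained vowels rather than connected speech.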