2,380 research outputs found

    Objective intelligibility assessment of pathological speakers

    Intelligibility is a primary measure for the assessment of pathological speech. Traditionally, it is measured using a perceptual test, which is by definition subjective. Consequently, there is great interest in reliable, automatic and therefore objective methods. This paper presents such a method, which combines an automatic speech recognizer (ASR) that produces features characterizing a speaker's pronunciations with an intelligibility prediction model (IPM) that converts these features into an intelligibility score. High correlations (about 0.90) between objective and perceptual scores are obtained with a system comprising two different speech recognizers: one with traditional acoustic models relating acoustical observations to triphone states, and one using phonological features as an intermediate layer between the acoustical observations and the phonetic states.
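
The agreement between objective and perceptual scores reported above is a correlation. As a minimal sketch of how such a figure could be computed, the following assumes the Pearson coefficient (the abstract does not name the coefficient used) and uses made-up per-speaker scores; `pearson_r` and both score lists are hypothetical and not taken from the paper.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-speaker scores (illustration only, not data from the paper).
perceptual = [35.0, 52.0, 61.0, 78.0, 90.0]  # listener-based intelligibility
objective = [40.0, 49.0, 66.0, 74.0, 88.0]   # model-predicted intelligibility

r = pearson_r(perceptual, objective)
print(f"Pearson r = {r:.2f}")
```

A value close to 1 would indicate that the automatic score ranks and spaces speakers much as the listeners do.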

    Combining phonological and acoustic ASR-free features for pathological speech intelligibility assessment

    Intelligibility is widely used to measure the severity of articulatory problems in pathological speech. Recently, a number of automatic intelligibility assessment tools have been developed. Most of them use automatic speech recognizers (ASR) to compare the patient's utterance with the target text. These methods are bound to one language and tend to be less accurate when speakers hesitate or make reading errors. To circumvent these problems, two different ASR-free methods were developed over the last few years, making use only of the acoustic or phonological properties of the utterance. In this paper, we demonstrate that these ASR-free techniques are also able to predict intelligibility in other languages. Moreover, they prove to be complementary, resulting in even better intelligibility predictions when both methods are combined.

    Simulating dysarthric speech for training data augmentation in clinical speech applications

    Training machine learning algorithms for speech applications requires large, labeled training data sets. This is problematic for clinical applications where obtaining such data is prohibitively expensive because of privacy concerns or lack of access. As a result, clinical speech applications are typically developed using small data sets with only tens of speakers. In this paper, we propose a method for simulating training data for clinical applications by transforming healthy speech to dysarthric speech using adversarial training. We evaluate the efficacy of our approach using both objective and subjective criteria. We present the transformed samples to five experienced speech-language pathologists (SLPs) and ask them to identify the samples as healthy or dysarthric. The results reveal that the SLPs identify the transformed speech as dysarthric 65% of the time. In a pilot classification experiment, we show that by using the simulated speech samples to balance an existing dataset, the classification accuracy improves by about 10% after data augmentation. Comment: Will appear in Proc. of ICASSP 201

    Cross-linguistic study of vocal pathology: perceptual features of spasmodic dysphonia in French-speaking subjects

    Clinical characterisation of Spasmodic Dysphonia of the adductor type (SD) in French speakers by Klap and colleagues (1993) appears to differ from that of SD in English. This perceptual analysis aims to describe the phonetic features of French SD. A video of 6 French speakers with SD supplied by Klap and colleagues was analysed for frequency of phonatory breaks, pitch breaks, harshness, creak, breathiness and falsetto voice, rate of production, and quantity of speech output. In contrast to English SD, the French-speaking SD patients demonstrated no evidence of pitch breaks, but phonatory breaks, harshness and breathiness were prominent features. This verifies the French authors’ (1993) clinical description. These findings suggest that phonetic properties of a specific language may affect the manifestation of pathology in neurogenic voice disorders.

    Disentangled Latent Speech Representation for Automatic Pathological Intelligibility Assessment

    Speech intelligibility assessment plays an important role in the therapy of patients suffering from pathological speech disorders. Automatic and objective measures are desirable to assist therapists in their traditionally subjective and labor-intensive assessments. In this work, we investigate a novel approach for obtaining such a measure using the divergence in disentangled latent speech representations of a parallel utterance pair, obtained from a healthy reference and a pathological speaker. Experiments on an English database of Cerebral Palsy patients, using all available utterances per speaker, show high and significant correlation values (R = -0.9) with subjective intelligibility measures, with only minimal deviation (±0.01) across four different reference speaker pairs. We also demonstrate the robustness of the proposed method (R = -0.89, deviating ±0.02 over 1000 iterations) when a significantly smaller number of utterances per speaker is considered. Our results are among the first to show that disentangled speech representations can be used for automatic pathological speech intelligibility assessment, resulting in a method that is invariant to the reference speaker pair and applicable in scenarios with only a few utterances available. Comment: Submitted to INTERSPEECH202
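
The robustness figure above (a correlation with a small deviation over 1000 iterations using fewer utterances per speaker) suggests a repeated-subsampling evaluation. The sketch below shows one way such an evaluation could be set up. It rests on assumptions: the abstract does not describe the sampling protocol or the divergence measure, so `subsampled_correlations`, the subset size `k`, and the synthetic divergence values are all hypothetical.

```python
import math
import random
import statistics


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def subsampled_correlations(divergences, scores, k, iterations, seed=0):
    """Correlate per-speaker mean divergence with subjective scores,
    repeatedly drawing only k utterances per speaker (assumed protocol)."""
    rng = random.Random(seed)
    results = []
    for _ in range(iterations):
        # Average a random subset of k utterance-level divergences per speaker.
        means = [statistics.fmean(rng.sample(utts, k)) for utts in divergences]
        results.append(pearson_r(means, scores))
    return results


# Synthetic data (illustration only): lower intelligibility -> larger divergence.
data_rng = random.Random(42)
scores = [20.0, 40.0, 60.0, 80.0, 95.0]  # hypothetical subjective scores
divergences = [
    [(1.0 - s / 100.0) + data_rng.gauss(0.0, 0.05) for _ in range(30)]
    for s in scores
]

rs = subsampled_correlations(divergences, scores, k=5, iterations=1000)
mean_r = statistics.fmean(rs)
std_r = statistics.pstdev(rs)
print(f"R = {mean_r:.2f} +/- {std_r:.2f} over {len(rs)} iterations")
```

The correlation is negative by construction here, since a larger divergence from the healthy reference is assumed to go with lower intelligibility, which matches the sign reported in the abstract.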

    DIA: a tool for objective intelligibility assessment of pathological speech

    Intelligibility is generally accepted to be a very relevant measure in the assessment of pathological speech. In clinical practice, intelligibility is measured using one of the many existing perceptual tests. These tests usually have the drawback that they employ unnatural speech material (e.g. nonsense words) and that they cannot fully exclude errors due to listener bias. This raises the need for an objective and automated tool to measure intelligibility. Here, we present the Dutch Intelligibility Assessment (DIA), an objective tool that aids the speech therapist in evaluating the intelligibility of persons with pathological speech. This tool will soon be made publicly available.