Search CORE

2,380 research outputs found

Objective intelligibility assessment of pathological speakers

Author: De Bodt Marc
Martens Jean-Pierre
Middag Catherine
Van Nuffelen Gwen
Publication venue: International Speech Communication Association (ISCA)
Publication date: 01/01/2008

Intelligibility is a primary measure for the assessment of pathological speech. Traditionally, it is measured using a perceptual test, which is by definition subjective in nature. Consequently, there is a great interest in reliable, automatic and therefore objective methods. This paper presents such a method that incorporates an automatic speech recognizer (ASR) for producing features that characterize the pronunciations of a speaker and an intelligibility prediction model (IPM) for converting these features into an intelligibility score. High correlations (about 0.90) between objective and perceptual scores are obtained with a system comprising two different speech recognizers: one with traditional acoustic models relating acoustical observations to triphone states and one using phonological features as an intermediate layer between the acoustical observations and the phonetic states

Ghent University Academic Bibliography
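The abstract above reports a correlation of about 0.90 between objective and perceptual intelligibility scores. As a minimal sketch of how such an agreement figure can be computed, the snippet below implements the Pearson correlation coefficient. The abstract does not say which correlation measure was used, so Pearson is an assumption here, and the score lists are made-up illustrative numbers, not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five speakers (illustrative only, not from the paper).
perceptual = [35.0, 52.0, 68.0, 80.0, 94.0]  # listener-based scores
objective = [40.0, 49.0, 71.0, 77.0, 90.0]   # scores predicted by a model

r = pearson(objective, perceptual)
print(f"Pearson r = {r:.2f}")
```

A value close to 1 indicates that the automatic scores rank and space the speakers much as the listeners do.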

Combining phonological and acoustic ASR-free features for pathological speech intelligibility assessment

Author: Bocklet Tobias
Martens Jean-Pierre
Middag Catherine
Nöth Elmar
Publication venue: International Speech Communication Association (ISCA)
Publication date: 01/01/2011

Intelligibility is widely used to measure the severity of articulatory problems in pathological speech. Recently, a number of automatic intelligibility assessment tools have been developed. Most of them use automatic speech recognizers (ASR) to compare the patient's utterance with the target text. These methods are bound to one language and tend to be less accurate when speakers hesitate or make reading errors. To circumvent these problems, two different ASR-free methods were developed over the last few years, only making use of the acoustic or phonological properties of the utterance. In this paper, we demonstrate that these ASR-free techniques are also able to predict intelligibility in other languages. Moreover, they prove to be complementary, resulting in even better intelligibility predictions when both methods are combined

Ghent University Academic Bibliography

Simulating dysarthric speech for training data augmentation in clinical speech applications

Author: Berisha Visar
Jiao Yishan
Liss Julie
Tu Ming
Publication date: 26/04/2018

Training machine learning algorithms for speech applications requires large, labeled training data sets. This is problematic for clinical applications where obtaining such data is prohibitively expensive because of privacy concerns or lack of access. As a result, clinical speech applications are typically developed using small data sets with only tens of speakers. In this paper, we propose a method for simulating training data for clinical applications by transforming healthy speech to dysarthric speech using adversarial training. We evaluate the efficacy of our approach using both objective and subjective criteria. We present the transformed samples to five experienced speech-language pathologists (SLPs) and ask them to identify the samples as healthy or dysarthric. The results reveal that the SLPs identify the transformed speech as dysarthric 65% of the time. In a pilot classification experiment, we show that by using the simulated speech samples to balance an existing dataset, the classification accuracy improves by about 10% after data augmentation.

Comment: Will appear in Proc. of ICASSP 201

arXiv.org e-Print Archive

Crossref
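The abstract above mentions using simulated samples to balance an existing dataset. The bookkeeping behind that step can be sketched as follows: count the samples per class and work out how many simulated samples each minority class would need to match the largest class. The abstract does not describe the authors' balancing procedure, so this is only one plausible reading. The function name `synthetic_needed` and the class counts are hypothetical.

```python
from collections import Counter


def synthetic_needed(labels):
    """Return, per class, how many extra samples are needed to match the largest class."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


# Hypothetical imbalanced dataset (illustrative counts, not from the paper).
labels = ["healthy"] * 40 + ["dysarthric"] * 12

deficit = synthetic_needed(labels)
for label, extra in deficit.items():
    print(f"{label}: add {extra} simulated samples")
```

In this hypothetical case the minority class would be topped up with simulated samples and the majority class left unchanged.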

Dysarthria Intelligibility Assessment in a Factor Analysis Total Variability Space

Author: Christensen H.
Green P.
Martinez David
Publication date: 01/01/2013

Edinburgh Research Explorer

Cross-linguistic study of vocal pathology: perceptual features of spasmodic dysphonia in French-speaking subjects

Author: BRIN M. F.
CRYSTAL D. A.
DARLEY F.
DELATTRE P.
DENES P. B.
KLAP P.
LANGEN DE
LAVER J.
MALECOT A.
MARJORIE PERLMAN LORCH
RENATA WHURR
WHURR R.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2003

Clinical characterisation of Spasmodic Dysphonia of the adductor type (SD) in French speakers by Klap and colleagues (1993) appears to differ from that of SD in English. This perceptual analysis aims to describe the phonetic features of French SD. A video of 6 French speakers with SD supplied by Klap and colleagues was analysed for frequency of phonatory breaks, pitch breaks, harshness, creak, breathiness and falsetto voice, rate of production, and quantity of speech output. In contrast to English SD, the French-speaking SD patients demonstrated no evidence of pitch breaks, but phonatory breaks, harshness and breathiness were prominent features. This verifies the French authors’ (1993) clinical description. These findings suggest that phonetic properties of a specific language may affect the manifestation of pathology in neurogenic voice disorders

Crossref

Birkbeck Institutional Research Online

Disentangled Latent Speech Representation for Automatic Pathological Intelligibility Assessment

Author: Heismann Bjoern
Klumpp Philipp
Maier Andreas
Noeth Elmar
Schuster Maria
Weise Tobias
Yang Seung Hee
Publication date: 08/04/2022

Speech intelligibility assessment plays an important role in the therapy of patients suffering from pathological speech disorders. Automatic and objective measures are desirable to assist therapists in their traditionally subjective and labor-intensive assessments. In this work, we investigate a novel approach for obtaining such a measure using the divergence in disentangled latent speech representations of a parallel utterance pair, obtained from a healthy reference and a pathological speaker. Experiments on an English database of Cerebral Palsy patients, using all available utterances per speaker, show high and significant correlation values (R = -0.9) with subjective intelligibility measures, while having only minimal deviation (±0.01) across four different reference speaker pairs. We also demonstrate the robustness of the proposed method (R = -0.89, deviating ±0.02 over 1000 iterations) by considering a significantly smaller number of utterances per speaker. Our results are among the first to show that disentangled speech representations can be used for automatic pathological speech intelligibility assessment, resulting in a reference speaker pair invariant method, applicable in scenarios with only a few utterances available.

Comment: Submitted to INTERSPEECH202

arXiv.org e-Print Archive

Efficacy of speech intervention using electropalatography with a cochlear implant user

Author: Herman R.
Pantelemidou V.
Thomas J.
Publication venue: 'Informa UK Limited'
Publication date: 01/06/2003

Electropalatography (EPG) has become relatively well established as a safe and convenient technique for use in the assessment, diagnosis and treatment of children and adults with articulation disorders. EPG's wide applicability is reflected in the range of different cases that has been researched in recent years. Some research has been carried out using EPG therapy for deaf individuals who use hearing aids; however, there are no similar studies for cochlear implant users. The purpose of this single case study is to explore the technique of EPG as a therapeutic intervention to treat voiceless velar stop consonant sound production in a deaf child cochlear implant user. EPG therapy was offered as a last resort when traditional therapy failed to achieve specific changes. During therapy, a list of familiar words was practised, using the visual feedback provided by EPG. The client's articulation was assessed using objective (EPG printouts) and subjective (listener ratings) measures at four assessment points. Changes were found to be statistically significant. Generalization of the newly‐acquired skills to untaught words containing voiceless velars was also observed. The results are discussed in the broader context of implications of this type of therapy with deaf clients

City Research Online

Crossref

DIA: a tool for objective intelligibility assessment of pathological speech

Author: De Bodt Marc
Martens Jean-Pierre
Middag Catherine
Van Nuffelen Gwen
Publication venue: 'Firenze University Press'
Publication date: 01/01/2009

Intelligibility is generally accepted to be a very relevant measure in the assessment of pathological speech. In clinical practice, intelligibility is measured using one of the many existing perceptual tests. These tests usually have the drawback that they employ unnatural speech material (e.g. nonsense words) and that they cannot fully exclude errors due to the listener's bias. This raises the need for an objective and automated tool to measure intelligibility. Here, we present the Dutch Intelligibility Assessment (DIA), an objective tool that aids the speech therapist in evaluating the intelligibility of persons with pathological speech. This tool will soon be made publicly available

Ghent University Academic Bibliography