74 research outputs found

    Acoustic-phonetic decoding for speech intelligibility evaluation in the context of Head and Neck Cancers

    Get PDF
    International audienceIn addition to health problems, Head and Neck Cancers (HNC) can cause serious speech disorders that can lead to partial or complete loss of speech intel-ligibility in some patients. The clinician's evaluation of the intelligibility level before or after surgical treatment and / or during the rehabilitation phase is an important part of the clinical assessment. Perceptive assessment is the most widely used method in clinical practice to assess the level of intelligibility of a patient despite the limitations associated with it such as subjectivity and moderate reproducibility. In this paper, we propose to overcome these limitations by associating a specific task of speech production based on pseudo-words with an automatic speech processing system, both oriented towards acoustic-phonetic decoding. Compared to human perception, the automatic system reaches very high correlation rates and promising results when applied to a French speech corpus including 41 healthy speakers and 85 patients suffering from HNC

    Towards Automatic Speech-Language Assessment for Aphasia Rehabilitation

    Full text link
    Speech-based technology has the potential to reinforce traditional aphasia therapy through the development of automatic speech-language assessment systems. Such systems can provide clinicians with supplementary information to assist with progress monitoring and treatment planning, and can provide support for on-demand auxiliary treatment. However, current technology cannot support this type of application due to the difficulties associated with aphasic speech processing. The focus of this dissertation is on the development of computational methods that can accurately assess aphasic speech across a range of clinically-relevant dimensions. The first part of the dissertation focuses on novel techniques for assessing aphasic speech intelligibility in constrained contexts. The second part investigates acoustic modeling methods that lead to significant improvement in aphasic speech recognition and allow the system to work with unconstrained speech samples. The final part demonstrates the efficacy of speech recognition-based analysis in automatic paraphasia detection, extraction of clinically-motivated quantitative measures, and estimation of aphasia severity. The methods and results presented in this work will enable robust technologies for accurately recognizing and assessing aphasic speech, and will provide insights into the link between computational methods and clinical understanding of aphasia.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/140840/1/ducle_1.pd

    Models and Analysis of Vocal Emissions for Biomedical Applications

    Get PDF
    The MAVEBA Workshop proceedings, held on a biannual basis, collect the scientific papers presented both as oral and poster contributions, during the conference. The main subjects are: development of theoretical and mechanical models as an aid to the study of main phonatory dysfunctions, as well as the biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and classification of vocal pathologies

    JDReAM. Journal of InterDisciplinary Research Applied to Medicine - Vol. 4, issue 2 (2020)

    Get PDF

    JDReAM. Journal of InterDisciplinary Research Applied to Medicine - Vol. 4, issue 2 (2020)

    Get PDF

    Dysarthric speech analysis and automatic recognition using phase based representations

    Get PDF
    Dysarthria is a neurological speech impairment which usually results in the loss of motor speech control due to muscular atrophy and poor coordination of articulators. Dysarthric speech is more difficult to model with machine learning algorithms, due to inconsistencies in the acoustic signal and to limited amounts of training data. This study reports a new approach for the analysis and representation of dysarthric speech, and applies it to improve ASR performance. The Zeros of Z-Transform (ZZT) are investigated for dysarthric vowel segments. It shows evidence of a phase-based acoustic phenomenon that is responsible for the way the distribution of zero patterns relate to speech intelligibility. It is investigated whether such phase-based artefacts can be systematically exploited to understand their association with intelligibility. A metric based on the phase slope deviation (PSD) is introduced that are observed in the unwrapped phase spectrum of dysarthric vowel segments. The metric compares the differences between the slopes of dysarthric vowels and typical vowels. The PSD shows a strong and nearly linear correspondence with the intelligibility of the speaker, and it is shown to hold for two separate databases of dysarthric speakers. A systematic procedure for correcting the underlying phase deviations results in a significant improvement in ASR performance for speakers with severe and moderate dysarthria. In addition, information encoded in the phase component of the Fourier transform of dysarthric speech is exploited in the group delay spectrum. Its properties are found to represent disordered speech more effectively than the magnitude spectrum. Dysarthric ASR performance was significantly improved using phase-based cepstral features in comparison to the conventional MFCCs. A combined approach utilising the benefits of PSD corrections and phase-based features was found to surpass all the previous performance on the UASPEECH database of dysarthric speech

    Malay articulation system for early screening diagnostic using hidden markov model and genetic algorithm

    Get PDF
    Speech recognition is an important technology and can be used as a great aid for individuals with sight or hearing disabilities today. There are extensive research interest and development in this area for over the past decades. However, the prospect in Malaysia regarding the usage and exposure is still immature even though there is demand from the medical and healthcare sector. The aim of this research is to assess the quality and the impact of using computerized method for early screening of speech articulation disorder among Malaysian such as the omission, substitution, addition and distortion in their speech. In this study, the statistical probabilistic approach using Hidden Markov Model (HMM) has been adopted with newly designed Malay corpus for articulation disorder case following the SAMPA and IPA guidelines. Improvement is made at the front-end processing for feature vector selection by applying the silence region calibration algorithm for start and end point detection. The classifier had also been modified significantly by incorporating Viterbi search with Genetic Algorithm (GA) to obtain high accuracy in recognition result and for lexical unit classification. The results were evaluated by following National Institute of Standards and Technology (NIST) benchmarking. Based on the test, it shows that the recognition accuracy has been improved by 30% to 40% using Genetic Algorithm technique compared with conventional technique. A new corpus had been built with verification and justification from the medical expert in this study. In conclusion, computerized method for early screening can ease human effort in tackling speech disorders and the proposed Genetic Algorithm technique has been proven to improve the recognition performance in terms of search and classification task

    ACOUSTIC SPEECH MARKERS FOR TRACKING CHANGES IN HYPOKINETIC DYSARTHRIA ASSOCIATED WITH PARKINSON’S DISEASE

    Get PDF
    Previous research has identified certain overarching features of hypokinetic dysarthria associated with Parkinson’s Disease and found it manifests differently between individuals. Acoustic analysis has often been used to find correlates of perceptual features for differential diagnosis. However, acoustic parameters that are robust for differential diagnosis may not be sensitive to tracking speech changes. Previous longitudinal studies have had limited sample sizes or variable lengths between data collection. This study focused on using acoustic correlates of perceptual features to identify acoustic markers able to track speech changes in people with Parkinson’s Disease (PwPD) over six months. The thesis presents how this study has addressed limitations of previous studies to make a novel contribution to current knowledge. Speech data was collected from 63 PwPD and 47 control speakers using an online podcast software at two time points, six months apart (T1 and T2). Recordings of a standard reading passage, minimal pairs, sustained phonation, and spontaneous speech were collected. Perceptual severity ratings were given by two speech and language therapists for T1 and T2, and acoustic parameters of voice, articulation and prosody were investigated. Two analyses were conducted: a) to identify which acoustic parameters can track perceptual speech changes over time and b) to identify which acoustic parameters can track changes in speech intelligibility over time. An additional attempt was made to identify if these parameters showed group differences for differential diagnosis between PwPD and control speakers at T1 and T2. Results showed that specific acoustic parameters in voice quality, articulation and prosody could differentiate between PwPD and controls, or detect speech changes between T1 and T2, but not both factors. However, specific acoustic parameters within articulation could detect significant group and speech change differences across T1 and T2. The thesis discusses these results, their implications, and the potential for future studies
    • …
    corecore