Search CORE

6 research outputs found

A Comprehensive Survey of Automatic Dysarthric Speech Recognition

Author: Chavan Shankar
Pophale Chandarani
Publication venue: Auricle Global Society of Education and Research
Publication date: 31/08/2023
Field of study

Automatic dysarthric speech recognition (DSR) is very crucial for many human computer interaction systems that enables the human to interact with machine in natural way. The objective of this paper is to analyze the literature survey of various Machine learning (ML) and deep learning (DL) based dysarthric speech recognition systems (DSR). This article presents a comprehensive survey of the recent advances in the automatic Dysarthric Speech Recognition (DSR) using machine learning and deep learning paradigms. It focuses on the methodology, database, evaluation metrics and major findings from the study of previous approaches.The proposed survey presents the various challenges related with DSR such as individual variability, limited training data, contextual understanding, articulation variability, vocal quality changes, and speaking rate variations.From the literature survey it provides the gaps between exiting work and previous work on DSR and provides the future direction for improvement of DSR.&nbsp

International Journal on Recent and Innovation Trends in Computing and Communication

Modeling Sub-Band Information Through Discrete Wavelet Transform to Improve Intelligibility Assessment of Dysarthric Speech

Author: Pradhan Gayadhar
Sahu Laxmi Priya
Singh Jyoti Prakash
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 16/12/2022
Field of study

The speech signal within a sub-band varies at a fine level depending on the type, and level of dysarthria. The Mel-frequency filterbank used in the computation process of cepstral coefficients smoothed out this fine level information in the higher frequency regions due to the larger bandwidth of filters. To capture the sub-band information, in this paper, four-level discrete wavelet transform (DWT) decomposition is firstly performed to decompose the input speech signal into approximation and detail coefficients, respectively, at each level. For a particular input speech signal, five speech signals representing different sub-bands are then reconstructed using inverse DWT (IDWT). The log filterbank energies are computed by analyzing the short-term discrete Fourier transform magnitude spectra of each reconstructed speech using a 30-channel Mel-filterbank. For each analysis frame, the log filterbank energies obtained across all reconstructed speech signals are pooled together, and discrete cosine transform is performed to represent the cepstral feature, here termed as discrete wavelet transform reconstructed (DWTR)- Mel frequency cepstral coefficient (MFCC). The i-vector based dysarthric level assessment system developed on the universal access speech corpus shows that the proposed DTWRMFCC feature outperforms the conventional MFCC and several other cepstral features reported for a similar task. The usages of DWTR- MFCC improve the detection accuracy rate (DAR) of the dysarthric level assessment system in the text and the speaker-independent test case to 60.094 % from 56.646 % MFCC baseline. Further analysis of the confusion matrices shows that confusion among different dysarthric classes is quite different for MFCC and DWTR-MFCC features. Motivated by this observation, a two-stage classification approach employing discriminating power of both kinds of features is proposed to improve the overall performance of the developed dysarthric level assessment system. The two-stage classification scheme further improves the DAR to 65.813 % in the text and speaker- independent test case

Re-UNIR

Articulatory Knowledge in the Recognition of Dysarthric Speech

Author: Frank Rudzicz
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref

Artificial Intelligence for Dysarthria Assessment in Children With Ataxia: A Hierarchical Approach

Author: Bertini Enrico
Bruschetta Roberta
Busa Mario
Castelli Enrico
Cerasa Antonio
Favetta Martina
Marino Flavia
Petrarca Maurizio
Pioggia Giovanni
Romano Alberto
Ruta Liliana
Schirinzi Tommaso
Summa Susanna
Tartarisco Gennaro
Vasco Gessica
Publication venue
Publication date: 13/12/2021
Field of study

Open Access Repository

Towards Automatic Speech-Language Assessment for Aphasia Rehabilitation

Author: Le Duc
Publication venue
Publication date: 01/01/2017
Field of study

Speech-based technology has the potential to reinforce traditional aphasia therapy through the development of automatic speech-language assessment systems. Such systems can provide clinicians with supplementary information to assist with progress monitoring and treatment planning, and can provide support for on-demand auxiliary treatment. However, current technology cannot support this type of application due to the difficulties associated with aphasic speech processing. The focus of this dissertation is on the development of computational methods that can accurately assess aphasic speech across a range of clinically-relevant dimensions. The first part of the dissertation focuses on novel techniques for assessing aphasic speech intelligibility in constrained contexts. The second part investigates acoustic modeling methods that lead to significant improvement in aphasic speech recognition and allow the system to work with unconstrained speech samples. The final part demonstrates the efficacy of speech recognition-based analysis in automatic paraphasia detection, extraction of clinically-motivated quantitative measures, and estimation of aphasia severity. The methods and results presented in this work will enable robust technologies for accurately recognizing and assessing aphasic speech, and will provide insights into the link between computational methods and clinical understanding of aphasia.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/140840/1/ducle_1.pd

Deep Blue Documents at the University of Michigan