2 research outputs found
A Few-Shot Approach to Dysarthric Speech Intelligibility Level Classification Using Transformers
Dysarthria is a speech disorder that hinders communication due to
difficulties in articulating words. Detection of dysarthria is important for
several reasons as it can be used to develop a treatment plan and help improve
a person's quality of life and ability to communicate effectively. Much of the
literature has focused on improving ASR systems for dysarthric speech. The
objective of the current work is to develop models that can accurately classify
the presence of dysarthria and also give information about the intelligibility
level using limited data by employing a few-shot approach using a transformer
model. This work also aims to tackle the data leakage that is present in
previous studies. Our whisper-large-v2 transformer model trained on a subset of
the UASpeech dataset containing medium intelligibility level patients achieved
an accuracy of 85%, precision of 0.92, recall of 0.8, F1-score of 0.85, and
specificity of 0.91. Experimental results also demonstrate that the model
trained using the 'words' dataset performed better than the models
trained on the 'letters' and 'digits' datasets. Moreover, the multiclass model
achieved an accuracy of 67%.
Comment: Paper has been presented at ICCCNT 2023 and the final version will be
published in IEEE Digital Library Xplor
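The metrics reported in this abstract (accuracy, precision, recall, F1-score, specificity) all derive from the four cells of a binary confusion matrix. The sketch below shows how they relate. The function name `binary_metrics` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not the paper's confusion matrix.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives. Ratios with an empty
    denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = binary_metrics(tp=80, fp=7, tn=73, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two, which is consistent with the reported F1-score of 0.85 sitting between a precision of 0.92 and a recall of 0.8.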
Enhancing Knee Osteoarthritis severity level classification using diffusion augmented images
This research paper explores the classification of knee osteoarthritis (OA)
severity levels using advanced computer vision models and augmentation
techniques. The study investigates the effectiveness of data preprocessing,
including Contrast-Limited Adaptive Histogram Equalization (CLAHE), and data
augmentation using diffusion models. Three experiments were conducted: training
models on the original dataset, training models on the preprocessed dataset,
and training models on the augmented dataset. The results show that data
preprocessing and augmentation significantly improve the accuracy of the
models. The EfficientNetB3 model achieved the highest accuracy of 84% on the
augmented dataset. Additionally, attention visualization techniques, such as
Grad-CAM, are utilized to provide detailed attention maps, enhancing the
understanding and trustworthiness of the models. These findings highlight the
potential of combining advanced models with augmented data and attention
visualization for accurate knee OA severity classification.
Comment: Paper has been accepted to be presented at ICACECS 2023 and the final
version will be published by Atlantis Highlights in Computer Science (AHCS),
Atlantis Press (part of Springer Nature
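The abstract names CLAHE as a preprocessing step. The core idea that distinguishes it from plain histogram equalization is that the histogram is clipped before the intensity mapping is built, which limits how much contrast is amplified. The sketch below illustrates that clipping step on a single tile in pure Python. It is a simplified illustration, not the paper's pipeline: full CLAHE also splits the image into a grid of tiles and interpolates between neighbouring tile mappings, which is omitted here. The function name `clipped_equalize` and the sample tile are hypothetical.

```python
def clipped_equalize(tile, clip_limit, levels=256):
    """Equalize one tile of integer gray levels using a clipped histogram.

    tile is a list of rows of integers in [0, levels - 1].
    clip_limit is the maximum count allowed in any histogram bin;
    counts above it are spread evenly across all bins.
    """
    # Build the intensity histogram of the tile.
    hist = [0.0] * levels
    for row in tile:
        for value in row:
            hist[value] += 1

    # Clip each bin and collect the excess counts.
    excess = 0.0
    for i in range(levels):
        if hist[i] > clip_limit:
            excess += hist[i] - clip_limit
            hist[i] = float(clip_limit)

    # Redistribute the excess uniformly so the total count is preserved.
    bonus = excess / levels
    hist = [h + bonus for h in hist]

    # Turn the cumulative distribution into a lookup table.
    total = sum(hist)
    lut = []
    running = 0.0
    for h in hist:
        running += h
        lut.append(min(levels - 1, round(running / total * (levels - 1))))

    return [[lut[value] for value in row] for row in tile]


# Hypothetical low-contrast tile: all values sit within 100..102.
tile = [[100, 100, 101], [101, 102, 102]]
clipped = clipped_equalize(tile, clip_limit=1)
unclipped = clipped_equalize(tile, clip_limit=6)  # limit too high to clip anything
print(clipped)
print(unclipped)
```

With a low clip limit the output contrast is stretched less than with no effective clipping, which is the behaviour that keeps CLAHE from over-amplifying noise in near-uniform regions.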