Deep Learning-based Cognitive Impairment Diseases Prediction and Assistance using Multimodal Data

Abstract

In this project, we propose a mobile robot-based system capable of analyzing data from elderly people and patients with cognitive impairment diseases, such as aphasia or dementia. The robot performs two primary tasks. The first is the detection of these diseases in their early stages so that professional treatment can begin early, thereby improving the patient's quality of life. The second focuses on automatic emotion detection during the patient's interactions with other people, in this case clinicians. Additionally, the project examines how combining different modalities, such as audio and text, influences model performance. Extensive research was conducted on available dementia and aphasia datasets as well as on the implemented tasks. For this purpose, we used the DementiaBank and AphasiaBank datasets, which contain multimodal data in different formats, including video, audio, and audio transcriptions. We employed several models for the prediction task: Convolutional Neural Networks for audio classification, Transformers for text classification, and a multimodal model combining both approaches. The models were evaluated on a held-out test set; the best result, 90.36% accuracy in detecting dementia, was obtained with the text modality. Additionally, we conducted a detailed analysis of the available data to interpret the obtained results and to support the model's explainability. The pipeline for automatic emotion recognition was evaluated by manually reviewing the initial frames of one hundred randomly selected video samples from the dataset. This pipeline was also used to recognize emotions in both healthy individuals and patients with aphasia. The study revealed that, when listening to someone speak, individuals with aphasia express different emotions than healthy individuals, primarily because their difficulties in understanding and producing speech negatively affect their mood. Analyzing their emotional state can therefore improve interactions by avoiding conversations that negatively affect their mood, thus providing better assistance.
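To make the multimodal prediction setup described above more concrete, the following is a minimal PyTorch sketch of a late-fusion classifier with a CNN branch over audio features and a Transformer branch over transcripts. The choice of log-mel spectrogram inputs, the `bert-base-uncased` encoder, the CNN depth, and the fusion head sizes are illustrative assumptions, not the project's reported configuration.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class MultimodalDementiaClassifier(nn.Module):
    """Late fusion: CNN over audio spectrograms + Transformer over transcripts."""

    def __init__(self, text_model_name="bert-base-uncased", n_classes=2):
        super().__init__()
        # Audio branch: small 2D CNN over log-mel spectrograms
        # (input shape: batch x 1 x n_mels x time_frames).
        self.audio_cnn = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # -> batch x 32 x 1 x 1
        )
        # Text branch: pretrained Transformer encoder over the transcript.
        self.text_encoder = AutoModel.from_pretrained(text_model_name)
        text_dim = self.text_encoder.config.hidden_size
        # Fusion head: concatenate both embeddings, then classify.
        self.classifier = nn.Sequential(
            nn.Linear(32 + text_dim, 128), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(128, n_classes),
        )

    def forward(self, spectrogram, input_ids, attention_mask):
        audio_emb = self.audio_cnn(spectrogram).flatten(1)   # batch x 32
        text_out = self.text_encoder(input_ids=input_ids,
                                     attention_mask=attention_mask)
        text_emb = text_out.last_hidden_state[:, 0]          # [CLS] embedding
        return self.classifier(torch.cat([audio_emb, text_emb], dim=1))


# Minimal usage example with dummy inputs.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = MultimodalDementiaClassifier()
tokens = tokenizer(["the boy is stealing cookies"], return_tensors="pt",
                   padding=True, truncation=True)
dummy_spec = torch.randn(1, 1, 64, 128)  # batch x channel x n_mels x frames
logits = model(dummy_spec, tokens["input_ids"], tokens["attention_mask"])
```

Dropping either branch reduces this sketch to a unimodal baseline, which is how the text-only and audio-only results reported above could be compared against the fused model.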
