5 research outputs found
Developing Explainable Deep Learning Models Using EEG for Brain Machine Interface Systems
Deep learning (DL) based decoders for Brain-Computer Interfaces (BCI) using electroencephalography (EEG) have gained immense popularity recently. However, the interpretability of DL models remains an under-explored area. This thesis aims to develop and validate computational neuroscience approaches to make DL models more robust and explainable. First, a simulation framework was developed to evaluate the robustness and sensitivity of twelve back-propagation-based visualization methods. When compared against ground truth features, and after randomizing model weights and labels, multiple methods showed reliability issues: e.g., the gradient approach, which is the most used visualization technique in EEG, was neither class-specific nor model-specific. Overall, DeepLift was the most reliable and robust method. Second, we demonstrated how model explanations combined with a clustering approach can complement the analysis of DL models applied to measured EEG in three tasks. In the first task, DeepLift identified the EEG spatial patterns associated with hand motor imagery in a data-driven manner from a database of 54 individuals. The explanations identified different strategies used by individuals and exposed the issues in limiting decoding to the sensorimotor channels. The clustering approach improved the decoding in high-performing subjects. In the second task, we used GradCAM to explain the decisions of a Convolutional Neural Network (CNN) trained to detect balance perturbations while wearing an exoskeleton, deployable for fall prevention. Perturbation evoked potentials (PEP) in EEG (∼75 ms) preceded both the peak in electromyography (∼180 ms) and the center of pressure (∼350 ms). The explanations showed that the model utilized electro-cortical components in the PEP and was not driven by artifacts. The explanations aligned with dynamic functional connectivity measures and prior studies supporting the feasibility of using BCI-exoskeleton systems for fall prevention.
In the third task, the susceptibility of DL models to eyeblink artifacts was evaluated. The frequent presence of blinks (in 50% or more of trials), whether or not they bias a particular class, leads to a significant difference in decoding when using a CNN. In conclusion, the thesis contributes towards improving BCI decoders based on DL models by using model explanation approaches. Specific recommendations and best practices for the use of back-propagation-based visualization methods in BCI decoder design are discussed.
An empirical comparison of deep learning explainability approaches for EEG using simulated ground truth
Recent advancements in machine learning and deep learning (DL) based neural decoders have significantly improved decoding capabilities using scalp electroencephalography (EEG). However, the interpretability of DL models remains an under-explored area. In this study, we compared multiple model explanation methods to identify the most suitable method for EEG and to understand when some of these approaches might fail. A simulation framework was developed to evaluate the robustness and sensitivity of twelve back-propagation-based visualization methods by comparing them to ground truth features. Multiple methods tested here showed reliability issues after randomizing either model weights or labels: e.g., the saliency approach, which is the most used visualization technique in EEG, was neither class-specific nor model-specific. We found that DeepLift was consistently accurate as well as robust in detecting the three key attributes tested here (temporal, spatial, and spectral precision). Overall, this study provides a review of model explanation methods for DL-based neural decoders and recommendations to understand when some of these methods fail and what they can capture in EEG.
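The weight-randomization check mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' simulation framework: the linear "model", the toy input, and the function names (`gradient_times_input`, `input_magnitude`, `randomization_similarity`) are assumptions made purely for illustration. The idea is that an attribution map which barely changes when the model weights are replaced by random values cannot be model-specific.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def gradient_times_input(weights, x):
    # Toy linear score s = sum(w_i * x_i): ds/dx_i = w_i,
    # so gradient * input is w_i * x_i per channel.
    return [w * xi for w, xi in zip(weights, x)]


def input_magnitude(weights, x):
    # Hypothetical negative control: ignores the model weights entirely.
    return [abs(xi) for xi in x]


def randomization_similarity(attribution, weights, x, rng):
    """Correlate attributions before and after replacing the weights
    with random values. Values near 1 mean the map did not depend on
    the model."""
    original = attribution(weights, x)
    random_weights = [rng.gauss(0.0, 1.0) for _ in weights]
    randomized = attribution(random_weights, x)
    return pearson(original, randomized)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
x = [0.5, -1.2, 2.0, 0.3, -0.7, 1.5, -2.2, 0.9]  # toy 8-channel input
trained_weights = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # stand-in for a fitted model

sim_model_dependent = randomization_similarity(gradient_times_input, trained_weights, x, rng)
sim_control = randomization_similarity(input_magnitude, trained_weights, x, rng)
print("gradient * input similarity:", sim_model_dependent)
print("model-independent control similarity:", sim_control)
```

In this toy setting the control map is identical before and after randomization, which is the failure pattern such a check is meant to expose. A label-randomization check would follow the same shape, with the model retrained on shuffled labels instead of having its weights replaced.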
Multi-Output Sequential Deep Learning Model for Athlete Force Prediction on a Treadmill Using 3D Markers
Reliable and innovative methods for estimating forces are critical aspects of biomechanical sports research. Using them, athletes can improve their performance and technique and reduce the possibility of fractures and other injuries. To this end, this project investigates the use of video in biomechanics. To refine this method, we propose an RNN trained on a biomechanical dataset of regular runners that measures both kinematics and kinetics. The model makes it possible to analyze and draw conclusions about continuous-variable predictions across the body. It takes anatomical and reflective marker points (96 values in total, 32 per dimension) as input to predict forces (N) in three dimensions (Fx, Fy, Fz), measured on a treadmill with a force plate at different velocities (2.5 m/s, 3.5 m/s, 4.5 m/s). To find the best model, a grid search combined various types of layers (Simple, GRU, LSTM), loss functions (MAE, MSE, MSLE), and sampling techniques (down-sampling, up-sampling). The best-performing model (LSTM, MSE, down-sampling) achieved an average coefficient of determination of 0.68, although when excluding Fz it reached 0.92.
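As a rough illustration of the search space and the metric described above, the sketch below enumerates the layer, loss, and sampling combinations named in the abstract and shows how a per-axis coefficient of determination can be averaged with or without one axis. The function names and the toy force values are assumptions for illustration only; they are not taken from the paper or its dataset, and no model is trained here.

```python
from itertools import product
from statistics import fmean

# Options as listed in the abstract.
LAYER_TYPES = ("Simple", "GRU", "LSTM")
LOSSES = ("MAE", "MSE", "MSLE")
SAMPLING = ("down-sampling", "up-sampling")


def hyperparameter_grid():
    """Every (layer, loss, sampling) combination a grid search would visit."""
    return [
        {"layer": layer, "loss": loss, "sampling": sampling}
        for layer, loss, sampling in product(LAYER_TYPES, LOSSES, SAMPLING)
    ]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_r2(y_true_by_axis, y_pred_by_axis, exclude=()):
    """Average R^2 over force axes, optionally leaving some axes out."""
    scores = [
        r2_score(y_true_by_axis[axis], y_pred_by_axis[axis])
        for axis in y_true_by_axis
        if axis not in exclude
    ]
    return fmean(scores)


# Toy values (hypothetical, not from the dataset): two axes predicted
# perfectly, one axis predicted by its mean, which gives R^2 = 0.
y_true = {"Fx": [1.0, 2.0, 3.0, 4.0], "Fy": [2.0, 4.0, 6.0, 8.0], "Fz": [1.0, 2.0, 3.0, 4.0]}
y_pred = {"Fx": [1.0, 2.0, 3.0, 4.0], "Fy": [2.0, 4.0, 6.0, 8.0], "Fz": [2.5, 2.5, 2.5, 2.5]}

grid = hyperparameter_grid()
print(len(grid), "configurations")
print("mean R^2, all axes:", mean_r2(y_true, y_pred))
print("mean R^2, excluding Fz:", mean_r2(y_true, y_pred, exclude=("Fz",)))
```

Reporting the per-axis scores alongside the average makes it clear when a single weakly predicted component is pulling the mean down, as the gap between the two reported values suggests.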