
    Relevance Prediction from Eye-movements Using Semi-interpretable Convolutional Neural Networks

    We propose an image-classification method to predict the perceived relevance of text documents from eye movements. An eye-tracking study was conducted in which participants read short news articles and rated them as relevant or irrelevant for answering a trigger question. We encode participants' eye-movement scanpaths as images and train a convolutional neural network classifier on these scanpath images. The trained classifier is then used to predict participants' perceived relevance of news articles from the corresponding scanpath images. The method is content-independent, as the classifier requires no knowledge of the screen content or of the user's information task. Even with little data, the image classifier predicts perceived relevance with up to 80% accuracy. Compared with similar eye-tracking studies from the literature, this scanpath image classification method outperforms previously reported metrics by appreciable margins. We also attempt to interpret how the image classifier differentiates between scanpaths on relevant and irrelevant documents.
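
    The two-step pipeline this abstract describes (rasterize a scanpath into an image, then train a CNN on those images) can be sketched briefly. The sketch below is a minimal illustration, not the authors' code: the 64x64 resolution, the architecture, and the random fixation data are all assumptions.

        # Minimal sketch: encode a scanpath as an image, classify with a small CNN.
        # Resolution, architecture, and data are illustrative assumptions.
        import numpy as np
        import torch
        import torch.nn as nn

        def scanpath_to_image(fixations, size=64):
            """Rasterize (x, y, duration) fixations, with x/y in [0, 1], into an image."""
            img = np.zeros((size, size), dtype=np.float32)
            for x, y, dur in fixations:
                row, col = min(int(y * size), size - 1), min(int(x * size), size - 1)
                img[row, col] += dur               # longer fixations -> brighter pixels
            return img / max(img.max(), 1e-6)      # normalize to [0, 1]

        class ScanpathCNN(nn.Module):
            """Tiny binary classifier: relevant vs. irrelevant document."""
            def __init__(self):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                )
                self.head = nn.Linear(32 * 16 * 16, 2)

            def forward(self, x):
                return self.head(self.features(x).flatten(1))

        # Toy usage: one random scanpath, one forward pass.
        fixations = np.random.rand(20, 3)          # hypothetical (x, y, duration) rows
        img = scanpath_to_image(fixations)
        logits = ScanpathCNN()(torch.from_numpy(img)[None, None])
        print(logits.shape)                        # torch.Size([1, 2])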

    Interpretable global-local dynamics for the prediction of eye fixations in autonomous driving scenarios

    Human eye movements while driving reveal that visual attention largely depends on the context in which it occurs. Furthermore, an autonomous vehicle that performs this function would be more reliable if its outputs were understandable. Capsule Networks have been presented as a great opportunity to explore new horizons in the Computer Vision field, due to their capability to structure and relate latent information. In this article, we present a hierarchical approach for the prediction of eye fixations in autonomous driving scenarios. Context-driven visual attention can be modeled by considering different conditions which, in turn, are represented as combinations of several spatio-temporal features. With the aim of learning these conditions, we have built an encoder-decoder network which merges information from visual features using a global-local definition of capsules. Two types of capsules are distinguished: representational capsules for features and discriminative capsules for conditions. The latter, together with eye fixations recorded with wearable eye-tracking glasses, allow the model to learn both to predict contextual conditions and to estimate visual attention by means of a multi-task loss function. Experiments show how our approach is able to express either frame-level (global) or pixel-wise (local) relationships between features and contextual conditions, allowing for interpretability while maintaining or improving the performance of related black-box systems in the literature. Indeed, our proposal offers an improvement of 29% in terms of information gain with respect to the best performance reported in the literature. The authors would like to thank the authors of the DR(eye)VE Project [49] for the support provided during this work, as well as the Multimedia Processing Group of the Universidad Carlos III de Madrid for their personal and academic involvement.
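
    The multi-task training the abstract describes (predict the contextual condition at frame level and the fixation map at pixel level) can be illustrated with a combined loss. This is a minimal sketch of the idea only: the KL-divergence term, the softmax normalization, and the equal weighting are assumptions, not the paper's exact formulation.

        # Sketch of a multi-task loss: a global condition-classification term plus
        # a local fixation-map term. Loss choices and weighting are assumptions.
        import torch
        import torch.nn.functional as F

        def multitask_loss(cond_logits, cond_labels, sal_pred, sal_true, alpha=0.5):
            """cond_logits: (B, n_conditions); sal_pred / sal_true: (B, H, W) maps."""
            # Frame-level (global) term: which driving condition is this?
            l_cond = F.cross_entropy(cond_logits, cond_labels)
            # Pixel-wise (local) term: KL divergence between normalized maps,
            # a common choice in saliency prediction.
            p = sal_pred.flatten(1).softmax(dim=1)
            q = sal_true.flatten(1).softmax(dim=1)
            l_sal = F.kl_div(p.log(), q, reduction="batchmean")
            return alpha * l_cond + (1 - alpha) * l_sal

        # Toy shapes: batch of 4 frames, 6 conditions, 32x32 attention maps.
        loss = multitask_loss(torch.randn(4, 6), torch.randint(0, 6, (4,)),
                              torch.randn(4, 32, 32), torch.rand(4, 32, 32))
        print(loss.item())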

    Human Movement Disorders Analysis with Graph Neural Networks

    Human movement disorders encompass a group of neurological conditions that cause abnormal movements. These disorders, even when subtle, may be symptomatic of a broad spectrum of medical issues, from neurological to musculoskeletal. Clinicians and researchers still encounter challenges in understanding the underlying pathologies. In light of this, medical professionals and associated researchers are increasingly looking towards the fast-evolving domain of computer vision in pursuit of precise and dependable automated diagnostic tools to support clinical diagnosis. To this end, this thesis explores the feasibility of an interpretable and accurate human movement disorder analysis system based on graph neural networks. Cerebral Palsy (CP) and Parkinson's Disease (PD) are two common neurological diseases associated with movement disorders that seriously affect patients' quality of life. Specifically, CP is estimated to affect 2 in 1000 babies born in the UK each year, while PD affects an estimated 10 million people globally. Considering their clinical significance and properties, we develop and examine state-of-the-art attention-informed Graph Neural Networks (GNNs) for robust and interpretable CP prediction and PD diagnosis. We highlight the significant differences in body-movement frequency between CP infants and healthy groups, and propose frequency attention-informed graph convolutional networks (GCNs) and spatial frequency attention-based GCNs to predict CP with strong interpretability. To support the early diagnosis of PD, we propose a novel video-based deep learning system, SPA-PTA, with a spatial pyramidal attention design based on clinical observations and mathematical theories. Our systems provide undiagnosed PD patients with low-cost, non-intrusive PT classification and tremor severity rating results as a PD warning sign, together with interpretable attention visualizations.
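
    One way to picture the frequency-attention idea is the following toy model: move joint trajectories into the frequency domain, learn attention weights over frequency bins, and aggregate features over a body-joint graph. This is a speculative sketch of the general pattern, not the thesis architecture; the layer sizes, the identity adjacency, and the single linear graph step are all assumptions.

        # Toy frequency-attention graph network over skeleton sequences.
        # All architectural details here are illustrative assumptions.
        import torch
        import torch.nn as nn

        class FreqAttentionGCN(nn.Module):
            def __init__(self, n_joints, n_freq, hidden=32, n_classes=2):
                super().__init__()
                self.att = nn.Sequential(nn.Linear(n_freq, n_freq), nn.Softmax(dim=-1))
                self.gcn = nn.Linear(n_freq, hidden)    # shared per-joint transform
                self.out = nn.Linear(n_joints * hidden, n_classes)

            def forward(self, x, adj):
                # x: (B, n_joints, T) joint trajectories; adj: (n_joints, n_joints)
                spec = torch.fft.rfft(x, dim=-1).abs()  # magnitude spectrum per joint
                spec = spec * self.att(spec)            # frequency attention weights
                h = torch.relu(self.gcn(spec))          # per-joint features
                h = torch.einsum("ij,bjh->bih", adj, h) # aggregate over the skeleton graph
                return self.out(h.flatten(1))

        # Toy usage: 4 sequences, 17 joints, 64 time steps -> 33 frequency bins.
        x, adj = torch.randn(4, 17, 64), torch.eye(17)  # placeholder adjacency
        print(FreqAttentionGCN(17, 64 // 2 + 1)(x, adj).shape)  # torch.Size([4, 2])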

    Explainable deep learning solutions for the artifacts correction of EEG signals

    Brain electrical activity can be acquired via electroencephalography (EEG), with electrodes placed on the scalp of the subject. When an EEG signal is recorded, artifacts arise from muscle movements, eye movements, cardiac activity, or the acquisition equipment itself, and these artifacts can significantly compromise the quality of the EEG signals. Removing them is fundamental in many disciplines to obtain a clean, usable signal. Machine learning (ML) is one family of techniques that can be used to classify and remove artifacts from EEG signals, and deep learning (DL) is a branch of ML inspired by the architecture of the human cerebral cortex, a dense network of neurons acting as simple processing units. In this thesis we use ICLabel, a neural network developed within EEGLAB that classifies the independent components (ICs), obtained by independent component analysis (ICA), into seven classes: brain, eye, muscle, heart, channel noise, line noise, and other. ICLabel provides the probability that each IC belongs to one of the six artifact classes or is a pure brain component. We built a simple CNN, similar to ICLabel's, that classifies the ICs into two classes, brain and non-brain, and we added an explainability tool, GradCAM, to investigate how the algorithm classifies the ICs. We compared the performance of our simple CNN with that of ICLabel, finding that the CNN reaches satisfactory accuracy and precision over the two classes (brain/non-brain). We then applied GradCAM to the CNN to understand which parts of the spectrogram the network uses to classify the data, highlighting on the IC spectrograms the most important regions of the signal. We can speculate that, as expected, the CNN is driven by components such as power line noise (50 Hz and higher harmonics) to identify non-brain components, while it focuses on the 1-30 Hz range to identify brain components. Although promising, these results need further investigation. Moreover, GradCAM could later be applied to ICLabel as well, to explain the more sophisticated seven-class DL model.
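
    Since the thesis names GradCAM explicitly, its mechanics can be sketched on a toy two-class (brain / non-brain) spectrogram CNN: capture the activations and gradients of a chosen convolutional layer, weight each channel by its spatially averaged gradient, and keep the positive part. The network and the 64x64 input are assumptions; only the Grad-CAM computation itself is standard.

        # Grad-CAM on a toy spectrogram CNN; network and input shape are assumptions.
        import torch
        import torch.nn as nn

        model = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),   # model[2] is the target layer
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2),
        )
        acts, grads = {}, {}
        model[2].register_forward_hook(lambda m, i, o: acts.update(a=o))
        model[2].register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

        x = torch.randn(1, 1, 64, 64)                    # toy IC spectrogram
        model(x)[0, 0].backward()                        # class 0 = "brain"

        # Channel weights = spatially averaged gradients; CAM = weighted activations.
        w = grads["g"].mean(dim=(2, 3), keepdim=True)
        cam = torch.relu((w * acts["a"]).sum(dim=1)).squeeze(0)
        print(cam.shape)  # (64, 64): time-frequency regions driving the "brain" score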

    Decoding sensorimotor information from superior parietal lobule of macaque via Convolutional Neural Networks

    Despite the well-recognized role of the posterior parietal cortex (PPC) in processing sensory information to guide action, the differential encoding properties of this dynamic processing, as operated by the different PPC brain areas, are scarcely known. Within the monkey's PPC, the superior parietal lobule hosts areas V6A, PEc, and PE, included in the dorso-medial visual stream that is specialized in planning and guiding reaching movements. Here, a Convolutional Neural Network (CNN) approach is used to investigate how information is processed in these areas. We trained two macaque monkeys to perform a delayed reaching task towards 9 positions (distributed over 3 depth and 3 direction levels) in the 3D peripersonal space. The activity of single cells was recorded from V6A, PEc, and PE, and fed to convolutional neural networks that were designed and trained to exploit the temporal structure of neuronal activation patterns to decode the target positions reached by the monkey. Bayesian Optimization was used to define the main CNN hyper-parameters. In addition to discrete positions in space, we used the same network architecture to decode plausible reaching trajectories. We found that data from the most caudal areas, V6A and PEc, outperformed area PE in spatial position decoding. In all areas, decoding accuracy started to increase at the time the reach target was instructed to the monkey, and reached a plateau at movement onset. The results support a dynamic encoding of the different phases and properties of the reaching movement, differentially distributed over a network of interconnected areas. This study highlights the usefulness of decoding neurons' firing rates via CNNs to improve our understanding of how sensorimotor information is encoded in the PPC to perform reaching movements. The obtained results may have implications for novel neuroprosthetic devices based on the decoding of these rich signals for faithfully carrying out patients' intentions.
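
    The decoding setup lends itself to a compact sketch: a 1D CNN over binned firing rates (units x time bins) predicting one of the 9 reach targets. Unit counts, bin counts, and layer sizes below are assumptions for illustration, not the paper's tuned architecture (which the authors selected via Bayesian Optimization).

        # Toy 1D-CNN decoder over binned firing rates; sizes are assumptions.
        import torch
        import torch.nn as nn

        n_units, n_bins, n_targets = 50, 40, 9

        decoder = nn.Sequential(
            # Convolve across time, treating recorded units as input channels,
            # so the network can exploit the temporal structure of the activity.
            nn.Conv1d(n_units, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(64, n_targets),
        )

        rates = torch.randn(8, n_units, n_bins)   # toy batch of 8 trials
        print(decoder(rates).shape)               # torch.Size([8, 9])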

    Digital Oculomotor Biomarkers in Dementia

    Dementia is an umbrella term that covers a number of neurodegenerative syndromes featuring gradual disturbance of various cognitive functions that is severe enough to interfere with tasks of daily life. Dementia is frequently diagnosed only when pathological changes have been developing for years, symptoms of cognitive impairment are evident, and the quality of life of the patients has already deteriorated significantly. Although brain imaging and fluid biomarkers allow the monitoring of disease progression in vivo, they are expensive, invasive, and not necessarily diagnostic in isolation. Recent studies suggest that eye-tracking technology is an innovative tool that holds promise for accelerating early detection of the disease, as well as supporting the development of strategies that minimise impairment during everyday activities. However, the optimal methods for quantitative evaluation of oculomotor behaviour during complex and naturalistic tasks in dementia have yet to be determined. This thesis investigates the development of computational tools and techniques to analyse the eye movements of dementia patients and healthy controls under naturalistic and less constrained scenarios, in order to identify novel digital oculomotor biomarkers. Three key contributions are made. First, the evaluation of the role of environment during navigation in patients with typical Alzheimer's disease and Posterior Cortical Atrophy, compared to a control group, using a combination of eye-movement and egocentric video analysis. Second, the development of a novel method of extracting salient features directly from the raw eye-tracking data of a mixed sample of dementia patients during a novel instruction-less cognitive test, to detect oculomotor biomarkers of dementia-related cognitive dysfunction. Third, the application of unsupervised anomaly-detection techniques for the visualisation of oculomotor anomalies during various cognitive tasks. The work presented in this thesis furthers our understanding of dementia-related oculomotor dysfunction and gives future research directions for the development of computerised cognitive tests and ecological interventions.
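
    For the third contribution, the general shape of unsupervised anomaly detection on per-trial eye-movement features can be shown with an off-the-shelf isolation forest. The feature names and numbers below are invented for illustration; the thesis's actual feature set is not reproduced here.

        # Isolation forest over hypothetical eye-movement features.
        import numpy as np
        from sklearn.ensemble import IsolationForest

        rng = np.random.default_rng(0)
        # Hypothetical per-trial features: mean fixation duration (ms),
        # mean saccade amplitude (deg), saccade rate (1/s).
        controls = rng.normal([250, 4.0, 3.0], [30, 0.5, 0.4], size=(200, 3))
        trials = np.vstack([controls, [[420, 7.5, 1.2]]])   # one atypical trial

        detector = IsolationForest(contamination=0.01, random_state=0).fit(trials)
        scores = detector.score_samples(trials)             # lower = more anomalous
        print(trials[np.argmin(scores)])                    # flags the outlier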

    Deep Interpretability Methods for Neuroimaging

    Brain dynamics are highly complex and yet hold the key to understanding brain function and dysfunction. The dynamics captured by resting-state functional magnetic resonance imaging data are noisy, high-dimensional, and not readily interpretable. The typical approach of reducing this data to low-dimensional features and focusing on the most predictive features comes with strong assumptions and can miss essential aspects of the underlying dynamics. In contrast, introspection of discriminatively trained deep learning models may uncover disorder-relevant elements of the signal at the level of individual time points and spatial locations. Nevertheless, the difficulty of reliable training on high-dimensional but small-sample datasets and the unclear relevance of the resulting predictive markers prevent the widespread use of deep learning in functional neuroimaging. In this dissertation, we address these challenges by proposing a deep learning framework that learns from high-dimensional dynamical data while maintaining stable, ecologically valid interpretations. The developed model is pre-trainable and alleviates the need to collect an enormous number of neuroimaging samples to achieve optimal training. We also provide a quantitative validation module, Retain and Retrain (RAR), that can objectively verify the higher predictability of the dynamics learned by the model. Results successfully demonstrate that the proposed framework enables learning the fMRI dynamics directly from small data and capturing compact, stable interpretations of features predictive of function and dysfunction. We also comprehensively review the deep interpretability literature in the neuroimaging domain. Our analysis reveals the ongoing trend of interpretability practices in neuroimaging studies and identifies the gaps that should be addressed for effective human-machine collaboration in this domain. This dissertation also proposes a post hoc interpretability method, Geometrically Guided Integrated Gradients (GGIG), that leverages geometric properties of the functional space as learned by a deep learning model. With extensive experiments and quantitative validation on the MNIST and ImageNet datasets, we demonstrate that GGIG outperforms integrated gradients (IG), a popular interpretability method in the literature. As GGIG is able to identify the contours of the discriminative regions in the input space, it may be useful in various medical-imaging tasks where fine-grained localization as an explanation is beneficial.
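
    GGIG itself is not reproduced here, but the baseline it is compared against, plain integrated gradients, has a compact standard form: average the gradients along the straight path from a baseline x' to the input x, and scale by (x - x'). The sketch below shows that computation on a toy linear model, where IG recovers the exact weight-times-input attribution.

        # Plain integrated gradients (the IG baseline), sketched in PyTorch.
        import torch

        def integrated_gradients(model, x, baseline, target, steps=50):
            """IG_i(x) ~= (x_i - x'_i) * mean of dF_target/dx_i along the path x' -> x."""
            alphas = torch.linspace(0, 1, steps).view(-1, *([1] * x.dim()))
            path = baseline + alphas * (x - baseline)      # (steps, *x.shape)
            path.requires_grad_(True)
            out = model(path)[:, target].sum()
            grads = torch.autograd.grad(out, path)[0]
            return (x - baseline) * grads.mean(dim=0)      # Riemann approximation

        # Toy usage: for a linear model, IG equals weight * input exactly.
        model = torch.nn.Linear(4, 3)
        x, baseline = torch.ones(4), torch.zeros(4)
        attr = integrated_gradients(model, x, baseline, target=1)
        print(torch.allclose(attr, model.weight[1] * x, atol=1e-5))   # True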

    Interpretable Convolutional Neural Networks for Decoding and Analyzing Neural Time Series Data

    Machine learning is widely adopted to decode multi-variate neural time series, including electroencephalographic (EEG) and single-cell recordings. Recent solutions based on deep learning (DL) have outperformed traditional decoders by automatically extracting relevant discriminative features from raw or minimally pre-processed signals. Convolutional Neural Networks (CNNs) have been successfully applied to EEG and are the most common DL-based EEG decoders in the state-of-the-art (SOA). However, current research is affected by some limitations. SOA CNNs for EEG decoding usually exploit deep and heavy structures, with the risk of overfitting small datasets, and architectures are often defined empirically. Furthermore, CNNs are mainly validated by designing within-subject decoders. Crucially, the automatically learned features mainly remain unexplored; conversely, interpreting these features may be of great value for using decoders also as analysis tools, highlighting in a data-driven way the neural signatures underlying the different decoded brain or behavioural states. Lastly, the SOA DL-based algorithms used to decode single-cell recordings rely on networks that are more complex, slower to train, and less interpretable than CNNs, and the use of CNNs with these signals had not been investigated. This PhD research addresses the previous limitations, with reference to P300 and motor decoding from EEG, and motor decoding from single-neuron activity. CNNs were designed to be light, compact, and interpretable. Moreover, multiple training strategies were adopted, including transfer learning, which could reduce training times and promote the application of CNNs in practice. Furthermore, CNN-based EEG analyses were proposed to study neural features in the spatial, temporal, and frequency domains, and proved to highlight and enhance relevant neural features related to P300 and motor states better than canonical EEG analyses. Remarkably, these analyses could be used, in perspective, to design novel EEG biomarkers for neurological or neurodevelopmental disorders. Lastly, CNNs were developed to decode single-neuron activity, providing a better compromise between performance and model complexity.
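
    The "light, compact, interpretable" design philosophy is commonly realized by factorizing an EEG CNN into a temporal convolution (interpretable as learned band-pass filters) followed by a spatial convolution across electrodes. The block below is a generic sketch in that spirit, with illustrative sizes; it is not the thesis's exact architecture.

        # Generic compact EEG CNN: temporal filtering, then spatial mixing.
        # All sizes are illustrative assumptions.
        import torch
        import torch.nn as nn

        n_channels, n_samples, n_classes = 32, 256, 2

        net = nn.Sequential(
            # Temporal filters applied to each electrode's time course.
            nn.Conv2d(1, 8, (1, 33), padding=(0, 16)), nn.BatchNorm2d(8), nn.ELU(),
            # Spatial filters mixing all electrodes at each time step.
            nn.Conv2d(8, 16, (n_channels, 1)), nn.BatchNorm2d(16), nn.ELU(),
            nn.AvgPool2d((1, 8)),
            nn.Flatten(), nn.Linear(16 * (n_samples // 8), n_classes),
        )

        eeg = torch.randn(4, 1, n_channels, n_samples)   # toy batch of 4 trials
        print(net(eeg).shape)                            # torch.Size([4, 2])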

    Unveiling Key Features: A Comparative Study of Machine Learning Models for Alzheimer's Detection

    This thesis rigorously evaluates the application of an array of natural language processing (NLP) techniques and machine learning models to identify linguistic signatures indicative of dementia, as sourced from the DementiaBank Pitt corpus. Utilizing a binary classification paradigm, the study integrates sophisticated embedding methods (Doc2Vec, Word2Vec, GloVe, and BERT) with traditional machine learning algorithms such as Random Forest, Multinomial Naïve Bayes, AdaBoost, k-nearest neighbours (KNN), and Logistic Regression, alongside deep learning architectures such as LSTM, Bi-LSTM, and CNN-LSTM. The efficacy of these methodologies is evaluated based on their capacity to differentiate between transcribed speech impacted by dementia and that from control subjects. To enhance interpretability, this research also employs feature-importance analysis through LIME, SHAP, permutation importance, and integrated gradients, shedding light on the variables most instrumental in driving model predictions. The results of this comprehensive analysis not only illuminate the robust potential of these combined NLP and machine learning approaches in the context of medical screening but also contribute valuable insights to the fields of NLP and dementia screening specifically.
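
    The embedding-plus-classifier-plus-feature-importance pattern the abstract describes can be shown end to end with a small stand-in: TF-IDF features with logistic regression, inspected via the model's coefficients. The toy transcripts below are invented, and TF-IDF substitutes here for the study's Doc2Vec/Word2Vec/GloVe/BERT embeddings and LIME/SHAP analyses.

        # Toy binary dementia-screening pipeline with coefficient-based inspection.
        # Data and the TF-IDF feature choice are illustrative assumptions.
        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        texts = ["the um the boy is uh taking the the cookie",       # hypothetical
                 "the boy reaches for the cookie jar on the shelf",
                 "she um she is uh washing washing the dishes",
                 "water overflows from the sink onto the floor"]
        labels = [1, 0, 1, 0]                                        # 1 = dementia group

        clf = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

        # Feature-importance-style inspection: the largest positive coefficients
        # are the tokens most indicative of the positive class in this toy model.
        vocab = clf[0].get_feature_names_out()
        coefs = clf[1].coef_[0]
        top = np.argsort(coefs)[-3:]
        print([(vocab[i], round(float(coefs[i]), 2)) for i in top])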