10 research outputs found

    Latent variable modelling of population neuroimaging and behavioural data

    Neuroimaging has attracted much interest in recent years due to advances in Magnetic Resonance Imaging (MRI) technology and data acquisition techniques. This has led to growing interest in work that links neuroscience to behavioural research, using neuroimaging data to reveal the interplay between brain and behaviour. Latent variable models are popular tools for investigating such relationships, with many studies exploring links between functional MRI and various behavioural and demographic measures. However, a common challenge is the interpretability of latent variable models, in particular in their application to large datasets with thousands of variables. In this thesis, we first introduced the basic concepts in neuroimaging and the challenges faced when linking it to behaviour. We then introduced the background methods applied in the thesis, including latent variable models, predictive models and several widely used data processing techniques. The discussion focused on clarifying easily confused and misused concepts, the theory and application of some rarely used model extensions, and the demonstration of cross-validation in chained latent variable models. Many of these notes, to our knowledge, have not been discussed elsewhere. One of the main contributions of this thesis is the proposal of a dimension reduction method, Supervised Dimension Reduction, which aims to improve the interpretation of latent variable models, especially when chaining multiple models together. We applied Supervised Dimension Reduction together with other latent variable models to the Human Connectome Project and the UK Biobank to study the relationships between neuroimaging and behavioural data, revealing many interesting patterns between brain and behaviour. Moreover, we further clarified the interpretation of a commonly applied latent variable model, Canonical Correlation Analysis, in particular its multi-view extensions and their applications in brain-behaviour studies. Finally, we attempted to use functional MRI to predict a specific behavioural measure, personality; however, no results were significant under the analysis pipeline we applied.
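One background point raised above is cross-validation in chained latent variable models. As an illustration only (not code from the thesis), the sketch below shows the key rule for such chains: every stage (here a PCA-style reduction) is fitted on the training fold alone and then applied unchanged to the held-out fold, so no information leaks from test to train.

```python
import numpy as np

def pca_fit(X, k):
    mu = X.mean(axis=0)
    # principal axes from the SVD of the centred training data
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def pca_apply(X, mu, components):
    return (X - mu) @ components.T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
n_folds = 5
fold = np.arange(len(X)) % n_folds
for f in range(n_folds):
    train, test = X[fold != f], X[fold == f]
    mu, comps = pca_fit(train, k=3)        # fit on the training fold only
    Z_train = pca_apply(train, mu, comps)  # reduced training data
    Z_test = pca_apply(test, mu, comps)    # same mapping, never refitted
```

Any further stage in the chain (e.g. a CCA on the reduced data) would follow the same fit-on-train, apply-to-test pattern.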

    Intelligent electrocardiogram acquisition via ubiquitous photoplethysmography monitoring

    Recent advances in machine learning, particularly deep neural network architectures, have shown substantial promise in classifying and predicting cardiac abnormalities from electrocardiogram (ECG) data. Such data are rich in information content, typically in morphology and timing, due to the close correlation between cardiac function and the ECG. However, the ECG is usually not measured ubiquitously in a passive manner from consumer devices, and generally requires ‘active’ sampling whereby the user prompts a device to take an ECG measurement. Conversely, photoplethysmography (PPG) data are typically measured passively by consumer devices, and are therefore available for long-period monitoring and of suitable duration for identifying transient cardiac events. However, classifying or predicting cardiac abnormalities from the PPG is very difficult, because it is a peripherally measured signal. Hence, the use of the PPG for predictive inference is often limited to deriving physiological parameters (heart rate, breathing rate, etc.) or to obvious abnormalities in cardiac timing, such as atrial fibrillation/flutter (“palpitations”). This work aims to combine the best of both worlds: using continuously monitored, near-ubiquitous PPG to identify periods of sufficient abnormality in the PPG such that prompting the user to take an ECG would be informative of cardiac risk. We propose a dual-convolutional-attention network (DCA-Net) to achieve this ECG-based PPG classification. With DCA-Net, we demonstrate the plausibility of this concept on the MIMIC Waveform Database with a high performance level (AUROC > 0.9 and AUPRC > 0.7) and obtain satisfactory results when testing the model on an independent dataset (AUROC > 0.7 and AUPRC > 0.6) that is not perfectly matched to the MIMIC dataset.
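For context on the reported figures, AUROC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A minimal pure-Python illustration (not the paper's evaluation code):

```python
# Pairwise (Mann-Whitney) formulation of AUROC: count the fraction of
# positive/negative pairs where the positive is scored higher; ties
# count as half a win.

def auroc(labels, scores):
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUROC of 0.5 corresponds to random scoring and 1.0 to perfect ranking; the "> 0.9" above therefore indicates a strong ranking of abnormal over normal PPG periods.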

    Patient clustering for vital organ failure using ICD code with graph attention

    Objective: Heart failure, respiratory failure and kidney failure are three severe organ failures (OF) with high mortality that are most prevalent in intensive care units. The objective of this work is to offer insights into OF clustering from the perspectives of graph neural networks and diagnosis history. Methods: This paper proposes a neural network-based pipeline to cluster three types of organ failure patients, incorporating embedding pre-training using an ontology graph of International Classification of Diseases (ICD) codes. We employ an autoencoder-based deep clustering architecture jointly trained with a K-means loss, and non-linear dimension reduction is performed to obtain patient clusters on the MIMIC-III dataset. Results: The clustering pipeline shows superior performance on a public-domain image dataset. On the MIMIC-III dataset, it discovers two distinct clusters that exhibit different comorbidity spectra, which can be related to disease severity. The proposed pipeline is compared with several other clustering models and outperforms them. Conclusion: Our proposed pipeline gives stable clusters; however, these do not correspond to the type of OF, indicating that the three types of OF share significant hidden characteristics in diagnosis history. The clusters can be used to signal possible complications and severity of illness, and to aid personalised treatment. Significance: We are the first to apply an unsupervised approach to offer insights on these three types of organ failure from a biomedical engineering perspective, and we publish the pre-trained embeddings for future transfer learning.
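As background for the deep-clustering objective described above, the sketch below shows only the plain K-means step (Lloyd's algorithm) in isolation on raw vectors; in the paper this loss is trained jointly with an autoencoder in the latent space, which is not reproduced here.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    # initialise centres from k distinct data points
    centres = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # assignment step: each point joins its nearest centre
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: move each centre to the mean of its members
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return labels, centres
```

In the joint setting, the squared distance of each latent vector to its assigned centre is added to the autoencoder's reconstruction loss, encouraging a latent space that is both faithful and cluster-friendly.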

    Improved interpretability of brain-behavior CCA with Domain-driven Dimension Reduction

    Canonical Correlation Analysis (CCA) has been widely applied to study correlations between neuroimaging data and behavioral data. Practical use of CCA typically requires dimensionality reduction with, for example, Principal Components Analysis (PCA); however, this can result in CCA components that are difficult to interpret. In this paper, we introduce a Domain-driven Dimension Reduction (DDR) method that reduces the dimensionality of the original datasets while incorporating human knowledge of the structure of the variables studied. We apply the method to the Human Connectome Project S1200 release and compare standard PCA across all variables with DDR applied to individual classes of variables, finding that DDR-CCA results are more stable and interpretable, allowing the contribution of each class of variables to be better understood. By carefully designing the analysis pipeline and cross-validating the results, we offer further insights into the interpretation of CCA applied to brain-behavior data.
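For reference, here is a minimal NumPy sketch of classical CCA, the model the paper builds on (not the DDR variant itself): it finds projection vectors a and b maximising the correlation between Xa and Yb via an SVD of the whitened cross-covariance. The ridge term `reg` is an illustrative addition for numerical stability.

```python
import numpy as np

def cca_first_pair(X, Y, reg=1e-6):
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n

    def inv_sqrt(S):
        # inverse matrix square root via eigendecomposition
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    # SVD of the whitened cross-covariance gives the canonical directions
    K = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(K)
    a = inv_sqrt(Sxx) @ U[:, 0]
    b = inv_sqrt(Syy) @ Vt[0]
    return a, b, s[0]  # s[0] is the first canonical correlation
```

The interpretability problem the paper addresses arises because, after a PCA pre-step, a and b weight principal components rather than the original variables; DDR instead reduces within domain-defined classes of variables so the weights remain attributable.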

    A medical multimodal large language model for future pandemics

    Deep neural networks have been integrated into the whole clinical decision procedure, improving the efficiency of diagnosis and alleviating the heavy workload of physicians. Since most neural networks are supervised, their performance heavily depends on the volume and quality of available labels. However, few such labels exist for rare diseases (e.g., new pandemics). Here we report a medical multimodal large language model (Med-MLLM) for radiograph representation learning, which can learn broad medical knowledge (e.g., image understanding, text semantics, and clinical phenotypes) from unlabelled data. As a result, when encountering a rare disease, our Med-MLLM can be rapidly deployed and easily adapted to it with limited labels. Furthermore, our model supports medical data across the visual modality (e.g., chest X-ray and CT) and the textual modality (e.g., medical reports and free-text clinical notes); therefore, it can be used for clinical tasks that involve both visual and textual data. We demonstrate the effectiveness of our Med-MLLM by showing how it would have performed during the COVID-19 pandemic “in replay”. In the retrospective setting, we test the model on early COVID-19 datasets; in the prospective setting, we test it on the new COVID-19-Omicron variant. The experiments cover 1) three kinds of input data; 2) three kinds of downstream tasks, namely disease reporting, diagnosis, and prognosis; 3) five COVID-19 datasets; and 4) three languages: English, Chinese, and Spanish. All experiments show that our model can provide accurate and robust COVID-19 decision support with little labelled data.

    DuKA: A dual-keyless-attention model for multi-modality EHR data fusion and organ failure prediction

    Objective: Organ failure is a leading cause of mortality in hospitals, particularly in intensive care units. Predicting organ failure is crucial for clinical and social reasons. This study proposes a dual-keyless-attention (DuKA) model that enables interpretable predictions of organ failure using electronic health record (EHR) data. Methods: Three modalities of medical data from the EHR, namely diagnoses, procedures, and medications, are selected to predict three types of vital organ failure: heart failure, respiratory failure, and kidney failure. DuKA utilizes pre-trained embeddings of medical codes and combines them using a modality-wise attention module and a medical-concept-wise attention module to enhance interpretation. Three organ failure tasks are addressed using two datasets to verify the effectiveness of DuKA. Results: The proposed multi-modality DuKA model outperforms all reference and baseline models. Diagnosis history, particularly the presence of cachexia and previous organ failure, emerges as the most influential feature in organ failure prediction. Conclusions: DuKA offers competitive performance, straightforward model interpretation and flexibility in terms of input sources, as the input embeddings can be trained using different datasets and methods. Significance: DuKA is a lightweight model that innovatively uses dual attention in a hierarchical way to fuse diagnosis, procedure and medication information for organ failure prediction. It also enhances disease comprehension and supports personalised treatment.
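DuKA's exact architecture is not given in this abstract, but "keyless" additive attention in general can be sketched as below: each modality embedding receives a scalar score, the scores are softmax-normalised, and the fused representation is the attention-weighted sum. The parameters W and v are hypothetical learned weights, fixed here only for illustration.

```python
import numpy as np

def keyless_attention(E, W, v):
    # E: (n_modalities, d) stacked modality embeddings,
    # e.g. diagnosis, procedure and medication vectors
    scores = np.tanh(E @ W) @ v          # one scalar score per modality
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()          # softmax attention weights
    fused = alpha @ E                    # attention-weighted sum
    return fused, alpha
```

The weights `alpha` are what make such a model interpretable: they show how much each modality (or, at the lower level, each medical concept) contributed to a given prediction.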

    A Deep Learning Approach for the Assessment of Signal Quality of Non-Invasive Foetal Electrocardiography

    Non-invasive foetal electrocardiography (NI-FECG) has become an important prenatal monitoring method in the hospital. However, due to its susceptibility to non-stationary noise sources and the lack of robust extraction methods, the capture of high-quality NI-FECG remains a challenge. Recording waveforms of sufficient quality for clinical use typically requires human visual inspection of each recording. A Signal Quality Index (SQI) can help to automate this task but, in contrast to the adult ECG, work on SQIs for NI-FECG is sparse. In this paper, a multi-channel signal quality classifier for NI-FECG waveforms is presented. The model can be used during the capture of NI-FECG to assist technicians in recording high-quality waveforms, which is currently a labour-intensive task. A Convolutional Neural Network (CNN) is trained to distinguish between NI-FECG segments of high and low quality. NI-FECG recordings with one maternal channel and three abdominal channels were collected from 100 subjects during routine hospital screening (102.6 min of data). The model achieves an average 10-fold cross-validated AUC of 0.95 ± 0.02. The results show that the model can reliably assess FECG signal quality on our dataset. The proposed model can improve the automated capture and analysis of NI-FECG and reduce technician labour time.
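The 10-fold cross-validated AUC above follows the standard K-fold protocol: the recordings are split into K disjoint folds, and each fold is held out once while the model is trained on the remaining nine. A minimal index-splitting sketch (illustrative, not the paper's code):

```python
# Build K train/test index splits; every example appears in exactly
# one test fold, and the reported metric is averaged over the K folds.

def k_fold_indices(n, k=10):
    folds = [list(range(f, n, k)) for f in range(k)]
    splits = []
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        splits.append((train, test))
    return splits
```

For subject-level data such as these recordings, the split would in practice be made per subject rather than per segment, so no subject contributes to both train and test.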

    Uncertainties in the analysis of heart rate variability: a systematic review

    Heart rate variability (HRV) is an important metric with a variety of applications in clinical situations such as cardiovascular disease, diabetes mellitus, and mental health. HRV data can be obtained from electrocardiography and photoplethysmography signals, after which computational techniques such as signal filtering and data segmentation are used to process the sampled data and calculate HRV measures. However, uncertainties arising from data acquisition, computational models, and physiological factors can degrade signal quality and affect HRV analysis. It is therefore crucial to address these uncertainties and develop advanced models for HRV analysis. Although several reviews of HRV analysis exist, they primarily focus on clinical applications, trends in HRV methods, or specific aspects of uncertainty such as measurement noise. This paper provides a comprehensive review of uncertainties in HRV analysis, quantifies their impacts, and outlines potential solutions. To the best of our knowledge, this is the first study to present a holistic review of uncertainties in HRV methods and quantify their impacts on HRV measures from an engineer's perspective. This review is essential for developing robust and reliable models, and could serve as a valuable reference in the field, particularly for dealing with uncertainties in HRV analysis.
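As a concrete example of the HRV measures the review concerns, two standard time-domain metrics can be computed from a series of RR intervals (in milliseconds) as follows; this is a generic illustration, not code from the review.

```python
import math

# SDNN: standard deviation of RR intervals (overall variability).
# RMSSD: root mean square of successive differences (short-term,
# beat-to-beat variability).

def sdnn(rr):
    mu = sum(rr) / len(rr)
    return math.sqrt(sum((x - mu) ** 2 for x in rr) / len(rr))

def rmssd(rr):
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

Both measures are sensitive to the uncertainties discussed above: a single missed or spurious beat introduces large successive differences and can inflate RMSSD far more than SDNN, which is one reason beat-detection quality matters for HRV analysis.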
