Explainable Graph Neural Network for Alzheimer's Disease And Related Dementias Risk Prediction
Alzheimer's disease and related dementias (ADRD) ranks as the sixth leading
cause of death in the US, underlining the importance of accurate ADRD risk
prediction. While recent advances in ADRD risk prediction have primarily
relied on imaging analysis, not all patients undergo medical imaging before
an ADRD diagnosis. Merging machine learning with claims data can reveal
additional risk factors and uncover interconnections among diverse medical
codes. Our goal is to utilize Graph Neural Networks (GNNs) with claims data for
ADRD risk prediction. Addressing the lack of human-interpretable reasons behind
these predictions, we introduce an innovative method to evaluate relationship
importance and its influence on ADRD risk prediction, ensuring comprehensive
interpretation.
We employed a Variationally Regularized Encoder-decoder Graph Neural Network
(VGNN) for estimating ADRD likelihood. We created three scenarios to assess the
model's efficiency, using Random Forest and Light Gradient Boosting Machine as
baselines. We further used our relation importance method to clarify the key
relationships for ADRD risk prediction. VGNN surpassed the baseline models by
10% in the area under the receiver operating characteristic curve. The integration of
the GNN model and relation importance interpretation could potentially play an
essential role in providing valuable insight into factors that may contribute
to or delay ADRD progression.
Employing a GNN approach with claims data enhances ADRD risk prediction and
provides insights into the impact of interconnected medical code relationships.
This methodology not only enables ADRD risk modeling but also shows potential
for other image analysis predictions using claims data.
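The abstract above compares models by the area under the receiver operating characteristic curve (AUROC). As a minimal sketch of what that metric measures, the snippet below computes AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The function name `auroc`, the labels, and both score lists are illustrative assumptions, not data or code from the paper.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case has the higher score (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = later ADRD diagnosis, 0 = no diagnosis.
labels = [1, 0, 1, 0, 1, 0]
model_a = [0.9, 0.2, 0.8, 0.4, 0.3, 0.1]  # hypothetical risk scores
model_b = [0.6, 0.5, 0.4, 0.7, 0.3, 0.2]  # hypothetical baseline scores

print(f"model A AUROC: {auroc(labels, model_a):.3f}")
print(f"model B AUROC: {auroc(labels, model_b):.3f}")
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; for large cohorts a rank-based implementation would be the usual choice.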
Advancing Biomedicine with Graph Representation Learning: Recent Progress, Challenges, and Future Directions
Graph representation learning (GRL) has emerged as a pivotal field that has
contributed significantly to breakthroughs in various fields, including
biomedicine. The objective of this survey is to review the latest advancements
in GRL methods and their applications in the biomedical field. We also
highlight key challenges currently faced by GRL and outline potential
directions for future research.
Comment: Accepted by 2023 IMIA Yearbook of Medical Informatics
Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications
Multimodality Representation Learning, as a technique of learning to embed
information from different modalities and their correlations, has achieved
remarkable success on a variety of applications, such as Visual Question
Answering (VQA), Natural Language for Visual Reasoning (NLVR), and Vision
Language Retrieval (VLR). Among these applications, cross-modal interaction and
complementary information from different modalities are crucial for advanced
models to perform any multimodal task, e.g., understand, recognize, retrieve,
or generate optimally. Researchers have proposed diverse methods to address
these tasks. Variants of transformer-based architectures have
performed exceptionally well on multiple modalities. This survey presents the
comprehensive literature on the evolution and enhancement of deep learning
multimodal architectures to deal with textual, visual and audio features for
diverse cross-modal and modern multimodal tasks. This study summarizes (i)
recent task-specific deep learning methodologies, (ii) pretraining types
and multimodal pretraining objectives, (iii) the progression from state-of-the-art
pretrained multimodal approaches to unifying architectures, and (iv) multimodal task
categories and possible future improvements that can be devised for better
multimodal learning. Moreover, we prepare a dataset section for new researchers
that covers most of the benchmarks for pretraining and finetuning. Finally,
major challenges, gaps, and potential research topics are explored. A
continuously updated paper list related to our survey is maintained at
https://github.com/marslanm/multimodality-representation-learning
Representation learning for electronic health records
The growing size and complexity of data collected in clinical practice necessitate the development of efficient methods for discovering the knowledge they contain. We examine the performance of four groups of models of increasing complexity for extracting and utilizing useful knowledge from electronic health records. The models were trained using unstructured clinical text, the relational dataset converted into a document-like form with Wordification, and both types of data simultaneously.
We evaluate the models on the task of predicting important clinical events using the reference MIMIC-III collection of electronic health records. We start by evaluating the models based on document classification and aggregated word embeddings. The results serve as the baseline for evaluating models of higher complexity. We next evaluate a model based on convolutional neural networks and a model based on the BERT architecture. Finally, we evaluate ensembles of best-performing models from the previous groups that aggregate the knowledge extracted from clinical text and results of Wordification. The results suggest that models trained using the results of Wordification can compete with models trained using the better-studied approach of utilizing clinical text. Ensemble models that simultaneously exploit both data types are the best performers based on the metrics used.
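The abstract above describes Wordification as converting a relational dataset into a document-like form. A minimal sketch of that idea follows, assuming each row of a table is turned into "words" of the form table_attribute_value and that all words belonging to one patient are concatenated into one document. The table names, column names, and helper functions (`wordify`, `patient_document`) are hypothetical and do not reflect the MIMIC-III schema or the author's implementation; this sketch also omits refinements such as word weighting.

```python
def wordify(table_name, row, skip=()):
    """Turn one relational row into table_attribute_value tokens."""
    words = []
    for attr, value in row.items():
        if attr in skip:
            continue
        token = f"{table_name}_{attr}_{value}".lower().replace(" ", "-")
        words.append(token)
    return words


def patient_document(tables, patient_id, key="patient_id"):
    """Concatenate the tokens of every row that belongs to one patient."""
    words = []
    for table_name, rows in tables.items():
        for row in rows:
            if row.get(key) == patient_id:
                words.extend(wordify(table_name, row, skip=(key,)))
    return " ".join(words)


# Hypothetical toy database with two tables keyed by patient_id.
tables = {
    "admissions": [
        {"patient_id": 1, "type": "Emergency"},
        {"patient_id": 2, "type": "Elective"},
    ],
    "labs": [
        {"patient_id": 1, "item": "Glucose", "flag": "Abnormal"},
    ],
}

print(patient_document(tables, 1))
```

The resulting string can be fed to any text classifier in the same way as a clinical note, which is what makes the two data types easy to combine in an ensemble.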