939 research outputs found

HEALTH OUTCOME PATHWAY PREDICTION. A GRAPH-BASED FRAMEWORK

Author: Cardoso Miguel Simão Dórdio
Publication date: 01/02/2022

This dissertation is part of the project FrailCare.AI, which aims to detect frailty in the elderly Portuguese population in order to optimize the SNS24 (telemonitoring) service, with the goal of suggesting health pathways to reduce patients' frailty. Frailty can be defined as the condition of being weak and delicate, which normally increases with age and is the consequence of several health and non-health related factors. A patient's health journey is recorded in Electronic Health Records (EHRs), which are rich but sparse, noisy, and multi-modal sources of truth. These can be used to train predictive models to predict future health states, of which frailty is just one. In this work, due to lack of data access, we pivoted our focus to phenotype prediction, that is, predicting diagnoses. Moreover, we tackle the problem of data insufficiency and class imbalance (e.g. rare diseases and other infrequent occurrences in the training data) by integrating standardized healthcare ontologies within graph neural networks. We study the broad task of phenotype prediction, multi-task scenarios, and few-shot scenarios, in which a class rarely occurs in the training set. Furthermore, during the development of this work we detected some reproducibility issues in related literature, which we detail, and we also open-source all of our implementations, introducing a framework to aid the development of similar systems.

This dissertation is part of the FrailCare.AI project, which aims to detect frailty in the elderly Portuguese population in order to optimize the telemonitoring service of the Portuguese National Health Service (SNS24), and also to suggest actions to take to reduce patients' frailty. Frailty is a risk condition composed of multiple factors. Nowadays, a large part of each patient's clinical history is recorded digitally. These diverse and vast data can be used to train predictive models whose goal is to predict future health states, of which frailty is only one. Due to the lack of access to data, we changed the main task of this work to diagnosis prediction, where we explore the problem of data insufficiency and class imbalance (for example, rare diseases and other infrequent occurrences in the training data) by integrating ontologies of medical concepts through graph neural networks. We also explore other tasks and the impact they have on one another. In addition, during the development of this dissertation we identified reproducibility issues in the literature studied, which we detail, and we implement the missing concepts. With reproducibility in mind, we release our code, introducing a library that makes it possible to develop systems similar to ours.

Bidirectional Representation Learning from Transformers using Multimodal Electronic Health Record Data to Predict Depression

Author: Arnold Corey W.
Meng Yiwen
Ong Michael K.
Speier William
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/03/2021

Advancements in machine learning algorithms have had a beneficial impact on representation learning, classification, and prediction models built using electronic health record (EHR) data. Effort has gone into both increasing models' overall performance and improving their interpretability, particularly regarding the decision-making process. In this study, we present a temporal deep learning model to perform bidirectional representation learning on EHR sequences with a transformer architecture to predict future diagnosis of depression. This model is able to aggregate five heterogeneous and high-dimensional data sources from the EHR and process them in a temporal manner for chronic disease prediction at various prediction windows. We applied the current trend of pretraining and fine-tuning on EHR data to outperform the current state-of-the-art in chronic disease prediction, and to demonstrate the underlying relation between EHR codes in the sequence. The model achieved the largest increase in precision-recall area under the curve (PRAUC), from 0.70 to 0.76, in depression prediction compared to the best baseline model. Furthermore, the self-attention weights in each sequence quantitatively demonstrated the inner relationship between various codes, which improved the model's interpretability. These results demonstrate the model's ability to utilize heterogeneous EHR data to predict depression while achieving high accuracy and interpretability, which may facilitate constructing clinical decision support systems in the future for chronic disease screening and early detection.

Comment: in IEEE Journal of Biomedical and Health Informatics (2021)
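
For readers unfamiliar with the PRAUC metric reported in the abstract above, the following is a minimal sketch of one common way to compute it: the step-wise area under the precision-recall curve, often called average precision. The function name `pr_auc` and the toy labels and scores are illustrative assumptions and do not come from the paper; the authors may have used a different interpolation or library routine.

```python
def pr_auc(y_true, scores):
    """Step-wise area under the precision-recall curve (average precision).

    y_true: list of 0/1 labels; scores: list of predicted scores.
    Assumption: tied scores are not treated specially in this sketch.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank examples from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_pos = 0
    area = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_pos += 1
            # Precision at the rank where recall increases by 1 / total_pos.
            area += true_pos / rank
    return area / total_pos


# Toy data (hypothetical, not from the paper).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(pr_auc(labels, scores))
```

Because this metric averages precision over the positive examples only, it is usually more informative than ROC AUC when the positive class is rare, which is typical of disease prediction tasks.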

arXiv.org e-Print Archive

eScholarship - University of California