Predicting Acute Kidney Injury at Hospital Re-entry Using High-dimensional Electronic Health Record Data
Acute Kidney Injury (AKI), a sudden decline in kidney function, is associated
with increased mortality, morbidity, length of stay, and hospital cost. Since
AKI is sometimes preventable, there is great interest in prediction. Most
existing studies consider all patients and therefore restrict to features
available in the first hours of hospitalization. Here, the focus is instead on
rehospitalized patients, a cohort in which rich longitudinal features from
prior hospitalizations can be analyzed. Our objective is to provide a risk
score directly at hospital re-entry. Gradient boosting, penalized logistic
regression (with and without stability selection), and a recurrent neural
network are trained on two years of adult inpatient EHR data (3,387 attributes
for 34,505 patients who generated 90,013 training samples with 5,618 cases and
84,395 controls). Predictions are internally evaluated with 50 iterations of
5-fold grouped cross-validation with special emphasis on calibration, an
analysis of which is performed at the patient as well as hospitalization level.
Error is assessed with respect to diagnosis, race, age, gender, AKI
identification method, and hospital utilization. In an additional experiment,
the regularization penalty is severely increased to induce parsimony and
interpretability. Predictors identified for rehospitalized patients are also
reported with a special analysis of medications that might be modifiable risk
factors. Insights from this study might be used to construct a predictive tool
for AKI in rehospitalized patients. An accurate estimate of AKI risk at
hospital entry might serve as a prior for an admitting provider or another
predictive algorithm.
Comment: In revisio
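The evaluation scheme described above (50 iterations of 5-fold grouped cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the grouping variable is the patient, so that all hospitalizations of one patient fall on the same side of a split. The function names and the toy patient IDs are hypothetical.

```python
import random
from typing import Hashable, Iterator, List, Sequence, Tuple


def grouped_kfold(
    groups: Sequence[Hashable], n_splits: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) so that no group spans both sides."""
    unique = sorted(set(groups), key=str)
    if n_splits > len(unique):
        raise ValueError("n_splits cannot exceed the number of groups")
    rng = random.Random(seed)
    rng.shuffle(unique)
    # Round-robin assignment of the shuffled groups (patients) to folds.
    fold_of = {g: i % n_splits for i, g in enumerate(unique)}
    for fold in range(n_splits):
        test = [i for i, g in enumerate(groups) if fold_of[g] == fold]
        train = [i for i, g in enumerate(groups) if fold_of[g] != fold]
        yield train, test


def repeated_grouped_kfold(groups, n_splits=5, n_repeats=50, seed=0):
    """Repeat grouped k-fold with a different shuffle on each iteration."""
    for rep in range(n_repeats):
        for train, test in grouped_kfold(groups, n_splits, seed + rep):
            yield rep, train, test


# Toy data: one entry per hospitalization, labelled with a made-up patient ID.
patient_ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p5", "p6"]
splits = list(repeated_grouped_kfold(patient_ids, n_splits=5, n_repeats=2))
```

Grouping matters here because one patient can generate several training samples; splitting at the sample level would let the same patient appear in both training and test folds and could inflate the estimated performance.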
Visual Analytics of Electronic Health Records with a focus on Acute Kidney Injury
The increasing use of electronic platforms in healthcare has resulted in the generation of unprecedented amounts of data in recent years. The amount of data available to clinical researchers, physicians, and healthcare administrators continues to grow, which creates an untapped resource with the ability to improve the healthcare system drastically. Despite the enthusiasm for adopting electronic health records (EHRs), some recent studies have shown that EHR-based systems hardly improve the ability of healthcare providers to make better decisions. One reason for this inefficacy is that these systems do not allow for human-data interaction in a manner that fits and supports the needs of healthcare providers. Another reason is information overload, which often leads healthcare providers to misunderstand, misinterpret, ignore, or overlook vital data. Visual analytics (VA), an emerging type of computational system, has the potential to reduce the complexity of EHR data by combining advanced analytics techniques with interactive visualizations to analyze, synthesize, and facilitate high-level activities while allowing users to get more involved in a discourse with the data. The purpose of this research is to demonstrate the use of sophisticated visual analytics systems to solve various EHR-related research problems. This dissertation includes a framework by which we identify gaps in existing EHR-based systems and conceptualize the data-driven activities and tasks of our proposed systems. Two novel VA systems (VISA_M3R3 and VALENCIA) and two studies are designed to bridge the gaps. VISA_M3R3 incorporates multiple regression, frequent itemset mining, and interactive visualization to assist users in the identification of nephrotoxic medications.
Another proposed system, VALENCIA, brings together a wide range of dimension reduction and cluster analysis techniques to analyze high-dimensional EHRs, integrates them seamlessly, and makes them accessible through interactive visualizations. The studies are conducted to develop prediction models to classify patients who are at risk of developing acute kidney injury (AKI) and to identify AKI-associated medications and medication combinations using EHRs. Using healthcare administrative datasets stored at the ICES-KDT (Kidney Dialysis and Transplantation program), London, Ontario, we have demonstrated how our proposed systems and prediction models can be used to solve real-world problems.
Use of deep learning to develop continuous-risk models for adverse event prediction from electronic health records
Early prediction of patient outcomes is important for targeting preventive care. This protocol describes a practical workflow for developing deep-learning risk models that can predict various clinical and operational outcomes from structured electronic health record (EHR) data. The protocol comprises five main stages: formal problem definition, data pre-processing, architecture selection, calibration and uncertainty, and generalizability evaluation. We have applied the workflow to four endpoints (acute kidney injury, mortality, length of stay and 30-day hospital readmission). The workflow can enable continuous (e.g., triggered every 6 h) and static (e.g., triggered at 24 h after admission) predictions. We also provide an open-source codebase that illustrates some key principles in EHR modeling. This protocol can be used by interdisciplinary teams with programming and clinical expertise to build deep-learning prediction models with alternate data sources and prediction tasks.
Deep Risk Prediction and Embedding of Patient Data: Application to Acute Gastrointestinal Bleeding
Acute gastrointestinal bleeding is a common and costly condition, accounting for over 2.2 million hospital days and 19.2 billion dollars of medical charges annually. Risk stratification is a critical part of initial assessment of patients with acute gastrointestinal bleeding. Although all national and international guidelines recommend the use of risk-assessment scoring systems, they are not commonly used in practice, have sub-optimal performance, may be applied incorrectly, and are not easily updated. With the advent of widespread electronic health record adoption, longitudinal clinical data captured during the clinical encounter is now available. However, this data is often noisy, sparse, and heterogeneous. Unsupervised machine learning algorithms may be able to identify structure within electronic health record data while accounting for key issues with the data generation process: measurements missing-not-at-random and information captured in unstructured clinical note text. Deep learning tools can create electronic health record-based models that perform better than clinical risk scores for gastrointestinal bleeding and are well-suited for learning from new data. Furthermore, these models can be used to predict risk trajectories over time, leveraging the longitudinal nature of the electronic health record. The foundation of creating relevant tools is the definition of a relevant outcome measure; in acute gastrointestinal bleeding, a composite outcome of red blood cell transfusion, hemostatic intervention, and all-cause 30-day mortality is a relevant, actionable outcome that reflects the need for hospital-based intervention. However, epidemiological trends may affect the relevance and effectiveness of the outcome measure when applied across multiple settings and patient populations. 
Understanding the trends in practice, potential areas of disparities, and value proposition for using risk stratification in patients presenting to the Emergency Department with acute gastrointestinal bleeding is important in understanding how to best implement a robust, generalizable risk stratification tool. Key findings include a decrease in the rate of red blood cell transfusion since 2014 and disparities in access to upper endoscopy for patients with upper gastrointestinal bleeding by race/ethnicity across urban and rural hospitals. Projected accumulated savings of consistent implementation of risk stratification tools for upper gastrointestinal bleeding total approximately $1 billion 5 years after implementation. Most current risk scores were designed for use based on the location of the bleeding source: upper or lower gastrointestinal tract. However, the location of the bleeding source is not always clear at presentation. I develop and validate electronic health record based deep learning and machine learning tools for patients presenting with symptoms of acute gastrointestinal bleeding (e.g., hematemesis, melena, hematochezia), which is more relevant and useful in clinical practice. I show that they outperform leading clinical risk scores for upper and lower gastrointestinal bleeding, the Glasgow Blatchford Score and the Oakland score. While the best performing gradient boosted decision tree model has equivalent overall performance to the fully connected feedforward neural network model, at the very low risk threshold of 99% sensitivity the deep learning model identifies more very low risk patients. Using another deep learning model that can model longitudinal risk, the long-short-term memory recurrent neural network, need for transfusion of red blood cells can be predicted at every 4-hour interval in the first 24 hours of intensive care unit stay for high risk patients with acute gastrointestinal bleeding. 
Finally, for implementation it is important to find patients with symptoms of acute gastrointestinal bleeding in real time and characterize patients by risk using available data in the electronic health record. A decision rule-based electronic health record phenotype has equivalent performance as measured by positive predictive value compared to deep learning and natural language processing-based models, and after live implementation appears to have increased the use of the Acute Gastrointestinal Bleeding Clinical Care pathway. Patients with acute gastrointestinal bleeding but with other groups of disease concepts can be differentiated by directly mapping unstructured clinical text to a common ontology and treating the vector of concepts as signals on a knowledge graph; these patients can be differentiated using unbalanced diffusion earth mover’s distances on the graph. For electronic health record data with data missing not at random, MURAL, an unsupervised random forest-based method, handles data with missing values and generates visualizations that characterize patients with gastrointestinal bleeding. This thesis forms a basis for understanding the potential for machine learning and deep learning tools to characterize risk for patients with acute gastrointestinal bleeding. In the future, these tools may be critical in implementing integrated risk assessment to keep low risk patients out of the hospital and guide resuscitation and timely endoscopic procedures for patients at higher risk for clinical decompensation.
Machine Learning Framework for Real-World Electronic Health Records Regarding Missingness, Interpretability, and Fairness
Machine learning (ML) and deep learning (DL) techniques have shown promising results in healthcare applications using Electronic Health Records (EHRs) data. However, their adoption in real-world healthcare settings is hindered by three major challenges. Firstly, real-world EHR data typically contains numerous missing values. Secondly, traditional ML/DL models are typically considered black-boxes, whereas interpretability is required for real-world healthcare applications. Finally, differences in data distributions may lead to unfairness and performance disparities, particularly in subpopulations.
This dissertation proposes methods to address missing data, interpretability, and fairness issues. The first work proposes an ensemble prediction framework for EHR data with large missing rates using multiple subsets with lower missing rates. The second method integrates medical knowledge graphs and a double attention mechanism with the long short-term memory (LSTM) model to enhance interpretability by providing knowledge-based model interpretation. The third method develops an LSTM variant that integrates medical knowledge graphs and additional time-aware gates to handle multi-variable temporal missingness and interpretability concerns. Finally, a transformer-based model is proposed to learn unbiased and fair representations of diverse subpopulations using domain classifiers and three attention mechanisms.
Prognostic prediction models using Self-Attention for ICU patients developing acute kidney injury
Master's thesis, Data Science, Universidade de Lisboa, Faculdade de Ciências, 2022.
The general growth of and improved access to electronic health records demand a comparable level of
progress from the research community on clinical models. Machine learning
techniques are key to this development, and so they are increasingly being applied to large medical databases
with the purpose of creating solutions that work for specific patients, no matter the task or the disease.
Acute kidney injury (AKI) is a broad condition defined by abrupt changes in renal function. AKI carries
high morbidity and mortality, particularly among critically ill patients. The main goal of this
thesis is to study the development of AKI within a patient’s stay in the intensive care unit (ICU).
Data from the MIMIC-III database was used to collect information regarding the patients. After
detailed exclusion criteria were applied, the patients were evaluated in terms of AKI stages, with the purpose of predicting the
AKI stage one hour after the end of the sequence of information fed to the model. This can indicate the
capacity of the model to predict the aggravation of a patient’s AKI condition. The sequences contain
hourly information for every feature, and sequences of 6 h, 12 h and 24 h were used. Self-attention
mechanisms were used to make the predictions, using an adaptation for multivariate time series of
models used successfully in natural language processing (NLP) tasks.
The predictions in this work were made for two variations of the KDIGO classification system: one
where only the serum creatinine (SCr) criterion was taken into account to determine the patient’s AKI
stage, and another where both SCr and urine output (UO) were considered. While most works addressing
AKI tend to use only SCr values to determine the patient’s AKI condition, the results were compared
using both approaches and were better when using both SCr and UO. For those experiments, the model
achieved up to 68.05% accuracy predicting an episode of AKI, compared to the 66.67% accuracy achieved
using only SCr values, which outperformed state-of-the-art results for both cases.
Feature importance was also computed for each dataset associated with the two variations of the KDIGO
classification system to identify the most important features. Furthermore, final results were
compared when using all features versus only the 10 most important ones.
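The sequence construction described in this abstract (hourly features over 6 h, 12 h or 24 h, with the AKI stage one hour after the sequence as the target) can be sketched as below. This is only an illustration under stated assumptions, not the thesis code: the function name `make_windows`, the overlapping one-hour stride, and the toy feature and stage values are all hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[List[float]], int]


def make_windows(
    features: Sequence[Sequence[float]], stages: Sequence[int], window: int
) -> List[Sample]:
    """Pair each `window`-hour block of hourly features with the AKI stage
    recorded one hour after the block ends."""
    if len(features) != len(stages):
        raise ValueError("features and stages must have one entry per hour")
    samples: List[Sample] = []
    # Assumed stride of one hour, so consecutive windows overlap.
    for start in range(len(features) - window):
        block = [list(row) for row in features[start:start + window]]
        target = stages[start + window]  # first hour after the window
        samples.append((block, target))
    return samples


# Toy ICU stay: 30 hours, two made-up features per hour, made-up AKI stages.
hourly_features = [[float(hour), 1.0] for hour in range(30)]
hourly_stages = [0] * 10 + [1] * 10 + [2] * 10

# Number of samples obtained for each sequence length named in the abstract.
counts = {w: len(make_windows(hourly_features, hourly_stages, w)) for w in (6, 12, 24)}
```

Longer windows give the model more history per sample but yield fewer samples per stay, which is one trade-off to keep in mind when comparing the three sequence lengths.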
Integrating Real Time Data to Improve Outcomes in Acute Kidney Injury
Critically ill patients with acute kidney injury requiring renal replacement therapy have a poor prognosis. Although the factors that contribute to outcomes, including dose delivery, are well known, patients frequently miss the target dose and volume removal. One major barrier to effective care of these patients is the traditional dissociation of dialysis device data from other clinical information systems, notably the electronic health record (EHR). This lack of integration and the resulting manual documentation lead to errors and biases in documentation and missed opportunities to intervene in a timely fashion. This review summarizes the technological advancements facilitating direct connection of dialysis devices to EHRs. This connection facilitates automated data capture of many variables, including delivered dose, ultrafiltration rate and pressure measurements, which in turn can be leveraged for data mining, quality improvement and real-time targeted therapy adjustments. These interventions hold the promise to significantly improve outcomes for this patient population.
Utilizing electronic health records to predict acute kidney injury risk and outcomes: Workgroup statements from the 15th ADQI Consensus Conference
The data contained within the electronic health record (EHR) is "big" from the standpoint of volume, velocity, and variety. These circumstances and the pervasive trend towards EHR adoption have sparked interest in applying big data predictive analytic techniques to EHR data. Acute kidney injury (AKI) is a condition well suited to prediction and risk forecasting; not only does the consensus definition for AKI allow temporal anchoring of events, but no treatments exist once AKI develops, underscoring the importance of early identification and prevention. The Acute Dialysis Quality Initiative (ADQI) convened a group of key opinion leaders and stakeholders to consider how best to approach AKI research and care in the "Big Data" era. This manuscript addresses the core elements of AKI risk prediction and outlines potential pathways and processes. We describe AKI prediction targets, feature selection, model development, and data display.