4,020 research outputs found

    Predicting Acute Kidney Injury at Hospital Re-entry Using High-dimensional Electronic Health Record Data

    Acute Kidney Injury (AKI), a sudden decline in kidney function, is associated with increased mortality, morbidity, length of stay, and hospital cost. Since AKI is sometimes preventable, there is great interest in prediction. Most existing studies consider all patients and therefore restrict to features available in the first hours of hospitalization. Here, the focus is instead on rehospitalized patients, a cohort in which rich longitudinal features from prior hospitalizations can be analyzed. Our objective is to provide a risk score directly at hospital re-entry. Gradient boosting, penalized logistic regression (with and without stability selection), and a recurrent neural network are trained on two years of adult inpatient EHR data (3,387 attributes for 34,505 patients who generated 90,013 training samples with 5,618 cases and 84,395 controls). Predictions are internally evaluated with 50 iterations of 5-fold grouped cross-validation with special emphasis on calibration, an analysis of which is performed at the patient as well as hospitalization level. Error is assessed with respect to diagnosis, race, age, gender, AKI identification method, and hospital utilization. In an additional experiment, the regularization penalty is severely increased to induce parsimony and interpretability. Predictors identified for rehospitalized patients are also reported with a special analysis of medications that might be modifiable risk factors. Insights from this study might be used to construct a predictive tool for AKI in rehospitalized patients. An accurate estimate of AKI risk at hospital entry might serve as a prior for an admitting provider or another predictive algorithm.
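The grouped cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It assumes that "grouped" means all hospitalizations of one patient stay in the same fold, so that no patient contributes to both training and test data. The function name `grouped_k_fold` and the toy patient labels are hypothetical and not taken from the study; repeating the split with different seeds would give something like the 50 iterations described.

```python
import random
from collections import defaultdict


def grouped_k_fold(patient_ids, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs where no patient appears in both."""
    # Collect the sample indices belonging to each patient.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    # Shuffle patients (not samples) and deal them into k folds.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    folds = [patients[i::k] for i in range(k)]

    for held_out in folds:
        held = set(held_out)
        test_idx = [i for p in held_out for i in by_patient[p]]
        train_idx = [i for i, p in enumerate(patient_ids) if p not in held]
        yield train_idx, test_idx


# Toy data (hypothetical): each sample is one hospitalization,
# labelled by the patient it belongs to.
sample_patients = ["a", "a", "b", "c", "c", "c", "d", "e", "e", "f"]
splits = list(grouped_k_fold(sample_patients, k=5, seed=0))
```

Splitting by patient rather than by sample matters here because the abstract reports 90,013 samples from 34,505 patients, so many patients contribute more than one sample.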

    Visual Analytics of Electronic Health Records with a focus on Acute Kidney Injury

    The increasing use of electronic platforms in healthcare has resulted in the generation of unprecedented amounts of data in recent years. The amount of data available to clinical researchers, physicians, and healthcare administrators continues to grow, which creates an untapped resource with the ability to improve the healthcare system drastically. Despite the enthusiasm for adopting electronic health records (EHRs), some recent studies have shown that EHR-based systems hardly improve the ability of healthcare providers to make better decisions. One reason for this inefficacy is that these systems do not allow for human-data interaction in a manner that fits and supports the needs of healthcare providers. Another reason is information overload, which often causes healthcare providers to misunderstand, misinterpret, ignore, or overlook vital data. Visual analytics (VA), an emerging type of computational system, has the potential to reduce the complexity of EHR data by combining advanced analytics techniques with interactive visualizations to analyze, synthesize, and facilitate high-level activities while allowing users to get more involved in a discourse with the data. The purpose of this research is to demonstrate the use of sophisticated visual analytics systems to solve various EHR-related research problems. This dissertation includes a framework by which we identify gaps in existing EHR-based systems and conceptualize the data-driven activities and tasks of our proposed systems. Two novel VA systems (VISA_M3R3 and VALENCIA) and two studies are designed to bridge the gaps. VISA_M3R3 incorporates multiple regression, frequent itemset mining, and interactive visualization to assist users in the identification of nephrotoxic medications.
Another proposed system, VALENCIA, brings a wide range of dimension reduction and cluster analysis techniques to analyze high-dimensional EHRs, integrate them seamlessly, and make them accessible through interactive visualizations. The studies are conducted to develop prediction models to classify patients who are at risk of developing acute kidney injury (AKI) and identify AKI-associated medication and medication combinations using EHRs. Through healthcare administrative datasets stored at the ICES-KDT (Kidney Dialysis and Transplantation program), London, Ontario, we have demonstrated how our proposed systems and prediction models can be used to solve real-world problems.

    Use of deep learning to develop continuous-risk models for adverse event prediction from electronic health records

    Early prediction of patient outcomes is important for targeting preventive care. This protocol describes a practical workflow for developing deep-learning risk models that can predict various clinical and operational outcomes from structured electronic health record (EHR) data. The protocol comprises five main stages: formal problem definition, data pre-processing, architecture selection, calibration and uncertainty, and generalizability evaluation. We have applied the workflow to four endpoints (acute kidney injury, mortality, length of stay and 30-day hospital readmission). The workflow can enable continuous (e.g., triggered every 6 h) and static (e.g., triggered at 24 h after admission) predictions. We also provide an open-source codebase that illustrates some key principles in EHR modeling. This protocol can be used by interdisciplinary teams with programming and clinical expertise to build deep-learning prediction models with alternate data sources and prediction tasks
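The difference between the continuous and static prediction modes described in this abstract can be sketched as follows. This is only an illustration of the two triggering schedules given as examples (every 6 h, and once at 24 h after admission). The function names are hypothetical and are not taken from the protocol's open-source codebase.

```python
def continuous_trigger_hours(length_of_stay_h, step_h=6):
    """Hours since admission at which a continuous model would be triggered."""
    return list(range(step_h, int(length_of_stay_h) + 1, step_h))


def static_trigger_hours(length_of_stay_h, at_h=24):
    """A static model is triggered once, if the patient is still admitted."""
    return [at_h] if length_of_stay_h >= at_h else []


# A hypothetical 30-hour stay.
continuous = continuous_trigger_hours(30)  # [6, 12, 18, 24, 30]
static = static_trigger_hours(30)          # [24]
```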

    Deep Risk Prediction and Embedding of Patient Data: Application to Acute Gastrointestinal Bleeding

    Acute gastrointestinal bleeding is a common and costly condition, accounting for over 2.2 million hospital days and 19.2 billion dollars of medical charges annually. Risk stratification is a critical part of initial assessment of patients with acute gastrointestinal bleeding. Although all national and international guidelines recommend the use of risk-assessment scoring systems, they are not commonly used in practice, have sub-optimal performance, may be applied incorrectly, and are not easily updated. With the advent of widespread electronic health record adoption, longitudinal clinical data captured during the clinical encounter is now available. However, this data is often noisy, sparse, and heterogeneous. Unsupervised machine learning algorithms may be able to identify structure within electronic health record data while accounting for key issues with the data generation process: measurements missing-not-at-random and information captured in unstructured clinical note text. Deep learning tools can create electronic health record-based models that perform better than clinical risk scores for gastrointestinal bleeding and are well-suited for learning from new data. Furthermore, these models can be used to predict risk trajectories over time, leveraging the longitudinal nature of the electronic health record. The foundation of creating relevant tools is the definition of a relevant outcome measure; in acute gastrointestinal bleeding, a composite outcome of red blood cell transfusion, hemostatic intervention, and all-cause 30-day mortality is a relevant, actionable outcome that reflects the need for hospital-based intervention. However, epidemiological trends may affect the relevance and effectiveness of the outcome measure when applied across multiple settings and patient populations. 
Understanding the trends in practice, potential areas of disparities, and value proposition for using risk stratification in patients presenting to the Emergency Department with acute gastrointestinal bleeding is important in understanding how to best implement a robust, generalizable risk stratification tool. Key findings include a decrease in the rate of red blood cell transfusion since 2014 and disparities in access to upper endoscopy for patients with upper gastrointestinal bleeding by race/ethnicity across urban and rural hospitals. Projected accumulated savings of consistent implementation of risk stratification tools for upper gastrointestinal bleeding total approximately $1 billion 5 years after implementation. Most current risk scores were designed for use based on the location of the bleeding source: upper or lower gastrointestinal tract. However, the location of the bleeding source is not always clear at presentation. I develop and validate electronic health record based deep learning and machine learning tools for patients presenting with symptoms of acute gastrointestinal bleeding (e.g., hematemesis, melena, hematochezia), which is more relevant and useful in clinical practice. I show that they outperform leading clinical risk scores for upper and lower gastrointestinal bleeding, the Glasgow Blatchford Score and the Oakland score. While the best performing gradient boosted decision tree model has equivalent overall performance to the fully connected feedforward neural network model, at the very low risk threshold of 99% sensitivity the deep learning model identifies more very low risk patients. Using another deep learning model that can model longitudinal risk, the long-short-term memory recurrent neural network, need for transfusion of red blood cells can be predicted at every 4-hour interval in the first 24 hours of intensive care unit stay for high risk patients with acute gastrointestinal bleeding. 
Finally, for implementation it is important to find patients with symptoms of acute gastrointestinal bleeding in real time and characterize patients by risk using available data in the electronic health record. A decision rule-based electronic health record phenotype has equivalent performance as measured by positive predictive value compared to deep learning and natural language processing-based models, and after live implementation appears to have increased the use of the Acute Gastrointestinal Bleeding Clinical Care pathway. Patients with acute gastrointestinal bleeding but with other groups of disease concepts can be differentiated by directly mapping unstructured clinical text to a common ontology and treating the vector of concepts as signals on a knowledge graph; these patients can be differentiated using unbalanced diffusion earth mover’s distances on the graph. For electronic health record data with data missing not at random, MURAL, an unsupervised random forest-based method, handles data with missing values and generates visualizations that characterize patients with gastrointestinal bleeding. This thesis forms a basis for understanding the potential for machine learning and deep learning tools to characterize risk for patients with acute gastrointestinal bleeding. In the future, these tools may be critical in implementing integrated risk assessment to keep low risk patients out of the hospital and guide resuscitation and timely endoscopic procedures for patients at higher risk for clinical decompensation
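The comparison at the "very low risk threshold of 99% sensitivity" in this abstract can be made concrete with a small sketch. It assumes that a patient is flagged as needing intervention when the model score is at or above a cutoff, and that the cutoff is the highest one that still reaches the target sensitivity. Patients scoring below the cutoff are then counted as very low risk. The function names and toy scores are hypothetical and do not reproduce the thesis's evaluation code.

```python
import math


def threshold_for_sensitivity(scores, labels, target=0.99):
    """Highest score cutoff whose sensitivity is at least `target`.

    A patient is flagged as high risk when score >= cutoff.
    """
    positives = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not positives:
        raise ValueError("need at least one positive case")
    # Number of positives that must be flagged to reach the target.
    needed = math.ceil(target * len(positives))
    return positives[needed - 1]


def very_low_risk_count(scores, labels, target=0.99):
    """Count patients scored below the cutoff (classified very low risk)."""
    cutoff = threshold_for_sensitivity(scores, labels, target)
    return sum(1 for s in scores if s < cutoff)


# Toy data (hypothetical): three patients with the outcome, four without.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0]
cutoff = threshold_for_sensitivity(toy_scores, toy_labels)
n_low = very_low_risk_count(toy_scores, toy_labels)
```

Under this definition, two models with the same overall performance can still differ in how many patients fall below the cutoff, which is the kind of difference the abstract reports.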

    Machine Learning Framework for Real-World Electronic Health Records Regarding Missingness, Interpretability, and Fairness

    Machine learning (ML) and deep learning (DL) techniques have shown promising results in healthcare applications using Electronic Health Records (EHRs) data. However, their adoption in real-world healthcare settings is hindered by three major challenges. Firstly, real-world EHR data typically contains numerous missing values. Secondly, traditional ML/DL models are typically considered black-boxes, whereas interpretability is required for real-world healthcare applications. Finally, differences in data distributions may lead to unfairness and performance disparities, particularly in subpopulations. This dissertation proposes methods to address missing data, interpretability, and fairness issues. The first work proposes an ensemble prediction framework for EHR data with large missing rates using multiple subsets with lower missing rates. The second method introduces the integration of medical knowledge graphs and double attention mechanism with the long short-term memory (LSTM) model to enhance interpretability by providing knowledge-based model interpretation. The third method develops an LSTM variant that integrates medical knowledge graphs and additional time-aware gates to handle multi-variable temporal missing issues and interpretability concerns. Finally, a transformer-based model is proposed to learn unbiased and fair representations of diverse subpopulations using domain classifiers and three attention mechanisms

    Prognostic prediction models using Self-Attention for ICU patients developing acute kidney injury

    Master’s thesis, Data Science, Universidade de Lisboa, Faculdade de Ciências, 2022. The growth of electronic health records and improved access to them demand a matching level of progress from the research community on clinical models. Machine learning techniques are key to this development, and they are increasingly applied to large medical databases to create solutions that work for specific patients, whatever the task or disease. Acute kidney injury (AKI) is a broad disease defined by abrupt changes in renal function. AKI has high morbidity and mortality, with an increased focus on critically ill patients. The main goal of this thesis is to study the development of AKI within a patient’s stay in the intensive care unit (ICU). Data from the MIMIC-III database was used to collect information regarding the patients. After detailed exclusion criteria were applied, the remaining patients were evaluated in terms of AKI stage, with the purpose of predicting the AKI stage one hour after the sequence of information fed to the model. This can indicate the model’s capacity to predict the aggravation of a patient’s AKI condition. The sequences have hourly information for every feature, and sequences of 6 h, 12 h and 24 h were used. Self-attention mechanisms were used to make the predictions, using an adaptation for multivariate time series of models used successfully on natural language processing (NLP) tasks. Predictions were made for two variations of the KDIGO classification system: one where only the serum creatinine (SCr) criterion was taken into account to determine the patient’s AKI stage, and another where both SCr and urine output (UO) were considered. While most works addressing AKI tend to use only SCr values to determine the patient’s AKI condition, the results were compared using both approaches and were better when using both SCr and UO.
For those experiments, the model achieved up to 68.05% accuracy predicting an episode of AKI, compared to the 66.67% accuracy achieved using only SCr values, which outperformed state-of-the-art results for both cases. Feature importance was also computed for each dataset associated with the two variations of the KDIGO classification system to identify the most important features. Furthermore, final results were compared when using all features versus only the 10 most important ones.
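The sequence setup described in this abstract (hourly features over a 6 h, 12 h or 24 h window, with the AKI stage one hour later as the target) can be sketched as a sliding window. This is a minimal illustration under the assumption that windows advance one hour at a time. The function name `make_windows` and the toy values are hypothetical and are not taken from the thesis.

```python
def make_windows(hourly_features, hourly_stage, window_h):
    """Pair each `window_h`-hour feature sequence with the stage one hour later."""
    if len(hourly_features) != len(hourly_stage):
        raise ValueError("features and stages must cover the same hours")
    samples = []
    for start in range(len(hourly_features) - window_h):
        end = start + window_h  # index of the first hour after the window
        samples.append((hourly_features[start:end], hourly_stage[end]))
    return samples


# Toy stay of 8 hours with two made-up features per hour.
features = [[float(h), 1.0] for h in range(8)]
stages = [0, 0, 0, 1, 1, 2, 2, 3]
windows_6h = make_windows(features, stages, window_h=6)
```

A stay shorter than or equal to the window length yields no samples, so longer windows reduce the number of usable training sequences.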

    Integrating Real Time Data to Improve Outcomes in Acute Kidney Injury

    Critically ill patients with acute kidney injury requiring renal replacement therapy have a poor prognosis. Although the factors that contribute to outcomes, including dose delivery, are well known, patients frequently miss the target dose and volume removal. One major barrier to effective care of these patients is the traditional dissociation of dialysis device data from other clinical information systems, notably the electronic health record (EHR). This lack of integration and the resulting manual documentation lead to errors and biases in documentation and missed opportunities to intervene in a timely fashion. This review summarizes the technological advancements facilitating direct connection of dialysis devices to EHRs. This connection facilitates automated data capture of many variables - including delivered dose, ultrafiltration rate and pressure measurements - which in turn can be leveraged for data mining, quality improvement and real-time targeted therapy adjustments. These interventions hold the promise to significantly improve outcomes for this patient population.

    Utilizing electronic health records to predict acute kidney injury risk and outcomes: Workgroup statements from the 15th ADQI Consensus Conference

    The data contained within the electronic health record (EHR) is "big" from the standpoint of volume, velocity, and variety. These circumstances and the pervasive trend towards EHR adoption have sparked interest in applying big data predictive analytic techniques to EHR data. Acute kidney injury (AKI) is a condition well suited to prediction and risk forecasting; not only does the consensus definition for AKI allow temporal anchoring of events, but no treatments exist once AKI develops, underscoring the importance of early identification and prevention. The Acute Dialysis Quality Initiative (ADQI) convened a group of key opinion leaders and stakeholders to consider how best to approach AKI research and care in the "Big Data" era. This manuscript addresses the core elements of AKI risk prediction and outlines potential pathways and processes. We describe AKI prediction targets, feature selection, model development, and data display