
    Prognostic prediction models using Self-Attention for ICU patients developing acute kidney injury

    Master's thesis, Data Science, Universidade de Lisboa, Faculdade de Ciências, 2022. The growth and improved accessibility of electronic health records demand comparable progress from the research community in clinical modelling. Machine learning techniques are key to this development, and they are increasingly applied to large medical databases to create solutions that work for specific patients, whatever the task or the disease. Acute kidney injury (AKI) is a broad disease defined by abrupt changes in renal function. AKI has high morbidity and mortality, and is of particular concern in critically ill patients. The main goal of this thesis is to study the development of AKI during a patient’s stay in the intensive care unit (ICU). Patient data were collected from the MIMIC-III database. After detailed exclusion criteria were applied, the remaining patients were evaluated in terms of AKI stage, with the purpose of predicting the AKI stage one hour after the end of the sequence of information fed to the model. This indicates the model's capacity to predict the aggravation of a patient’s AKI condition. The sequences contain hourly information for every feature, and sequence lengths of 6 h, 12 h and 24 h were used. Self-attention mechanisms were used to make the predictions, adapting to multivariate time series the models used successfully in natural language processing (NLP) tasks. Predictions were made for two variations of the KDIGO classification system: one where only the serum creatinine (SCr) criterion was taken into account to determine the patient’s AKI stage, and another where both SCr and urine output (UO) were considered. While most works addressing AKI use only SCr values to determine the patient’s AKI condition, the results were compared using both approaches and were better when using both SCr and UO. For those experiments, the model achieved up to 68.05% accuracy in predicting an episode of AKI, compared to the 66.67% accuracy achieved using only SCr values, outperforming state-of-the-art results in both cases. Feature importance was also computed for each dataset associated with the two variations of the KDIGO classification system to identify the most important features. Furthermore, final results were compared when using all features versus using only the 10 most important ones.
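The sequence setup described in this abstract (hourly features over 6 h, 12 h or 24 h, with the AKI stage one hour later as the target) can be sketched as follows. This is a minimal illustration, not the thesis's own preprocessing: the function name `make_windows` and the toy data are hypothetical, and it assumes one feature vector and one AKI stage label per hour of a single ICU stay.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[Sequence[float]], int]


def make_windows(
    features: Sequence[Sequence[float]],
    stages: Sequence[int],
    window: int,
) -> List[Sample]:
    """Pair each `window`-hour slice of features with the next hour's AKI stage.

    `features[h]` is the (hypothetical) feature vector for hour h and
    `stages[h]` the AKI stage at hour h.
    """
    if len(features) != len(stages):
        raise ValueError("features and stages must have the same length")
    if window <= 0:
        raise ValueError("window must be positive")

    samples: List[Sample] = []
    # The last usable window must leave one hour after it for the target.
    for start in range(len(features) - window):
        end = start + window
        samples.append((list(features[start:end]), stages[end]))
    return samples


# Toy stay of 30 hours with a single feature per hour (illustrative values only).
toy_features = [[float(h)] for h in range(30)]
toy_stages = [h % 4 for h in range(30)]

for hours in (6, 12, 24):
    print(hours, len(make_windows(toy_features, toy_stages, hours)))
```

Shorter windows yield more training samples per stay, while longer windows give the model more history per prediction.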

    Machine learning algorithm to predict mortality in patients undergoing continuous renal replacement therapy

    Background: Previous scoring models such as the Acute Physiologic Assessment and Chronic Health Evaluation II (APACHE II) and the Sequential Organ Failure Assessment (SOFA) scoring systems do not adequately predict mortality of patients undergoing continuous renal replacement therapy (CRRT) for severe acute kidney injury. Accordingly, the present study applies machine learning algorithms to improve prediction accuracy for this patient subset. Methods: We randomly divided a total of 1571 adult patients who started CRRT for acute kidney injury into training (70%, n = 1094) and test (30%, n = 477) sets. The primary output consisted of the probability of mortality during admission to the intensive care unit (ICU) or hospital. We compared the area under the receiver operating characteristic curves (AUCs) of several machine learning algorithms with that of the APACHE II, SOFA, and the new abbreviated mortality scoring system for acute kidney injury with CRRT (MOSAIC model) results. Results: For the ICU mortality, the random forest model showed the highest AUC (0.784 [0.744–0.825]), and the artificial neural network and extreme gradient boost models demonstrated the next best results (0.776 [0.735–0.818]). The AUC of the random forest model was higher than 0.611 (0.583–0.640), 0.677 (0.651–0.703), and 0.722 (0.677–0.767), as achieved by APACHE II, SOFA, and MOSAIC, respectively. The machine learning models also predicted in-hospital mortality better than APACHE II, SOFA, and MOSAIC. Conclusion: Machine learning algorithms increase the accuracy of mortality prediction for patients undergoing CRRT for acute kidney injury compared with previous scoring models.
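The AUC used to compare models in this abstract is the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. A minimal pure-Python sketch of that definition is given below; the function name `roc_auc` and the example scores are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    The pairwise loop is quadratic, which is fine for a sketch but slow for
    large cohorts.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Illustrative labels (1 = died, 0 = survived) and predicted risks.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(example_labels, example_scores))  # 0.75
```

In the example, three of the four positive-negative pairs are ranked correctly, which gives 0.75. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.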

    Venous thromboembolism in COVID-19 patients and prediction model: a multicenter cohort study

    BACKGROUND: Patients with COVID-19 infection are commonly reported to have an increased risk of venous thrombosis. The choice of anti-thrombotic agents and doses are currently being studied in randomized controlled trials and retrospective studies. There exists a need for individualized risk stratification of venous thromboembolism (VTE) to assist clinicians in decision-making on anticoagulation. We sought to identify the risk factors of VTE in COVID-19 patients, which could help physicians in the prevention, early identification, and management of VTE in hospitalized COVID-19 patients and improve clinical outcomes in these patients. METHOD: This is a multicenter, retrospective database of four main health systems in Southeast Michigan, United States. We compiled comprehensive data for adult COVID-19 patients who were admitted between 1st March 2020 and 31st December 2020. Four models, including the random forest, multiple logistic regression, multilinear regression, and decision trees, were built on the primary outcome of in-hospital acute deep vein thrombosis (DVT) and pulmonary embolism (PE) and tested for performance. The study also reported hospital length of stay (LOS) and intensive care unit (ICU) LOS in the VTE and the non-VTE patients. Four models were assessed using the area under the receiver operating characteristic curve and confusion matrix. RESULTS: The cohort included 3531 admissions, 3526 had discharge diagnoses, and 6.68% of patients developed acute VTE (N = 236). VTE group had a longer hospital and ICU LOS than the non-VTE group (hospital LOS 12.2 days vs. 8.8 days, p < 0.001; ICU LOS 3.8 days vs. 1.9 days, p < 0.001). 9.8% of patients in the VTE group required more advanced oxygen support, compared to 2.7% of patients in the non-VTE group (p < 0.001). Among all four models, the random forest model had the best performance. The model suggested that blood pressure, electrolytes, renal function, hepatic enzymes, and inflammatory markers were predictors for in-hospital VTE in COVID-19 patients. CONCLUSIONS: Patients with COVID-19 have a high risk for VTE, and patients who developed VTE had a prolonged hospital and ICU stay. This random forest prediction model for VTE in COVID-19 patients identifies predictors which could aid physicians in making a clinical judgment on empirical dosages of anticoagulation.
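Because only about 7% of the cohort developed VTE, the confusion matrix mentioned in the methods is informative in a way raw accuracy is not. The sketch below shows how sensitivity and specificity follow from the four confusion-matrix counts; the function names and the toy predictions are hypothetical and do not reproduce any result from the study.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (1 = VTE)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def sensitivity(counts: Dict[str, int]) -> float:
    """Share of true VTE cases the model flags: TP / (TP + FN)."""
    return counts["tp"] / (counts["tp"] + counts["fn"])


def specificity(counts: Dict[str, int]) -> float:
    """Share of non-VTE cases the model clears: TN / (TN + FP)."""
    return counts["tn"] / (counts["tn"] + counts["fp"])


# Toy labels and predictions, illustrative only.
toy_truth = [1, 1, 0, 0, 0]
toy_pred = [1, 0, 0, 0, 1]
toy_counts = confusion_counts(toy_truth, toy_pred)
print(toy_counts, sensitivity(toy_counts), specificity(toy_counts))
```

With a rare outcome, a model that predicts "no VTE" for everyone scores high accuracy but zero sensitivity, which is why the per-class counts matter.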

    Prediction of acute kidney injury using the Electronic Medical Records of a pediatric cardiac intensive care unit

    Acute kidney injury (AKI) is a frequent complication in hospitalized patients and is significantly associated with mortality, length of stay, and healthcare cost. Management of AKI presents an important challenge, and clinicians may be helped by robust prediction models for risk evaluation, prevention, and recognition. Advances in clinical informatics and the increasing availability of electronic medical records (EMRs) have favored the development of predictive models of risk estimation in AKI. In this dissertation, we analyze the problem of predicting the AKI stage during the patient’s stay in the intensive care unit, retrospectively using the EMRs recently introduced in the Pediatric Cardiac Intensive Care Unit (PCICU) of "Ospedale Pediatrico Bambino Gesù". After the initial phase of data selection, extraction, and management of missing data, we develop a random forest (RF) classification model, including a variable selection step, with the aim of predicting the stage of AKI 48 hours in advance in both binary and multiclass cases. The performance obtained, in terms of area under the ROC curve (AUC-ROC) for binary cases and accuracy for multiclass cases, compares favorably with other recent attempts in the literature. The list of the most important variables obtained in the various classifications highlights the importance of some expected variables reported in other studies in the literature (such as creatinine), but also the presence of variables specific to the pediatric patients under examination (such as PIM3). Moreover, we develop further classifiers using Generalized Additive Models (GAMs) and Bayesian network (BN) models, which have the benefit of offering a more interpretable approach. Although their results are inferior to those of the RF, they are comparable with many outcomes reported in the literature. The plots obtained with GAMs and the structure of the directed acyclic graph (DAG) obtained with the BN are consistent with a possible medical explanation and offer doctors further interpretive hints about the onset of AKI. Finally, we observe that all implemented models confirm the possibility of making an accurate prediction of the AKI stage using the PCICU EMR data. These models could potentially be included in a web interface and, in the longer term, be integrated into the EMR of the PCICU. Such a tool would allow doctors to predict the patient’s stage of AKI prospectively and evaluate how to intervene if necessary. To proceed with this, it will be necessary to export a larger dataset that adds the new data acquired in the meantime in the PCICU.

    Machine Learning Framework for Real-World Electronic Health Records Regarding Missingness, Interpretability, and Fairness

    Machine learning (ML) and deep learning (DL) techniques have shown promising results in healthcare applications using Electronic Health Records (EHRs) data. However, their adoption in real-world healthcare settings is hindered by three major challenges. Firstly, real-world EHR data typically contains numerous missing values. Secondly, traditional ML/DL models are typically considered black boxes, whereas interpretability is required for real-world healthcare applications. Finally, differences in data distributions may lead to unfairness and performance disparities, particularly in subpopulations. This dissertation proposes methods to address missing data, interpretability, and fairness issues. The first method is an ensemble prediction framework for EHR data with large missing rates, which uses multiple subsets with lower missing rates. The second method integrates medical knowledge graphs and a double attention mechanism with the long short-term memory (LSTM) model to enhance interpretability by providing knowledge-based model interpretation. The third method develops an LSTM variant that integrates medical knowledge graphs and additional time-aware gates to handle multi-variable temporal missingness and interpretability concerns. Finally, a transformer-based model is proposed to learn unbiased and fair representations of diverse subpopulations using domain classifiers and three attention mechanisms.