Predicting prognosis in large cohort of decompensated cirrhosis of liver (DCLD): a machine learning (ML) approach

Abstract

Background and aims: The onset of decompensation in cirrhosis is associated with poor outcomes. Current clinico-biochemical tools have limited accuracy in predicting these outcomes. Identifying predictors with a precise model built on large datasets using artificial intelligence (AI) may improve predictability. We aimed to develop a machine learning (ML) based prognostic model for predicting 90-day survival in patients with cirrhosis presenting with decompensation.

Method: We retrospectively analysed the electronic medical records of hospitalised cirrhosis patients at the ILBS with a complete 90-day follow-up. Clinical data, laboratory parameters and organ involvement were recorded serially. After appropriate data mining and feature engineering, the data were split randomly into train and test sets (20:80) for AI modelling. Class imbalance was handled by random over-sampling to achieve a balanced 50:50 ratio. After 10-fold cross-validation with 3 repetitions and a grid search for optimal hyperparameters, the XGB-CV model was chosen. AUC was the primary selection criterion, and a confusion matrix was used to compare AUCs between the AI models and the existing indices, the CTP and MELD scores.

Results: A total of 6326 patients (mean age 48.2 ± 11.5 years, 84% male, mean CTP 10.4 ± 2.2, mean MELD-Na 30.4 ± 11.9, alcohol 49.4%) were included. Ninety-day mortality was 29.2%. An acute insult was identified in 80% of cases; of these, 49% were extrahepatic, 46% hepatic and 5% unknown. The XGB-CV model had the best accuracy for predicting the 90-day event: 0.90 (0.90–0.93) in the train set, 0.80 (0.79–0.81) in the validation set and 0.80 (0.79–0.81) in the overall dataset. The AUC of the XGB-CV model was better than that of the CTP and MELD-Na scores by 16% and 15%, respectively. The prediction model considered 43 variables, 18 of which predicted the outcome; the 10 largest contributors are shown in the concordance classifier. The main contributors to poor outcome included index presentation as HE, diagnosis of AD/ACLF/ESLD, PT-INR, serum creatinine, total bilirubin, acute insult etiology, prior decompensation, acute hepatic or extrahepatic insult, leukocyte count and present duration of illness. In the decision tree model, the presence of HE, PT-INR and a syndromic diagnosis of AD or ACLF/ESLD stratified patients into low (22%), intermediate (23–46%) and high (>75%) risk of mortality at 90 days.

Conclusion: The current AI-based model, developed using a large database of CLD patients presenting with decompensation, adds substantially to the current indices of liver disease severity and can stratify patients at admission. Simple ML algorithms using HE and INR, besides syndromic presentation, could help treatment decisions and prognostication.
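
Two of the steps named in the Method, random over-sampling to a 50:50 class ratio and AUC as the selection criterion, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `random_oversample` and `roc_auc`, the toy labels and the toy scores are all hypothetical. Applying over-sampling to the training rows only is an assumption here, since the abstract does not state at which stage it was done.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the majority class (50:50 for two classes)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    majority_label, majority_n = counts.most_common(1)[0]
    out_rows, out_labels = list(rows), list(labels)
    for label, n in counts.items():
        if label == majority_label:
            continue
        # Sample with replacement from the existing rows of this class.
        pool = [r for r, y in zip(rows, labels) if y == label]
        extra = [rng.choice(pool) for _ in range(majority_n - n)]
        out_rows.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_rows, out_labels


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up training labels: 7 survivors (0) and 3 deaths (1).
labels = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
rows = [[i] for i in range(10)]  # placeholder feature rows
bal_rows, bal_labels = random_oversample(rows, labels)
balanced_counts = Counter(bal_labels)

# Made-up predicted risks for four held-out patients.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(balanced_counts, toy_auc)
```

In this sketch the held-out rows are left at their natural class ratio, so the AUC is computed on data that reflects the original event rate rather than the artificially balanced one.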
