40 research outputs found

    Handling Overlapping Asymmetric Data Sets—A Twice Penalized P-Spline Approach

    © 2024 by the authors. Aims: Overlapping asymmetric data sets are where a large cohort of observations have a small amount of information recorded, and within this group there exists a smaller cohort which have extensive further information available. Missing imputation is unwise if cohort size differs substantially; therefore, we aim to develop a way of modelling the smaller cohort whilst considering the larger. Methods: Through considering traditionally once penalized P-Spline approximations, we create a second penalty term through observing discrepancies in the marginal value of covariates that exist in both cohorts. Our now twice penalized P-Spline is designed to firstly prevent over/under-fitting of the smaller cohort and secondly to consider the larger cohort. Results: Through a series of data simulations, penalty parameter tunings, and model adaptations, our twice penalized model offers up to a 58% and 46% improvement in model fit upon a continuous and binary response, respectively, against existing B-Spline and once penalized P-Spline methods. Applying our model to an individual’s risk of developing steatohepatitis, we report an over 65% improvement over existing methods. Conclusions: We propose a twice penalized P-Spline method which can vastly improve the model fit of overlapping asymmetric data sets upon a common predictive endpoint, without the need for missing data imputation.

    Handling Overlapping Asymmetric Datasets -- A Twice Penalized P-Spline Approach

    Overlapping asymmetric datasets are common in data science and pose questions of how they can be incorporated together into a predictive analysis. In healthcare datasets there is often a small amount of information that is available for a larger number of patients such as an electronic health record, however a small number of patients may have had extensive further testing. Common solutions such as missing imputation can often be unwise if the smaller cohort is significantly different in scale to the larger sample, therefore the aim of this research is to develop a new method which can model the smaller cohort against a particular response, whilst considering the larger cohort also. Motivated by non-parametric models, and specifically flexible smoothing techniques via generalized additive models, we model a twice penalized P-Spline approximation method to firstly prevent over/under-fitting of the smaller cohort and secondly to consider the larger cohort. This second penalty is created through discrepancies in the marginal value of covariates that exist in both the smaller and larger cohorts. Through data simulations, parameter tunings and model adaptations to consider a continuous and binary response, we find our twice penalized approach offers an enhanced fit over a linear B-Spline and once penalized P-Spline approximation. Applying to a real-life dataset relating to a person's risk of developing Non-Alcoholic Steatohepatitis, we see an improved model fit performance of over 65%. Areas for future work within this space include adapting our method to not require dimensionality reduction and also consider parametric modelling methods. However, to our knowledge this is the first work to propose additional marginal penalties in a flexible regression of which we can report a vastly improved model fit that is able to consider asymmetric datasets, without the need for missing data imputation. Comment: 52 pages, 17 figures, 8 tables, 34 references
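
    The idea of adding a second penalty to a P-Spline fit can be sketched in a few lines. The abstract does not give the exact form of the second penalty, so the version below is an illustrative assumption and not the authors' formulation: the smaller cohort is fitted with the usual difference penalty (weight `lam1`), plus a ridge-type term (weight `lam2`) that pulls its spline coefficients toward those of a marginal fit on the larger cohort. The single shared covariate, the simulated data, and the names `bspline_basis` and `fit_pspline` are all hypothetical.

```python
import numpy as np


def bspline_basis(x, lo, hi, n_basis=12, degree=3):
    """Equally spaced B-spline basis on [lo, hi] via the Cox-de Boor recursion."""
    x = np.asarray(x, dtype=float)
    n_seg = n_basis - degree                     # number of segments inside [lo, hi]
    h = (hi - lo) / n_seg
    knots = lo + h * np.arange(-degree, n_seg + degree + 1)
    # Degree 0: indicator of each half-open knot interval.
    B = ((x[:, None] >= knots[:-1]) & (x[:, None] < knots[1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x[:, None] - knots[:-d - 1]) / (knots[d:-1] - knots[:-d - 1])
        right = (knots[d + 1:] - x[:, None]) / (knots[d + 1:] - knots[1:-d])
        B = left * B[:, :-1] + right * B[:, 1:]
    return B                                     # shape (len(x), n_basis)


def fit_pspline(B, y, lam1, lam2=0.0, alpha_ref=None, diff_order=2):
    """Minimise ||y - B a||^2 + lam1 ||D a||^2 + lam2 ||a - alpha_ref||^2.

    The lam2 term is an ASSUMED stand-in for the paper's marginal penalty.
    """
    k = B.shape[1]
    D = np.diff(np.eye(k), n=diff_order, axis=0)  # difference-penalty matrix
    lhs = B.T @ B + lam1 * D.T @ D
    rhs = B.T @ y
    if alpha_ref is not None:
        lhs = lhs + lam2 * np.eye(k)
        rhs = rhs + lam2 * alpha_ref
    return np.linalg.solve(lhs, rhs)


# Simulated cohorts (hypothetical data): one shared covariate on [0, 1].
rng = np.random.default_rng(0)
truth = lambda x: np.sin(2 * np.pi * x)
x_large = rng.uniform(0, 1, 2000)
y_large = truth(x_large) + rng.normal(0, 0.3, 2000)
x_small = rng.uniform(0, 1, 25)
y_small = truth(x_small) + rng.normal(0, 0.3, 25)

# Marginal fit on the larger cohort, then once and twice penalized fits on the smaller.
alpha_large = fit_pspline(bspline_basis(x_large, 0, 1), y_large, lam1=1.0)
B_small = bspline_basis(x_small, 0, 1)
alpha_once = fit_pspline(B_small, y_small, lam1=1.0)
alpha_twice = fit_pspline(B_small, y_small, lam1=1.0, lam2=5.0, alpha_ref=alpha_large)

grid = np.linspace(0, 1, 200)
B_grid = bspline_basis(grid, 0, 1)
for name, a in [("once", alpha_once), ("twice", alpha_twice)]:
    rmse = np.sqrt(np.mean((B_grid @ a - truth(grid)) ** 2))
    print(name, "penalized RMSE against the true curve:", rmse)
```

    Setting `lam2=0` recovers the once penalized fit, and a very large `lam2` reproduces the larger-cohort coefficients, so in this sketch the second weight trades off the two sources of information and would be tuned, for example by cross-validation on the smaller cohort.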

    Machine learning approaches to enhance diagnosis and staging of patients with MASLD using routinely available clinical information

    © 2024 McTeer et al. Aims: Metabolic dysfunction Associated Steatotic Liver Disease (MASLD) outcomes such as MASH (metabolic dysfunction associated steatohepatitis), fibrosis and cirrhosis are ordinarily determined by resource-intensive and invasive biopsies. We aim to show that routine clinical tests offer sufficient information to predict these endpoints. Methods: Using the LITMUS Metacohort derived from the European NAFLD Registry, the largest MASLD dataset in Europe, we create three combinations of features which vary in degree of procurement including a 19-variable feature set that are attained through a routine clinical appointment or blood test. This data was used to train predictive models using supervised machine learning (ML) algorithm XGBoost, alongside missing imputation technique MICE and class balancing algorithm SMOTE. Shapley Additive exPlanations (SHAP) were added to determine relative importance for each clinical variable. Results: Analysing nine biopsy-derived MASLD outcomes of cohort size ranging between 5385 and 6673 subjects, we were able to predict individuals at training set AUCs ranging from 0.719 to 0.994, including classifying individuals who are At-Risk MASH at an AUC = 0.899. Using two further feature combinations of 26-variables and 35-variables, which included composite scores known to be good indicators for MASLD endpoints and advanced specialist tests, we found predictive performance did not sufficiently improve. We are also able to present local and global explanations for each ML model, offering clinicians interpretability without the expense of worsening predictive performance. Conclusions: This study developed a series of ML models of accuracy ranging from 71.9% to 99.4% using only easily extractable and readily available information in predicting MASLD outcomes which are usually determined through highly invasive means.

    Performance of non-invasive tests and histology for the prediction of clinical outcomes in patients with non-alcoholic fatty liver disease: an individual participant data meta-analysis

    Background: Histologically assessed liver fibrosis stage has prognostic significance in patients with non-alcoholic fatty liver disease (NAFLD) and is accepted as a surrogate endpoint in clinical trials for non-cirrhotic NAFLD. Our aim was to compare the prognostic performance of non-invasive tests with liver histology in patients with NAFLD. Methods: This was an individual participant data meta-analysis of the prognostic performance of histologically assessed fibrosis stage (F0–4), liver stiffness measured by vibration-controlled transient elastography (LSM-VCTE), fibrosis-4 index (FIB-4), and NAFLD fibrosis score (NFS) in patients with NAFLD. The literature was searched for a previously published systematic review on the diagnostic accuracy of imaging and simple non-invasive tests and updated to Jan 12, 2022 for this study. Studies were identified through PubMed/MEDLINE, EMBASE, and CENTRAL, and authors were contacted for individual participant data, including outcome data, with a minimum of 12 months of follow-up. The primary outcome was a composite endpoint of all-cause mortality, hepatocellular carcinoma, liver transplantation, or cirrhosis complications (ie, ascites, variceal bleeding, hepatic encephalopathy, or progression to a MELD score ≥15). We calculated aggregated survival curves for trichotomised groups and compared them using stratified log-rank tests (histology: F0–2 vs F3 vs F4; LSM: 2·67; NFS: 0·676), calculated areas under the time-dependent receiver operating characteristic curves (tAUC), and performed Cox proportional-hazards regression to adjust for confounding. This study was registered with PROSPERO, CRD42022312226. Findings: Of 65 eligible studies, we included data on 2518 patients with biopsy-proven NAFLD from 25 studies (1126 [44·7%] were female, median age was 54 years [IQR 44–63], and 1161 [46·1%] had type 2 diabetes). After a median follow-up of 57 months [IQR 33–91], the composite endpoint was observed in 145 (5·8%) patients. Stratified log-rank tests showed significant differences between the trichotomised patient groups (p<0·0001 for all comparisons). The tAUC at 5 years were 0·72 (95% CI 0·62–0·81) for histology, 0·76 (0·70–0·83) for LSM-VCTE, 0·74 (0·64–0·82) for FIB-4, and 0·70 (0·63–0·80) for NFS. All index tests were significant predictors of the primary outcome after adjustment for confounders in the Cox regression. Interpretation: Simple non-invasive tests performed as well as histologically assessed fibrosis in predicting clinical outcomes in patients with NAFLD and could be considered as alternatives to liver biopsy in some cases.

    A Multicenter Investigation into the Occurrence of High-Pressure Excursions

    The occurrence of sudden increases in premembrane pressures and membrane pressure differentials has drawn considerable attention and debate in the perfusion community. Several terms have been applied to this phenomenon, but the term that best describes this event is “high-pressure excursion” (HPE). The exact causes of HPE are still uncertain, although widely speculated upon. However, their increased appearance seems to be very closely related to the removal/absence of human serum albumin from priming solutions. To investigate the reasons why HPE occur in some cardiopulmonary bypass cases, we present our findings in a multicenter, retrospective analysis of 2696 cardiopulmonary bypass cases. For the 31 cases of HPE that were documented in the analysis, 60 preoperative and perioperative variables were gathered from the participating tertiary care centers. Our findings indicate that these pressure excursions had an occurrence of 1.14% in the three centers involved with this analysis. The largest occurrence of HPE tended to be in male (87.1%) coronary artery disease patients (96.8%) during the presence of the IV anesthetic Diprivan (74.2%). In conclusion, HPE are not perfusate related because they occurred in the presence of three perfusate combinations. They also do not seem to be oxygenator related or exclusive to hypothermic temperatures or heat exchangers.

    Evaluation of Mimesys® Phosphorylcholine (PC)-Coated Oxygenators during Cardiopulmonary Bypass in Adults

    A new generation of coating extracorporeal circuitry with biocompatible polymers has entered the North American perfusion market. This new biomimetic coating process uses synthetic phosphorylcholine (PC) containing polymers to bond covalently to the surface of the Sorin Monolyth® oxygenator, under the brand name of Mimesys®. In part one of a three-part investigation, 160 Mimesys®-coated oxygenators were randomly evaluated against 36 uncoated oxygenators for blood flow, hemodynamic resistance, and pressure differentials. In part two, retrospective analysis of platelet data collected in this study was compared with platelet data collected from a previous investigation using uncoated Monolyth oxygenators with albumin and crystalloid perfusates. Part three examined the risk-adjusted oxygenators, compared with 71 case-matched patients treated with uncoated oxygenators. There was no difference found in the Mimesys-coated group, when compared to the control group, with regard to pressure differentials or hemodynamic resistance. However, we conclude that platelet protection with PC-coated Monolyths using crystalloid perfusates was similar to platelet protection with albumin perfusates, and significantly better than with uncoated Monolyths® using crystalloid perfusates.