
    FHIR-DHP: A standardized clinical data harmonisation pipeline for scalable AI application deployment

    Background: Increasing digitalisation in the medical domain gives rise to large amounts of healthcare data, which have the potential to expand clinical knowledge and transform patient care if leveraged through artificial intelligence (AI). Yet big data and AI often cannot unlock their full potential at scale, owing to non-standardised data formats, a lack of technical and semantic data interoperability, and limited cooperation between stakeholders in the healthcare system. Despite the existence of standardised data formats for the medical domain, such as Fast Healthcare Interoperability Resources (FHIR), their prevalence and usability for AI remain limited. Objective: We developed a data harmonisation pipeline (DHP) for clinical data sets relying on the common FHIR data standard. Methods: We validated the performance and usability of our FHIR-DHP with data from the MIMIC-IV database, including > 40,000 patients admitted to an intensive care unit. Results: We present the FHIR-DHP workflow with respect to the transformation of “raw” hospital records into a harmonised, AI-friendly data representation. The pipeline consists of five key preprocessing steps: querying of data from the hospital database, FHIR mapping, syntactic validation, transfer of harmonised data into the patient-model database, and export of data in an AI-friendly format for further medical applications. A detailed example of FHIR-DHP execution is presented for clinical diagnosis records. Conclusions: Our approach enables scalable and needs-driven data modelling of large and heterogeneous clinical data sets. The FHIR-DHP is a pivotal step towards increasing cooperation, interoperability, and quality of patient care in the clinical routine and in medical research.
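The FHIR-mapping and syntactic-validation steps of such a pipeline can be sketched for a diagnosis record. This is a minimal illustration, not the authors' implementation; the raw field names (`patient_id`, `admission_id`, `icd_code`) are assumptions about the hospital schema, while the resource shape follows the standard FHIR `Condition` resource.

```python
# Hedged sketch of two FHIR-DHP steps: mapping a raw diagnosis row to a
# FHIR Condition resource, then a minimal syntactic validation.
# The raw field names below are illustrative assumptions, not the authors' schema.

def map_diagnosis_to_fhir(row: dict) -> dict:
    """Map a raw hospital diagnosis record to a FHIR Condition resource."""
    return {
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{row['patient_id']}"},
        "encounter": {"reference": f"Encounter/{row['admission_id']}"},
        "code": {
            "coding": [{
                "system": "http://hl7.org/fhir/sid/icd-10",
                "code": row["icd_code"],
                "display": row.get("description", ""),
            }]
        },
    }

def validate_condition(resource: dict) -> bool:
    """Minimal syntactic check: required fields for a Condition resource."""
    return (
        resource.get("resourceType") == "Condition"
        and "subject" in resource
        and bool(resource.get("code", {}).get("coding"))
    )

raw = {"patient_id": "1234", "admission_id": "A1",
       "icd_code": "E11.9", "description": "Type 2 diabetes mellitus"}
condition = map_diagnosis_to_fhir(raw)
assert validate_condition(condition)
```

In a full pipeline, records passing this check would be written to the patient-model database and later exported in a tabular, AI-friendly format.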

    Identifying And Validating Type 1 And Type 2 Diabetic Cases Using Administrative Data: A Tree-Structured Model

    Background: Planning, implementation, monitoring, temporal evolution, and prognosis differ between type 1 diabetes (T1DM) and type 2 diabetes (T2DM). To date, few administrative diabetes registries have distinguished T1DM from T2DM, reflecting the lack of the required differential information and possible recording bias. Objective: Using a classification tree model, we developed a prediction rule to distinguish T1DM from T2DM accurately, using information from a large administrative database. Methods: The Medical Archival Retrieval System (MARS) at the University of Pittsburgh Medical Center from 1/1/2000 to 9/30/2009 included administrative and clinical data for 209,642 unique diabetic patients aged ≥ 18 years. We identified 10,004 T1DM and 156,712 T2DM patients as probable or possible cases, based on clinical criteria. Classification tree models were fit using TIBCO Spotfire S+ 8.1 (TIBCO Software). We used 10-fold cross-validation to choose the model size. We estimated the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for T1DM. Results: The main predictors that distinguish T1DM from T2DM include age < 40 vs. ≥ 40 years, ICD-9 codes for a T1DM or T2DM diagnosis, oral hypoglycemic agent use, insulin use, and episode(s) of diabetic ketoacidosis. History of hypoglycemic coma, duration in the MARS database, in-patient diagnosis of diabetes, and number of complications (including myocardial infarction, coronary artery bypass graft, dialysis, neuropathy, retinopathy, and amputation) are ancillary predictors. The tree-structured model to predict T1DM from probable cases yields a sensitivity of 99.63%, specificity of 99.28%, PPV of 89.87%, and NPV of 99.71%. Conclusion: Our preliminary predictive rule to distinguish between T1DM and T2DM cases in a large administrative database appears promising but needs to be validated. The public health significance is that being able to distinguish between these diabetes subtypes will allow future subtype-specific analyses of cost, morbidity, and mortality. Future work will focus on ascertaining the validity and generalizability of our predictive rule by conducting a review of medical charts (as an internal validation) and applying the rule to another MARS dataset or other administrative databases (as external validations).
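The four validation metrics reported for the tree-structured model all follow from a 2×2 confusion matrix. A minimal sketch (the counts below are illustrative, not the study's actual confusion matrix):

```python
# Sensitivity, specificity, PPV, and NPV from confusion-matrix counts.
# tp/fp/fn/tn are with respect to the positive class (here, T1DM).

def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Illustrative counts only: a rare positive class, as with T1DM vs. T2DM,
# shows why PPV can lag behind sensitivity and specificity.
print(diagnostic_metrics(tp=90, fp=10, fn=10, tn=890))
```

Because T1DM is far rarer than T2DM, even a small false-positive rate depresses PPV, which is consistent with the reported pattern (PPV 89.87% despite sensitivity and specificity above 99%).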

    Swiss Validation of the Enhanced Recovery After Surgery (ERAS) Database.

    Enhanced recovery after surgery (ERAS) pathways have considerably improved postoperative outcomes and are in use for various types of surgery. The prospective audit system (EIAS) could be a powerful tool for large-scale outcome research, but its database has not yet been validated. Swiss ERAS centers were invited to contribute to the validation of the Swiss chapter for colorectal surgery. A monitoring team performed on-site visits using a standardized checklist. Validation criteria were (I) coverage (number of operated patients within the ERAS protocol; target threshold for validation: ≥ 80%), (II) missing data (8 predefined variables; target ≤ 10%), and (III) accuracy (2 predefined variables; target ≥ 80%). These criteria were assessed by comparing EIAS entries with the medical charts of a random sample of patients per center (range 15-20). Of 18 Swiss ERAS centers, 15 agreed to on-site monitoring but 13 granted access to the final dataset. ERAS coverage was available in only 7 centers and varied between 76% and 100%. The overall missing data rate was 5.7% and concerned mainly the variables "urinary catheter removal" (16.4%) and "mobilization on day 1" (16%). Accuracy for length of hospital stay and complications was 84.6% overall. Overall, 5 of 13 centers failed the validation process for one or several criteria. EIAS was validated in most Swiss ERAS centers. Potential patient selection and missing data remain sources of bias in non-validated centers. Therefore, simplified validation of other centers appears mandatory before large-scale use of the EIAS dataset.
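The three pass/fail criteria above can be expressed as a simple per-center check. The thresholds come from the abstract; the function and its inputs (per-center rates as fractions) are an illustrative assumption, not the monitoring team's actual tooling.

```python
# Hedged sketch of the three EIAS validation criteria:
# coverage >= 80%, missing data <= 10%, accuracy >= 80%.
THRESHOLDS = {"coverage": 0.80, "missing": 0.10, "accuracy": 0.80}

def validate_center(coverage: float, missing: float, accuracy: float) -> dict:
    """Return per-criterion pass/fail and an overall verdict for one center."""
    results = {
        "coverage": coverage >= THRESHOLDS["coverage"],
        "missing": missing <= THRESHOLDS["missing"],
        "accuracy": accuracy >= THRESHOLDS["accuracy"],
    }
    results["passed"] = all(results.values())
    return results

# A center at the low end of the reported coverage range (76%) fails on
# that criterion alone, even with good missing-data and accuracy figures.
print(validate_center(coverage=0.76, missing=0.057, accuracy=0.846))
```

A center fails the overall validation if any single criterion is missed, which matches the abstract's "one or several criteria" framing.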

    Risk Analysis of Clostridioides Difficile Infections in a Hospital Setting and the Impact of Prior Choice on Predictive Capability

    Healthcare-associated Clostridioides difficile (C. diff.) infections are among the most common healthcare-associated infections in the U.S., leading to thousands of deaths per year. Machine learning algorithms have shown some ability to predict who is most vulnerable to C. diff. infection using electronic health records obtained soon after admission, but these models have shown insufficient predictive capability. We extracted data from the electronic medical records in the MIMIC-III Clinical Database, which contains data from the Beth Israel Deaconess Medical Center between 2001 and 2012, resulting in very large predictor matrices. We aimed to predict which patients would receive a positive test for C. diff. using a Bayesian logistic regression model. We examined the impact of three different priors (normal, double-exponential, and regularized horseshoe) to understand how prior choice influences predictive capability and the size of the coefficients. We used cross-validation to test the predictive capability of each prior and compared results between models using ROC and PR curves. Our results show that of the three priors, the regularized horseshoe prior achieves the highest prediction accuracy.
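To see how prior choice enters such a model, note that each prior adds a different log-prior term to the log-likelihood: a normal prior shrinks coefficients like an L2 (ridge) penalty, while a double-exponential (Laplace) prior acts like an L1 (lasso) penalty. This is a minimal illustration of that structure, not the study's model; the regularized horseshoe has no comparable closed-form penalty and is omitted here.

```python
# Log-posterior (up to a constant) of a Bayesian logistic regression under
# a normal or double-exponential (Laplace) prior on the weights.
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def log_posterior(w, X, y, prior="normal", scale=1.0):
    """Log-likelihood plus log-prior for weights w, data X, labels y in {0,1}."""
    ll = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        p = min(max(p, 1e-12), 1 - 1e-12)  # numerical safety
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    if prior == "normal":        # Gaussian prior -> ridge-style L2 shrinkage
        ll -= sum(wj ** 2 for wj in w) / (2 * scale ** 2)
    elif prior == "laplace":     # double-exponential prior -> lasso-style L1
        ll -= sum(abs(wj) for wj in w) / scale
    return ll

X = [[1, 0], [1, 1]]  # tiny illustrative design matrix (intercept + feature)
y = [0, 1]
print(log_posterior([0.5, 0.5], X, y, prior="laplace", scale=1.0))
```

Shrinking the prior scale penalizes large coefficients more heavily, which is exactly the lever that distinguishes the three priors in the abstract: the regularized horseshoe shrinks most coefficients aggressively while leaving a few large, which suits the "very large predictor matrices" the authors describe.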