6 research outputs found

    Seasonally adjusted laboratory reference intervals to improve the performance of machine learning models for classification of cardiovascular diseases

    BACKGROUND: Variation in laboratory healthcare data due to seasonal changes is a widely accepted phenomenon. Seasonal variation is generally not systematically accounted for in healthcare settings. This study applies a newly developed adjustment method for seasonal variation to analyze the effect seasonality has on machine learning model classification of diagnoses. METHODS: Machine learning methods were trained and tested on ~22 million unique records from ~575,000 unique patients admitted to Danish hospitals. Four machine learning models (AdaBoost, decision tree, neural net, and random forest) classifying 35 diseases of the circulatory system (ICD-10 diagnosis codes, chapter IX) were run before and after seasonal adjustment of 23 laboratory reference intervals (RIs). The effect of the adjustment was benchmarked via its contribution to machine learning models trained using hyperparameter optimization and assessed quantitatively using performance metrics (AUROC and AUPRC). RESULTS: Seasonally adjusted RIs significantly improved cardiovascular disease classification in 24 of the 35 tested cases when using neural net models. Features with the highest average feature importance (via SHAP explainability) across all disease models were sex, C-reactive protein, and estimated glomerular filtration rate. Classification of diseases of the vessels, such as thrombotic diseases and other atherosclerotic diseases, consistently improved after seasonal adjustment. CONCLUSIONS: As data volumes increase and data-driven methods become more advanced, it is essential to improve data quality at the pre-processing level. This study presents a method that makes it feasible to introduce seasonally adjusted RIs into the clinical research space in any disease domain. Seasonally adjusted RIs generally improve diagnosis classification and thus ought to be considered and adjusted for in clinical decision support methods.

    Temporal patterns of multi-morbidity in 570,157 ischemic heart disease patients: a nationwide cohort study

    BACKGROUND: Patients diagnosed with ischemic heart disease (IHD) are becoming increasingly multi-morbid, and studies designed to analyze the full spectrum are few. METHODS: Disease trajectories, defined as time-ordered series of diagnoses, were used to study the temporality of multi-morbidity. The main data source was the Danish National Patient Register (NPR), comprising 7,179,538 individuals in the period 1994–2018. Patients with a diagnosis code for IHD were included. Relative risks were used to quantify the strength of the association between diagnostic co-occurrences, each comprising two diagnoses that were overrepresented in the same patients. Multiple linear regression models were then fitted to test for temporal associations among the diagnostic co-occurrences, termed length two disease trajectories. Length two disease trajectories were then used as the basis for constructing disease trajectories of three diagnoses. RESULTS: In a cohort of 570,157 IHD patients, we identified 1447 length two disease trajectories and 4729 significant length three disease trajectories. These included 459 distinct diagnoses. Disease trajectories were dominated by chronic diseases and not by common, acute diseases such as pneumonia. The temporal association of atrial fibrillation (AF) and IHD differed in different IHD subpopulations. We found an association between osteoarthritis (OA) and heart failure (HF) only among patients diagnosed with OA, IHD, and then HF, in that order. CONCLUSIONS: The sequence of diagnoses is important in the characterization of multi-morbidity in IHD patients, as the disease trajectories show. The study provides evidence that the timing of AF in IHD marks distinct IHD subpopulations and that the association between osteoarthritis and heart failure is dependent on IHD. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12933-022-01527-3
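    As a rough illustration of the relative-risk step mentioned in the abstract above, the sketch below computes a plain cohort-style relative risk for a pair of diagnoses. It is a simplification under stated assumptions, not the study's procedure: the function name `relative_risk`, the toy patient records, and the choice of comparing patients with and without the first diagnosis are all hypothetical, and the abstract does not say how the comparison groups were formed.

```python
from typing import Dict, Set


def relative_risk(patients: Dict[str, Set[str]], first: str, second: str) -> float:
    """Risk of `second` among patients with `first`, divided by the risk
    of `second` among patients without `first` (hypothetical helper)."""
    exposed = [dx for dx in patients.values() if first in dx]
    unexposed = [dx for dx in patients.values() if first not in dx]
    if not exposed or not unexposed:
        raise ValueError("need at least one patient with and one without `first`")

    risk_exposed = sum(second in dx for dx in exposed) / len(exposed)
    risk_unexposed = sum(second in dx for dx in unexposed) / len(unexposed)
    if risk_unexposed == 0:
        return float("inf")  # second diagnosis never seen without the first
    return risk_exposed / risk_unexposed


# Toy data (invented for illustration): patient id -> set of ICD-10 codes.
# I25 = chronic ischemic heart disease, I50 = heart failure,
# I48 = atrial fibrillation, J18 = pneumonia.
toy_patients = {
    "p1": {"I25", "I50"},
    "p2": {"I25", "I50"},
    "p3": {"I25"},
    "p4": {"I25", "I48"},
    "p5": {"I50"},
    "p6": {"I48"},
    "p7": set(),
    "p8": {"J18"},
}

rr = relative_risk(toy_patients, "I25", "I50")
print(f"RR(I25 -> I50) = {rr:.2f}")
```

    A value above 1 means the second diagnosis is overrepresented among patients who have the first. This sketch ignores the order and timing of diagnoses, which the study tests separately with regression models.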

    Cohort profile: Copenhagen Hospital Biobank - Cardiovascular Disease Cohort (CHB-CVDC): Construction of a large-scale genetic cohort to facilitate a better understanding of heart diseases

    PURPOSE: The aim of the Copenhagen Hospital Biobank-Cardiovascular Disease Cohort (CHB-CVDC) is to establish a cohort that can accelerate our understanding of CVD initiation and progression by jointly studying genetics, diagnoses, treatments and risk factors. PARTICIPANTS: The CHB-CVDC is a large genomic cohort of patients with CVD. CHB-CVDC currently includes 96 308 patients. The cohort is part of CHB, initiated in 2009 in the Capital Region of Denmark. CHB is continuously growing with ~40 000 samples/year. Patients in CHB were included in CHB-CVDC if they were above 18 years of age and assigned at least one cardiovascular diagnosis. Additionally, up to 110 000 blood donors can be analysed jointly with CHB-CVDC. Linkage with the Danish National Health Registries, Electronic Patient Records, and Clinical Quality Databases allows up to 41 years of medical history. All individuals are genotyped using the Infinium Global Screening Array from Illumina and imputed using a reference panel consisting of whole-genome sequence data from 8429 Danes along with 7146 samples from North-Western Europe. Currently, 39 539 of the patients are deceased. FINDINGS TO DATE: Here, we demonstrate the utility of the cohort by showing concordant effects between known variants and selected CVDs, that is, >93% concordance for coronary artery disease, atrial fibrillation, heart failure and cholesterol measurements and 85% concordance for hypertension. Furthermore, we evaluated multiple study designs and the validity of using Danish blood donors as part of CHB-CVDC. Lastly, CHB-CVDC has already made major contributions to studies of sick sinus syndrome and the role of phytosterols in development of atherosclerosis. FUTURE PLANS: In addition to genetics, electronic patient records and national socioeconomic and health registries extensively characterise each patient in CHB-CVDC and provide a promising framework for improved understanding of risk and protective variants. We aim to include other measurable biomarkers, for example proteins, in CHB-CVDC, making it a platform for multiomics cardiovascular studies.
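    To make the concordance figures in the abstract above concrete, here is a minimal sketch of one common way such a percentage can be computed: the share of known variants whose effect in the new cohort has the same sign as the previously reported effect. This is an assumption for illustration only. The abstract does not define its concordance measure, and the function name `direction_concordance`, the variant labels, and the effect sizes are all invented.

```python
from typing import Dict


def direction_concordance(known: Dict[str, float], observed: Dict[str, float]) -> float:
    """Fraction of shared variants whose observed effect has the same sign
    as the previously reported effect (hypothetical helper)."""
    shared = [variant for variant in known if variant in observed]
    if not shared:
        raise ValueError("no variants in common")
    agreeing = sum((known[v] > 0) == (observed[v] > 0) for v in shared)
    return agreeing / len(shared)


# Invented effect sizes (e.g. log odds ratios) keyed by placeholder variant labels.
known_effects = {"var1": 0.10, "var2": -0.05, "var3": 0.20, "var4": 0.08}
cohort_effects = {"var1": 0.07, "var2": -0.02, "var3": -0.01, "var4": 0.11, "var5": 0.30}

concordance = direction_concordance(known_effects, cohort_effects)
print(f"Direction concordance: {concordance:.0%}")
```

    Variants present in only one of the two inputs (here `var5`) are ignored, so the denominator is the number of shared variants.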