A Comparison of Intensive Care Unit Mortality Prediction Models through the Use of Data Mining Techniques
OBJECTIVES: The intensive care environment generates a wealth of critical care data suited to developing a well-calibrated prediction tool. This study aimed to develop an intensive care unit (ICU) mortality prediction model built on University of Kentucky Hospital (UKH) data and to assess whether data mining techniques such as the artificial neural network (ANN), support vector machine (SVM), and decision tree (DT) outperform the conventional logistic regression (LR) statistical model.
METHODS: The models were built on ICU data collected on 38,474 admissions to the UKH between January 1998 and September 2007. Data from the first 24 hours of the ICU admission were used, including patient demographics, admission information, physiology data, chronic health items, and outcome information.
RESULTS: Only 15 study variables were identified as significant for inclusion in model development. The DT algorithm slightly outperformed the other data mining techniques (AUC, 0.892), followed by the SVM (AUC, 0.876) and the ANN (AUC, 0.874), compared with the APACHE III model (AUC, 0.871).
CONCLUSIONS: With fewer variables required, the machine learning models we developed performed as well as the conventional APACHE III prediction model.
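As a rough, hypothetical illustration of the kind of head-to-head comparison described above (not the authors' actual pipeline or data), the sketch below fits a decision tree, a neural network, an SVM, and logistic regression on synthetic binary-outcome data and ranks them by AUC. In the study, the inputs would be the 15 selected first-24-hour ICU variables and the benchmark would be APACHE III.

```python
# Hypothetical sketch: compare several classifiers by AUC on synthetic data
# standing in for first-24-hour ICU variables.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=5000, n_features=15, weights=[0.9], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "DT": DecisionTreeClassifier(max_depth=5, random_state=0),
    "ANN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
    "SVM": SVC(probability=True, random_state=0),
    "LR": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```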
Comparison of First-Line Dual Combination Treatments in Hypertension: Real-World Evidence from Multinational Heterogeneous Cohorts.
Background and objectives: The 2018 ESC/ESH hypertension guideline recommends a 2-drug combination as initial anti-hypertensive therapy. However, real-world evidence for the effectiveness of the recommended regimens remains limited. We aimed to compare the effectiveness of first-line anti-hypertensive treatment combining 2 of the following classes: angiotensin-converting enzyme (ACE) inhibitors/angiotensin-receptor blockers (A), calcium channel blockers (C), and thiazide-type diuretics (D).
Methods: Treatment-naïve hypertensive adults without cardiovascular disease (CVD) who initiated dual anti-hypertensive medications were identified in 5 databases from the US and Korea. Patients were matched for each comparison set by large-scale propensity score matching. The primary endpoint was all-cause mortality; secondary measures included myocardial infarction, heart failure, stroke, and a composite of major adverse cardiac and cerebrovascular events.
Results: A total of 987,983 patients met the eligibility criteria. After matching, 222,686, 32,344, and 38,513 patients were allocated to the A+C vs. A+D, C+D vs. A+C, and C+D vs. A+D comparisons, respectively. There was no significant difference in mortality over a total of 1,806,077 person-years: A+C vs. A+D (hazard ratio [HR], 1.08; 95% confidence interval [CI], 0.97-1.20; p=0.127), C+D vs. A+C (HR, 0.93; 95% CI, 0.87-1.01; p=0.067), and C+D vs. A+D (HR, 1.18; 95% CI, 0.95-1.47; p=0.104). A+C was associated with a slightly higher risk of heart failure (HR, 1.09; 95% CI, 1.01-1.18; p=0.040) and stroke (HR, 1.08; 95% CI, 1.01-1.17; p=0.040) than A+D.
Conclusions: There was no significant difference in mortality among the A+C, A+D, and C+D combination treatments in patients without previous CVD. This finding was consistent across multinational heterogeneous cohorts in real-world practice.
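To make the design concrete, here is a minimal, hypothetical sketch of the core analytic steps named in the abstract: estimating a propensity score, matching treated and comparator patients 1:1 on that score, and fitting a Cox proportional hazards model to obtain a hazard ratio. The covariates, column names, and data are invented and far simpler than the study's large-scale matching.

```python
# Hypothetical sketch of 1:1 propensity score matching (with replacement, for
# brevity) followed by a Cox proportional hazards model; data are simulated.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 10000
df = pd.DataFrame({
    "age": rng.normal(60, 10, n),
    "sbp": rng.normal(150, 15, n),
    "treated": rng.integers(0, 2, n),   # 1 = one regimen, 0 = comparator (illustrative)
    "time": rng.exponential(5, n),      # follow-up time in years
    "death": rng.integers(0, 2, n),     # event indicator
})

# Propensity score: probability of receiving the "treated" regimen given covariates
ps_model = LogisticRegression(max_iter=1000).fit(df[["age", "sbp"]], df["treated"])
df["ps"] = ps_model.predict_proba(df[["age", "sbp"]])[:, 1]

# Greedy 1:1 nearest-neighbour matching on the propensity score
treated, control = df[df.treated == 1], df[df.treated == 0]
nn = NearestNeighbors(n_neighbors=1).fit(control[["ps"]])
_, idx = nn.kneighbors(treated[["ps"]])
matched = pd.concat([treated, control.iloc[idx.ravel()]])

# Cox model on the matched cohort -> hazard ratio for the treatment indicator
cph = CoxPHFitter().fit(matched[["time", "death", "treated"]],
                        duration_col="time", event_col="death")
print(cph.hazard_ratios_)
```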
Extending Achilles Heel Data Quality Tool with New Rules Informed by Multi-Site Data Quality Comparison
Large healthcare datasets of Electronic Health Record data have become indispensable in clinical research, and data quality in such datasets has recently become a focus of many distributed research networks. Although data quality is specific to a given research question, many existing data quality platforms show that general, dataset-level data quality assessment (covering a spectrum of research questions) is possible and highly requested by researchers. We present a comparison of 12 datasets and an extension of the Achilles Heel data quality software tool with new rules and data characterization measures.
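As an illustration of what a dataset-level data quality rule can look like, the snippet below is a sketch in the spirit of Achilles Heel, not its actual rule set or implementation: it flags implausible birth years in a person table and reports a count per violated rule.

```python
# Illustrative dataset-level data quality rules on a toy person table
# (hypothetical; not the Achilles Heel rule implementation).
import pandas as pd

def check_person_table(person: pd.DataFrame, extraction_year: int) -> pd.DataFrame:
    """Return one row per violated rule with a count of offending records."""
    issues = []
    too_old = (extraction_year - person["year_of_birth"]) > 120
    future_birth = person["year_of_birth"] > extraction_year
    if too_old.any():
        issues.append({"rule": "age greater than 120 years", "count": int(too_old.sum())})
    if future_birth.any():
        issues.append({"rule": "birth year after data extraction", "count": int(future_birth.sum())})
    return pd.DataFrame(issues, columns=["rule", "count"])

person = pd.DataFrame({"person_id": [1, 2, 3], "year_of_birth": [1950, 1870, 2030]})
print(check_person_table(person, extraction_year=2023))
```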
A standardized analytics pipeline for reliable and rapid development and validation of prediction models using observational health data
Background and objective: In response to the ongoing COVID-19 pandemic, several prediction models were rapidly developed with the aim of providing evidence-based guidance. However, none of these COVID-19 prediction models have been found to be reliable. Models are commonly assessed to have a risk of bias, often due to insufficient reporting, use of non-representative data, and lack of large-scale external validation. In this paper, we present the Observational Health Data Sciences and Informatics (OHDSI) analytics pipeline for patient-level prediction modeling as a standardized approach for rapid yet reliable development and validation of prediction models. We demonstrate how our analytics pipeline and open-source software tools can be used to answer important prediction questions while limiting potential causes of bias (e.g., by validating phenotypes, specifying the target population, performing large-scale external validation, and publicly providing all analytical source code).
Methods: We show step-by-step how to implement the analytics pipeline for the question: ‘In patients hospitalized with COVID-19, what is the risk of death 0 to 30 days after hospitalization?’. We develop models using six different machine learning methods in a US claims database containing over 20,000 COVID-19 hospitalizations and externally validate the models using data containing over 45,000 COVID-19 hospitalizations from South Korea, Spain, and the USA.
Results: Our open-source software tools enabled us to go efficiently end-to-end from problem design to reliable model development and evaluation. When predicting death in patients hospitalized with COVID-19, AdaBoost, random forest, gradient boosting machine, and decision tree yielded similar or lower internal and external validation discrimination performance compared to L1-regularized logistic regression, whereas the MLP neural network consistently resulted in lower discrimination. The L1-regularized logistic regression models were well calibrated.
Conclusion: Our results show that following the OHDSI analytics pipeline for patient-level prediction modeling can enable the rapid development of reliable prediction models. The OHDSI software tools and pipeline are open source and available to researchers from all around the world.
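The sketch below illustrates the develop-then-externally-validate pattern described in the abstract using an L1-regularized logistic regression. It uses synthetic data and scikit-learn rather than the OHDSI PatientLevelPrediction software, so treat it as a schematic of the workflow, not the pipeline itself.

```python
# Hypothetical sketch: develop an L1-regularized logistic regression on one
# "database" and validate it internally and on a simulated external database.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, brier_score_loss
from sklearn.model_selection import train_test_split

# Simulated development and external validation databases
X_dev, y_dev = make_classification(n_samples=20000, n_features=50, weights=[0.9], random_state=1)
X_ext, y_ext = make_classification(n_samples=5000, n_features=50, weights=[0.88], random_state=2)

X_train, X_test, y_train, y_test = train_test_split(X_dev, y_dev, stratify=y_dev, random_state=1)
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X_train, y_train)

# Report discrimination (AUC) and a simple calibration-related metric (Brier score)
for label, X, y in [("internal", X_test, y_test), ("external", X_ext, y_ext)]:
    p = model.predict_proba(X)[:, 1]
    print(f"{label}: AUC={roc_auc_score(y, p):.3f}, Brier={brier_score_loss(y, p):.3f}")
```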
A Family of H723R Mutation for SLC26A4 Associated with Enlarged Vestibular Aqueduct Syndrome
Recessive mutations of the SLC26A4 (PDS) gene on chromosome 7q31 can cause sensorineural deafness with goiter (Pendred syndrome, OMIM 274600) or nonsyndromic recessive deafness (NSRD) without goiter (at the DFNB4 locus, OMIM 600791). H723R (2168A>G) is the most commonly reported SLC26A4 mutation in Koreans and Japanese and is known to be a founder mutation. We recently encountered a patient with enlarged vestibular aqueduct syndrome. Genetic analysis showed a homozygous H723R mutation in the proband and heterozygous H723R mutations in his family members. Identification of a disease-causing mutation can be used to establish a genotypic diagnosis and provides important information to both families and their physicians.
Privacy-Preserving Federated Model Predicting Bipolar Transition in Patients With Depression: Prediction Model Development Study
BACKGROUND: Mood disorders have emerged as a serious public health concern; in particular, bipolar disorder has a less favorable prognosis than depression. Although prompt recognition of the conversion of depression to bipolar disorder is needed, early prediction is challenging due to overlapping symptoms. Recently, there have been attempts to develop prediction models using federated learning, a method for training multi-institutional machine learning models without patient-level data sharing.
OBJECTIVE: This study aims to develop and validate a federated, differentially private, multi-institutional bipolar transition prediction model.
METHODS: This retrospective study enrolled patients diagnosed with a first depressive episode at 5 tertiary hospitals in South Korea. We developed models for predicting bipolar transition using data from 17,631 patients in 4 institutions and used data from 4541 patients from 1 institution for external validation. We created standardized pipelines to extract large-scale clinical features from the 4 institutions without any code modification. Moreover, we performed feature selection in a federated environment for computational efficiency and applied differential privacy to gradient updates. Finally, we compared the federated model and the 4 local models developed with each hospital's data on internal and external validation data sets.
RESULTS: In the internal data set, 279 out of 17,631 patients showed bipolar disorder transition; in the external data set, 39 out of 4541 patients did. The average performance of the federated model in the internal test (area under the curve [AUC] 0.726) and external validation (AUC 0.719) data sets was higher than that of the locally developed models (AUC 0.642-0.707 and AUC 0.642-0.699, respectively). In the federated model, classifications were driven by several predictors such as the Charlson index (low scores were associated with bipolar transition, possibly due to younger age), severe depression, anxiolytics, young age, and visit months (bipolar transition was associated with seasonality, especially during the spring and summer months).
CONCLUSIONS: We developed and validated a differentially private federated model using distributed multi-institutional psychiatric data with standardized pipelines in a real-world environment. The federated model performed better than models using local data only.
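The following is a deliberately simplified, hypothetical sketch of the federated idea described above: each simulated site computes a local gradient, clips it, and adds Gaussian noise before the server averages the updates. Real differentially private federated learning requires careful privacy accounting and secure aggregation, both omitted here.

```python
# Hypothetical sketch of federated logistic regression with clipped, noised
# per-site gradients (not the study's actual system).
import numpy as np

rng = np.random.default_rng(0)

def make_site(n=1000):
    """Simulate one hospital's local data: 10 features and a binary label."""
    X = rng.normal(size=(n, 10))
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(float)
    return X, y

def site_gradient(w, X, y, clip=1.0, noise_scale=0.1):
    """Per-site gradient of the logistic loss, clipped and noised before sharing."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    grad = X.T @ (p - y) / len(y)
    grad *= min(1.0, clip / (np.linalg.norm(grad) + 1e-12))  # clip L2 norm to <= clip
    return grad + rng.normal(0.0, noise_scale * clip, size=grad.shape)

sites = [make_site() for _ in range(4)]   # four simulated hospitals
w = np.zeros(10)
for _ in range(200):                      # federated rounds
    grads = [site_gradient(w, X, y) for X, y in sites]
    w -= 0.5 * np.mean(grads, axis=0)     # server averages the noisy updates
print("learned coefficients:", np.round(w, 3))
```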
Real-world treatment trajectories of adults with newly diagnosed asthma or COPD
Background: There is a lack of knowledge on how patients with asthma or chronic obstructive pulmonary disease (COPD) are treated in the real world globally, especially with regard to the initial pharmacological treatment of newly diagnosed patients and the different treatment trajectories. This knowledge is important to monitor and improve clinical practice.
Methods: This retrospective cohort study aims to characterise treatments using data from four claims (drug dispensing) and four electronic health record (EHR; drug prescription) databases across six countries and three continents, encompassing 1.3 million patients with asthma or COPD. We analysed treatment trajectories at drug-class level from first diagnosis and visualised these in sunburst plots.
Results: In four countries (USA, UK, Spain and the Netherlands), most adults with asthma initiate treatment with short-acting β2-agonist monotherapy (20.8%-47.4% of first-line treatments). For COPD, the most frequent first-line treatment varies by country. The largest percentages of untreated patients (for asthma and COPD) were found in the claims databases from the USA (14.5%-33.2% for asthma and 27.0%-52.2% for COPD) compared with the EHR databases from European countries (6.9%-15.2% for asthma and 4.4%-17.5% for COPD). The treatment trajectories showed step-up as well as step-down in treatments.
Conclusion: Real-world data from claims and EHRs indicate that first-line treatments of asthma and COPD vary widely across countries. We found evidence of a stepwise approach in the pharmacological treatment of asthma and COPD, suggesting that treatments may be tailored to patients' needs.
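As a small, hypothetical example of how treatment trajectories can be visualised in a sunburst plot (the drug classes and patient counts below are invented, not the study's results), one might aggregate first- and second-line treatment pairs and plot them with plotly:

```python
# Hypothetical sunburst of first- and second-line treatment trajectories;
# classes and counts are illustrative only.
import pandas as pd
import plotly.express as px

trajectories = pd.DataFrame({
    "first_line":  ["SABA", "SABA", "ICS", "ICS+LABA", "untreated"],
    "second_line": ["ICS", "ICS+LABA", "ICS+LABA", "triple therapy", "SABA"],
    "patients":    [420, 180, 150, 90, 60],
})

fig = px.sunburst(trajectories, path=["first_line", "second_line"], values="patients")
fig.show()
```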