A Comparative Performance Analysis of Hybrid and Classical Machine Learning Method in Predicting Diabetes
Diabetes mellitus is one of medical science’s most important research topics because of the disease’s severe consequences. It is characterized by high blood glucose levels. Machine learning techniques make early detection of diabetes possible by predicting the disease accurately, which helps prevent its complications. Therefore, this study aims to find a machine learning approach that can more accurately predict diabetes. This study compares the performance of various classical machine learning models with the hybrid machine learning approach. The hybrid approach includes homogeneous models (Random Forest, AdaBoost, XGBoost, Extra Trees, and Gradient Booster) and a heterogeneous model that uses the stacking ensemble method. The stacking ensemble, or stacked generalization, approach is a meta-classifier in which multiple learners collaborate for prediction. The performance of the homogeneous hybrid models, Stacked Generalization, and classical machine learning methods such as Naive Bayes, Multilayer Perceptron, k-Nearest Neighbour, and support vector machine is compared. The experimental analysis using the Pima Indians and early-stage diabetes datasets demonstrates that the hybrid models achieve higher accuracy in diagnosing diabetes than the classical models. Among all the hybrid models, the heterogeneous model using the Stacked Generalization approach outperformed the others, achieving accuracies of 83.9% and 98.5%. DOI: 10.28991/ESJ-2023-07-01-08
An Advanced Conceptual Diagnostic Healthcare Framework for Diabetes and Cardiovascular Disorders
Data mining, along with emerging computing techniques, has strongly
influenced the healthcare industry. Researchers have used different data mining
and Internet of Things (IoT) techniques to develop automated solutions for
diabetes and heart patients. However, a more advanced and unified solution is
still needed that can offer a therapeutic opinion to individual diabetic and
cardiac patients. Therefore, a smart data mining and IoT (SMDIoT) based advanced
healthcare system for diabetes and cardiovascular diseases is proposed here.
The hybridization of data mining and IoT with other emerging
computing techniques is expected to give an effective and economical solution
to diabetic and cardiac patients. SMDIoT combines the ideas of data mining,
Internet of Things, chatbots, contextual entity search (CES), bio-sensors,
semantic analysis and granular computing (GC). The bio-sensors of the proposed
system help obtain the current and precise status of the concerned
patients so that, in case of an emergency, the necessary medical assistance can
be provided. The novelty lies in the hybrid framework and the adequate support
of chatbots, granular computing, contextual entity search and semantic analysis.
The practical implementation of this system is very challenging and costly.
However, it appears to be a more effective and economical solution for diabetic
and cardiac patients.
Processing of Electronic Health Records using Deep Learning: A review
The availability of large amounts of clinical data is opening up new research
avenues in a number of fields. An exciting field in this respect is healthcare,
where secondary use of healthcare data is beginning to revolutionize
healthcare. Besides the availability of Big Data, both medical data from
healthcare institutions (such as EMR data) and data generated from health and
wellbeing devices (such as personal trackers), a significant contribution to
this trend is also being made by recent advances in machine learning,
specifically deep learning algorithms.
Machine Learning Framework to Identify Individuals at Risk of Rapid Progression of Coronary Atherosclerosis: From the PARADIGM Registry.
Background: Rapid coronary plaque progression (RPP) is associated with incident cardiovascular events. To date, no method exists for the identification of individuals at risk of RPP at a single point in time. This study integrated coronary computed tomography angiography-determined qualitative and quantitative plaque features within a machine learning (ML) framework to determine its performance for predicting RPP. Methods and Results: Qualitative and quantitative coronary computed tomography angiography plaque characterization was performed in 1083 patients who underwent serial coronary computed tomography angiography from the PARADIGM (Progression of Atherosclerotic Plaque Determined by Computed Tomographic Angiography Imaging) registry. RPP was defined as an annual progression of percentage atheroma volume ≥1.0%. We employed the following ML models: model 1, clinical variables; model 2, model 1 plus qualitative plaque features; model 3, model 2 plus quantitative plaque features. ML models were compared with the atherosclerotic cardiovascular disease risk score, Duke coronary artery disease score, and a logistic regression statistical model. A total of 224 patients (21%) were identified as having RPP. Feature selection in ML identified quantitative computed tomography variables as the higher-ranking features, followed by qualitative computed tomography variables and clinical/laboratory variables. ML model 3 exhibited the highest discriminatory performance to identify individuals who would experience RPP when compared with the atherosclerotic cardiovascular disease risk score, the other ML models, and the statistical model (area under the receiver operating characteristic curve in ML model 3, 0.83 [95% CI 0.78-0.89], versus atherosclerotic cardiovascular disease risk score, 0.60 [0.52-0.67]; Duke coronary artery disease score, 0.74 [0.68-0.79]; ML model 1, 0.62 [0.55-0.69]; ML model 2, 0.73 [0.67-0.80]; all P<0.001; statistical model, 0.81 [0.75-0.87], P=0.128).
Conclusions: Based on an ML framework, quantitative atherosclerosis characterization has been shown to be the most important feature when compared with clinical, laboratory, and qualitative measures in identifying patients at risk of RPP.
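The abstract above labels patients by the RPP threshold and compares models by area under the ROC curve. The following is a minimal sketch of both steps, not the registry's actual pipeline. The function names, the example atheroma-volume changes, and the predicted risks are hypothetical; only the ≥1.0% threshold comes from the abstract. The AUC is computed as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as half.

```python
def is_rpp(annual_pav_change_percent, threshold=1.0):
    """Label rapid plaque progression: annual change in percentage
    atheroma volume at or above the threshold (1.0% in the abstract)."""
    return annual_pav_change_percent >= threshold


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case is
    scored higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: annual PAV change (%) and a model's predicted risk.
pav_changes = [0.2, 1.4, 0.9, 2.1, 1.0]
predicted_risk = [0.1, 0.8, 0.4, 0.7, 0.3]

labels = [is_rpp(c) for c in pav_changes]
auc = roc_auc(labels, predicted_risk)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of patients, which is acceptable for a sketch; a rank-based computation would be preferable for larger cohorts.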
Predicting diabetes-related hospitalizations based on electronic health records
OBJECTIVE: To derive a predictive model to identify patients likely to be hospitalized during the following year due to complications attributed to Type II diabetes. METHODS: A variety of supervised machine learning classification methods were tested, and a new method was developed that discovers hidden patient clusters in the positive class (hospitalized) while, at the same time, deriving sparse linear support vector machine classifiers to separate positive samples from negative ones (non-hospitalized). The convergence of the new method was established and theoretical guarantees were proved on how the classifiers it produces generalize to a test set not seen during training. RESULTS: The methods were tested on a large set of patients from the Boston Medical Center, the largest safety net hospital in New England. The new joint clustering/classification method achieves an accuracy of 89% (measured in terms of area under the ROC curve) and yields informative clusters that can help interpret the classification results, thus increasing the trust of physicians in the algorithmic output and providing some guidance towards preventive measures. While it is possible to increase accuracy to 92% with other methods, this comes with increased computational cost and a lack of interpretability. The analysis shows that even a modest probability of preventive actions being effective (more than 19%) suffices to generate significant hospital care savings. CONCLUSIONS: Predictive models are proposed that can help avert hospitalizations, improve health outcomes and drastically reduce hospital expenditures. The scope for savings is significant, as it has been estimated that in the USA alone about $5.8 billion is spent each year on diabetes-related hospitalizations that could be prevented.
Predicting Short-term and Long-term HbA1c Response after Insulin Initiation in Patients with Type 2 Diabetes Mellitus using Machine Learning
AIM: To assess the potential of supervised machine learning techniques to identify clinical variables for predicting short-term and long-term glycated hemoglobin (HbA1c) response after insulin treatment initiation in patients with type 2 diabetes mellitus (T2DM). MATERIALS AND METHODS: We included patients with T2DM from the Groningen Initiative to ANalyze Type 2 diabetes Treatment (GIANTT) database who started insulin treatment between 2007 and 2013 with a minimum follow-up of 2 years. Short-term and long-term response were defined at 6 (± 2) and 24 (± 2) months after insulin initiation, respectively. Patients were defined as good responders if they had a decrease in HbA1c ≥ 5 mmol/mol or reached the recommended level of HbA1c ≤ 53 mmol/mol. Twenty-four baseline clinical variables were used for the analysis, and the elastic net regularization technique was used for variable selection. The performance of three traditional machine learning algorithms in predicting short-term and long-term responses was compared, and the area under the receiver operating characteristic curve (AUC) was used to assess the performance of the prediction model. RESULTS: The elastic net regularization based generalized linear model, including baseline HbA1c and eGFR, correctly classified short-term and long-term HbA1c response after treatment initiation with an AUC (95% CI) = 0.80 (0.78 - 0.83) and 0.81 (0.79 - 0.84), respectively, and outperformed the other machine learning algorithms. Using baseline HbA1c alone, an AUC = 0.71 (0.65 - 0.73) and 0.72 (0.66 - 0.75) was obtained for predicting short-term and long-term response, respectively. CONCLUSIONS: The machine learning algorithm performed well in predicting an individual's short-term and long-term HbA1c response using baseline clinical variables.
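The responder definition and measurement windows stated in the abstract above can be written down directly. This is a minimal sketch under the thresholds given there (a decrease of at least 5 mmol/mol, or a follow-up value of at most 53 mmol/mol, measured at 6 ± 2 or 24 ± 2 months). The function names and example values are hypothetical and are not taken from the study's code.

```python
def is_good_responder(baseline_hba1c, followup_hba1c,
                      min_decrease=5.0, target=53.0):
    """Good response: HbA1c fell by at least `min_decrease` mmol/mol,
    or the follow-up value is at or below `target` mmol/mol."""
    decrease = baseline_hba1c - followup_hba1c
    return decrease >= min_decrease or followup_hba1c <= target


def response_window(months_since_initiation):
    """Map a measurement time to the abstract's windows:
    short-term is 6 +/- 2 months, long-term is 24 +/- 2 months."""
    if 4 <= months_since_initiation <= 8:
        return "short-term"
    if 22 <= months_since_initiation <= 26:
        return "long-term"
    return None  # measurement falls outside both windows


# Hypothetical patients: (baseline, follow-up) HbA1c in mmol/mol.
print(is_good_responder(70, 64))  # decrease of 6 meets the first criterion
print(is_good_responder(70, 68))  # decrease of 2 and still above 53
print(is_good_responder(55, 52))  # decrease of 3 but at or below 53
print(response_window(7), response_window(12), response_window(24))
```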
Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks
Predicting the future health information of patients from the historical
Electronic Health Records (EHR) is a core research task in the development of
personalized healthcare. Patient EHR data consist of sequences of visits over
time, where each visit contains multiple medical codes, including diagnosis,
medication, and procedure codes. The most important challenges for this task
are to model the temporality and high dimensionality of sequential EHR data and
to interpret the prediction results. Existing work solves this problem by
employing recurrent neural networks (RNNs) to model EHR data and utilizing
a simple attention mechanism to interpret the results. However, RNN-based
approaches suffer from the problem that the performance of RNNs drops when the
length of sequences is large, and the relationships between subsequent visits
are ignored by current RNN-based approaches. To address these issues, we
propose Dipole, an end-to-end, simple and robust model for predicting
patients' future health information. Dipole employs bidirectional recurrent
neural networks to remember all the information of both the past visits and the
future visits, and it introduces three attention mechanisms to measure the
relationships of different visits for the prediction. With the attention
mechanisms, Dipole can interpret the prediction results effectively. Dipole
also allows us to interpret the learned medical code representations which are
confirmed positively by medical experts. Experimental results on two real-world
EHR datasets show that the proposed Dipole can significantly improve the
prediction accuracy compared with the state-of-the-art diagnosis prediction
approaches and provide clinically meaningful interpretation.
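The abstract above describes attention weights that measure how much each visit contributes to a prediction. The abstract does not specify Dipole's three mechanisms, so the following is only a generic dot-product attention sketch over per-visit hidden states, not the paper's formulation. The names `softmax` and `attend`, and the toy vectors, are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, hidden_states):
    """Score each visit's hidden state against the query by dot product,
    normalize the scores into weights, and return the weights together
    with the weighted sum of hidden states (the context vector)."""
    scores = [sum(q * h for q, h in zip(query, state))
              for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * state[d] for w, state in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# Toy hidden states for three visits (assumed values, 2 dimensions each).
visits = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]  # e.g. the state of the visit being predicted from

weights, context = attend(query, visits)
print("weights:", [round(w, 3) for w in weights])
print("context:", [round(c, 3) for c in context])
```

The weights sum to one, so they can be read as the relative contribution of each past visit, which is the kind of interpretability the abstract refers to.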