11,183 research outputs found

A Comparative Performance Analysis of Hybrid and Classical Machine Learning Method in Predicting Diabetes

Author: Anbananthen Kalaiarasi Sonai Muthu
Busst Mikail Bin Muhammad Azman
Kannan Rajkumar
Kannan Subarmaniam
Publication venue: 'Ital Publication'
Publication date: 12/10/2022

Diabetes mellitus is one of medical science’s most important research topics because of the disease’s severe consequences. It is characterized by high blood glucose levels. Machine learning techniques make early detection of diabetes possible, helping to predict the disease accurately and prevent its complications. Therefore, this study aims to find a machine learning approach that can more accurately predict diabetes. This study compares the performance of various classical machine learning models with the hybrid machine learning approach. The hybrid approach includes homogeneous models (Random Forest, AdaBoost, XGBoost, Extra Trees, and Gradient Boosting) and a heterogeneous model that uses stacking ensemble methods. The stacking ensemble, or stacked generalization, approach is a meta-classifier in which multiple learners collaborate for prediction. The performance of the homogeneous hybrid models, Stacked Generalization, and classical machine learning methods such as Naive Bayes, Multilayer Perceptron, k-Nearest Neighbour, and Support Vector Machine is compared. The experimental analysis using the Pima Indians and early-stage diabetes datasets demonstrates that the hybrid models achieve higher accuracy in diagnosing diabetes than the classical models. Among all the hybrid models, the heterogeneous model using the Stacked Generalization approach outperformed the others, achieving 83.9% and 98.5%. DOI: 10.28991/ESJ-2023-07-01-08
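
The stacked generalization idea described in this abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: the two base learners (a decision stump and a nearest-centroid rule), the logistic-regression meta-learner, the 5-fold scheme, and the synthetic two-feature data are all hypothetical stand-ins chosen so the example runs with the Python standard library alone. The paper's own base models and datasets are not reproduced here.

```python
import math
import random

def fit_stump(X, y, feature=0):
    """Base learner 1 (assumed): best single-threshold rule on one feature."""
    best = None
    for t in sorted({row[feature] for row in X}):
        for sign in (1, -1):
            hits = sum((sign * (row[feature] - t) >= 0) == bool(label)
                       for row, label in zip(X, y))
            if best is None or hits > best[0]:
                best = (hits, t, sign)
    _, t, sign = best
    return lambda row: 1.0 if sign * (row[feature] - t) >= 0 else 0.0

def fit_centroid(X, y):
    """Base learner 2 (assumed): predict the class whose mean point is nearest."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    c0 = mean([r for r, label in zip(X, y) if label == 0])
    c1 = mean([r for r, label in zip(X, y) if label == 1])
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return lambda row: 1.0 if dist2(row, c1) < dist2(row, c0) else 0.0

def fit_logistic(Z, y, epochs=500, lr=0.5):
    """Meta-learner (assumed): logistic regression via batch gradient descent."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for z, label in zip(Z, y):
            logit = sum(wi * zi for wi, zi in zip(w, z)) + b
            p = 1.0 / (1.0 + math.exp(-logit))
            for j, zj in enumerate(z):
                grad_w[j] += (p - label) * zj
            grad_b += p - label
        w = [wi - lr * g / len(y) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(y)
    return lambda z: 1 if sum(wi * zi for wi, zi in zip(w, z)) + b >= 0 else 0

def fit_stacking(X, y, base_fitters, k=5):
    """Train the meta-learner on out-of-fold predictions of the base learners."""
    n = len(X)
    Z = [[0.0] * len(base_fitters) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for j, fit in enumerate(base_fitters):
            model = fit(X_tr, y_tr)
            for i in held_out:
                Z[i][j] = model(X[i])  # prediction on data the model never saw
    meta = fit_logistic(Z, y)
    bases = [fit(X, y) for fit in base_fitters]  # refit on all data for inference
    return lambda row: meta([base(row) for base in bases])

def make_data(n, rng):
    """Synthetic toy data (not a diabetes dataset): label is 1 when x0 + x1 > 1."""
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [1 if x0 + x1 > 1.0 else 0 for x0, x1 in X]
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(200, rng)
X_test, y_test = make_data(200, rng)
predict = fit_stacking(X_train, y_train, [fit_stump, fit_centroid])
accuracy = sum(predict(r) == t for r, t in zip(X_test, y_test)) / len(y_test)
print(f"stacked accuracy on toy data: {accuracy:.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for rows the base learners were fitted on, is the usual way to keep the meta-learner from simply trusting whichever base learner overfits most.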

Emerging Science Journal (ESJ)

SHDL@MMU Digital Repository

An Advanced Conceptual Diagnostic Healthcare Framework for Diabetes and Cardiovascular Disorders

Author: Sharma M.
Singh G.
Singh R.
Publication venue: 'European Alliance for Innovation n.o.'
Publication date: 01/06/2018

Data mining, along with emerging computing techniques, has greatly influenced the healthcare industry. Researchers have used various data mining and Internet of Things (IoT) techniques to build automated solutions for diabetes and heart patients. However, a more advanced and unified solution is still needed that can offer a therapeutic opinion to individual diabetic and cardiac patients. Therefore, a smart data mining and IoT (SMDIoT) based advanced healthcare system for diabetes and cardiovascular diseases is proposed here. The hybridization of data mining and IoT with other emerging computing techniques is expected to give an effective and economical solution for diabetes and cardiac patients. SMDIoT combines the ideas of data mining, Internet of Things, chatbots, contextual entity search (CES), bio-sensors, semantic analysis and granular computing (GC). The bio-sensors of the proposed system help obtain the current and precise status of the patients concerned so that, in case of an emergency, the necessary medical assistance can be provided. The novelty lies in the hybrid framework and the support of chatbots, granular computing, contextual entity search and semantic analysis. The practical implementation of this system is very challenging and costly. However, it appears to be a more effective and economical solution for diabetes and cardiac patients. Comment: 11 pages

arXiv.org e-Print Archive

Directory of Open Access Journals

Processing of Electronic Health Records using Deep Learning: A review

Author: Danieletto Matteo
Dudley Joel
Glicksberg Benjamin
Li Li
Mayora Oscar
Osmani Venet
Publication date: 05/04/2018

The availability of large amounts of clinical data is opening up new research avenues in a number of fields. An exciting field in this respect is healthcare, where secondary use of healthcare data is beginning to revolutionize the field. Besides the availability of Big Data, both medical data from healthcare institutions (such as EMR data) and data generated from health and wellbeing devices (such as personal trackers), a significant contribution to this trend is also being made by recent advances in machine learning, specifically deep learning algorithms.

arXiv.org e-Print Archive

Archivio della ricerca - Fondazione Bruno Kessler

Machine Learning Framework to Identify Individuals at Risk of Rapid Progression of Coronary Atherosclerosis: From the PARADIGM Registry.

Author: Al'Aref Subhi J
Andreini Daniele
Baskaran Lohendran
Bax Jeroen J
Berman Daniel S
Budoff Matthew J
Cademartiri Filippo
Chang Hyuk-Jae
Chinnaiyan Kavitha
Choi Jung Hyun
Chun Eun Ju
Conte Edoardo
de Araújo Gonçalves Pedro
Gottlieb Ilan
Gransar Heidi
Hadamitzky Martin
Han Donghee
Kim Yong-Jin
Kolli Kranthi K
Lee Byoung Kwon
Lee Sang-Eun
Leipsic Jonathon A
Lin Fay Y
Maffei Erica
Marques Hugo
Min James K
Narula Jagat
Pontone Gianluca
Raff Gilbert L
Samady Habib
Shaw Leslee J
Shin Sangshoon
Stone Peter
Sung Ji Min
van Rosendael Alexander R
Virmani Renu
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Background: Rapid coronary plaque progression (RPP) is associated with incident cardiovascular events. To date, no method exists for the identification of individuals at risk of RPP at a single point in time. This study integrated coronary computed tomography angiography-determined qualitative and quantitative plaque features within a machine learning (ML) framework to determine its performance for predicting RPP. Methods and Results: Qualitative and quantitative coronary computed tomography angiography plaque characterization was performed in 1083 patients who underwent serial coronary computed tomography angiography from the PARADIGM (Progression of Atherosclerotic Plaque Determined by Computed Tomographic Angiography Imaging) registry. RPP was defined as an annual progression of percentage atheroma volume ≥1.0%. We employed the following ML models: model 1, clinical variables; model 2, model 1 plus qualitative plaque features; model 3, model 2 plus quantitative plaque features. ML models were compared with the atherosclerotic cardiovascular disease risk score, Duke coronary artery disease score, and a logistic regression statistical model. 224 patients (21%) were identified as RPP. Feature selection in ML identified quantitative computed tomography variables as the higher-ranking features, followed by qualitative computed tomography variables and clinical/laboratory variables. ML model 3 exhibited the highest discriminatory performance to identify individuals who would experience RPP when compared with atherosclerotic cardiovascular disease risk score, the other ML models, and the statistical model (area under the receiver operating characteristic curve in ML model 3, 0.83 [95% CI 0.78-0.89], versus atherosclerotic cardiovascular disease risk score, 0.60 [0.52-0.67]; Duke coronary artery disease score, 0.74 [0.68-0.79]; ML model 1, 0.62 [0.55-0.69]; ML model 2, 0.73 [0.67-0.80]; all P<0.001; statistical model, 0.81 [0.75-0.87], P=0.128).
Conclusions: Based on an ML framework, quantitative atherosclerosis characterization has been shown to be the most important feature when compared with clinical, laboratory, and qualitative measures in identifying patients at risk of RPP.
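
Two quantities in this abstract lend themselves to a short worked sketch: the RPP label and the area under the ROC curve used to compare models. The code below is illustrative only. The function names are hypothetical, the annualized change is assumed to be the difference in percentage atheroma volume divided by the scan interval in years (the abstract does not spell out the formula), and the labels and scores in the AUC example are made-up toy values, not registry data.

```python
def is_rpp(pav_baseline, pav_followup, years_between_scans):
    """Label rapid plaque progression: annualized PAV change of at least 1.0.

    Assumption: annual progression = (follow-up PAV - baseline PAV) / years,
    with PAV expressed in percent.
    """
    annual_change = (pav_followup - pav_baseline) / years_between_scans
    return annual_change >= 1.0

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))

# Share of RPP patients reported in the abstract: 224 of 1083.
rpp_share_percent = round(224 / 1083 * 100)

# Toy labels and scores, for illustration only.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.2, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
print(rpp_share_percent, round(toy_auc, 3))
```

Under this pairwise reading, an AUC of 0.83 means that a randomly chosen RPP patient receives a higher model score than a randomly chosen non-RPP patient about 83% of the time.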

eScholarship - University of California

Predicting diabetes-related hospitalizations based on electronic health records

Author: Brisimi Theodora S.
Dai Wuyang
Paschalidis Ioannis Ch.
Wang Taiyao
Xu Tingting
Publication venue: 'SAGE Publications'
Publication date: 01/12/2019

OBJECTIVE: To derive a predictive model to identify patients likely to be hospitalized during the following year due to complications attributed to Type II diabetes. METHODS: A variety of supervised machine learning classification methods were tested, and a new method was developed that discovers hidden patient clusters in the positive class (hospitalized) while, at the same time, deriving sparse linear support vector machine classifiers to separate positive samples from the negative ones (non-hospitalized). The convergence of the new method was established and theoretical guarantees were proved on how the classifiers it produces generalize to a test set not seen during training. RESULTS: The methods were tested on a large set of patients from the Boston Medical Center, the largest safety net hospital in New England. It is found that our new joint clustering/classification method achieves an accuracy of 89% (measured in terms of area under the ROC curve) and yields informative clusters which can help interpret the classification results, thus increasing the trust of physicians in the algorithmic output and providing some guidance towards preventive measures. While it is possible to increase accuracy to 92% with other methods, this comes with increased computational cost and lack of interpretability. The analysis shows that even a modest probability of preventive actions being effective (more than 19%) suffices to generate significant hospital care savings. CONCLUSIONS: Predictive models are proposed that can help avert hospitalizations, improve health outcomes and drastically reduce hospital expenditures. The scope for savings is significant, as it has been estimated that in the USA alone about $5.8 billion are spent each year on diabetes-related hospitalizations that could be prevented.

Boston University Institutional Repository (OpenBU)

Predicting Short-term and Long-term HbA1c Response after Insulin Initiation in Patients with Type 2 Diabetes Mellitus using Machine Learning

Author: Denig Petra
Nagaraj Sunil B
Sidorenkov Grigory
van Boven Job F M
Publication venue: 'Wiley'
Publication date: 01/01/2019

AIM: To assess the potential of supervised machine learning techniques to identify clinical variables for predicting short-term and long-term glycated hemoglobin (HbA1c) response after insulin treatment initiation in patients with type 2 diabetes mellitus (T2DM). MATERIALS AND METHODS: We included patients with T2DM from the Groningen Initiative to ANalyze Type 2 diabetes Treatment (GIANTT) database who started insulin treatment between 2007 and 2013 with a minimum follow-up of 2 years. Short-term and long-term response were defined at 6 (± 2) and 24 (± 2) months after insulin initiation, respectively. Patients were defined as good responders if they had a decrease in HbA1c ≥ 5 mmol/mol or reached the recommended level of HbA1c ≤ 53 mmol/mol. Twenty-four baseline clinical variables were used for the analysis, and the elastic net regularization technique was used for variable selection. The performance of three traditional machine learning algorithms was compared for predicting short-term and long-term responses, and the area under the receiver operating characteristic curve (AUC) was used to assess the performance of the prediction model. RESULTS: The elastic net regularization based generalized linear model, including baseline HbA1c and eGFR, correctly classified short-term and long-term HbA1c response after treatment initiation with an AUC (95% CI) = 0.80 (0.78 - 0.83) and 0.81 (0.79 - 0.84), respectively, and outperformed the other machine learning algorithms. Using baseline HbA1c alone, an AUC = 0.71 (0.65 - 0.73) and 0.72 (0.66 - 0.75) was obtained for predicting short-term and long-term response, respectively. CONCLUSIONS: The machine learning algorithm performed well in predicting an individual's short-term and long-term HbA1c response using baseline clinical variables.
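
The responder definition and follow-up windows stated in this abstract translate directly into a small labelling rule. The sketch below is only an illustration of that rule: the function and parameter names are hypothetical, HbA1c values are assumed to be in mmol/mol, and the window boundaries are assumed to be inclusive, which the abstract does not specify.

```python
def is_good_responder(baseline_hba1c, followup_hba1c):
    """Good responder: HbA1c fell by at least 5 mmol/mol, or reached 53 or lower."""
    decrease = baseline_hba1c - followup_hba1c
    return decrease >= 5.0 or followup_hba1c <= 53.0

def response_window(months_since_insulin_start):
    """Map a measurement time to the study's windows: 6 (± 2) or 24 (± 2) months.

    Assumption: boundaries are inclusive; measurements outside both windows
    return None.
    """
    if 4 <= months_since_insulin_start <= 8:
        return "short-term"
    if 22 <= months_since_insulin_start <= 26:
        return "long-term"
    return None

# Hypothetical patients, for illustration only.
print(is_good_responder(70.0, 64.0))  # decrease of 6 mmol/mol
print(is_good_responder(55.0, 53.0))  # reached the recommended level
print(is_good_responder(70.0, 68.0))  # neither criterion met
print(response_window(7), response_window(12), response_window(24))
```

Because the two criteria are joined by "or", a patient who starts close to the target can count as a good responder with a small absolute decrease, which is worth keeping in mind when reading the reported AUC values.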

Proceedings - University of Groningen

Crossref

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks

Author: Ba Jimmy
Cheng Yu
Choi Edward
Lipton Zachary C
Suo Qiuling
Wang Xiang
Xu Kelvin
Zeiler Matthew D
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 18/06/2017

Predicting the future health information of patients from their historical Electronic Health Records (EHR) is a core research task in the development of personalized healthcare. Patient EHR data consist of sequences of visits over time, where each visit contains multiple medical codes, including diagnosis, medication, and procedure codes. The most important challenges for this task are to model the temporality and high dimensionality of sequential EHR data and to interpret the prediction results. Existing work solves this problem by employing recurrent neural networks (RNNs) to model EHR data and utilizing a simple attention mechanism to interpret the results. However, the performance of RNNs drops when sequences are long, and current RNN-based approaches ignore the relationships between subsequent visits. To address these issues, we propose Dipole, an end-to-end, simple and robust model for predicting patients' future health information. Dipole employs bidirectional recurrent neural networks to remember all the information of both the past visits and the future visits, and it introduces three attention mechanisms to measure the relationships of different visits for the prediction. With the attention mechanisms, Dipole can interpret the prediction results effectively. Dipole also allows us to interpret the learned medical code representations, which are confirmed positively by medical experts. Experimental results on two real-world EHR datasets show that the proposed Dipole can significantly improve the prediction accuracy compared with the state-of-the-art diagnosis prediction approaches and provide clinically meaningful interpretation.
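
The role attention plays in a model of this kind can be illustrated with a generic weighted sum over per-visit hidden states. This is a hedged sketch, not Dipole itself: the abstract does not describe its three attention mechanisms, so a plain dot-product score is assumed here, and the two-dimensional vectors stand in for the hidden states a bidirectional RNN would produce for each visit.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(past_states, current_state):
    """Weight past visits by similarity to the current visit (assumed dot product).

    Returns the attention weights and the context vector, i.e. the weighted
    average of the past hidden states.
    """
    scores = [sum(h * c for h, c in zip(state, current_state))
              for state in past_states]
    weights = softmax(scores)
    context = [sum(w * state[d] for w, state in zip(weights, past_states))
               for d in range(len(current_state))]
    return weights, context

# Toy hidden states for three past visits and the current visit (made up).
past_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
current_state = [0.0, 2.0]
weights, context = attend(past_states, current_state)
most_relevant_visit = weights.index(max(weights))
print([round(w, 3) for w in weights], most_relevant_visit)
```

The weights are what make such a model inspectable: a large weight on a particular past visit indicates that the visit contributed most to the context vector used for the prediction.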

arXiv.org e-Print Archive

Crossref