Comparing Performance of Data Mining Algorithms in Prediction Heart Diseases
Heart diseases are among the nation’s leading causes of mortality and morbidity. Data mining techniques can predict the likelihood that a patient will develop heart disease. This study applied and compared different data mining algorithms for predicting the risk of heart disease. After feature analysis, models based on five algorithms, namely decision tree (C5.0), neural network, support vector machine (SVM), logistic regression, and k-nearest neighbor (KNN), were developed and validated. The C5.0 decision tree built the model with the greatest accuracy, 93.02%; KNN, SVM, and the neural network reached 88.37%, 86.05%, and 80.23%, respectively. The results produced by the decision tree are easy to interpret and apply; its rules can be readily understood by different clinical practitioners
Using Combined Descriptive and Predictive Methods of Data Mining for Coronary Artery Disease Prediction: a Case Study Approach
Heart disease is one of the major causes of morbidity in the world. Currently, large proportions of healthcare data are not processed properly and thus fail to be used effectively for decision making. The risk of heart disease may be predicted by investigating heart disease risk factors with data mining techniques. This paper presents a model, developed using combined descriptive and predictive data mining techniques, that aims to help specialists in the healthcare system effectively predict patients with Coronary Artery Disease (CAD). To achieve this objective, clustering and classification techniques are used. First, the number of clusters is determined using clustering indexes. Next, several decision tree methods and an Artificial Neural Network (ANN) are applied to each cluster in order to predict CAD patients. Finally, the results show that the C&RT decision tree method performs best on all data used in this study, with an error of 0.074. All data used in this study are real and were collected from a heart clinic database
Minimum Relevant Features to Obtain Explainable Systems for Predicting Cardiovascular Disease Using the Statlog Data Set
Learning systems have been focused on creating models capable of obtaining the best results in error metrics. Recently, the focus has shifted to improvement in the interpretation and explanation of the results. The need for interpretation is greater when these models are used to support decision making. In some areas, such as medicine, this becomes an indispensable requirement. The goal of this study was to define a simple process to construct a system that could be easily interpreted based on two principles: (1) reducing the number of attributes without degrading the performance of the prediction systems and (2) selecting a technique to interpret the final prediction system. To describe this process, we selected a problem, predicting cardiovascular disease, by analyzing the well-known Statlog (Heart) data set from the University of California, Irvine (UCI) Machine Learning Repository. We analyzed the cost of making predictions easier to interpret by reducing the number of features that explain the classification of health status versus the cost in accuracy. We performed an analysis on a large set of classification techniques and performance metrics, demonstrating that it is possible to construct explainable and reliable models that provide high quality predictive performance
The methods of duo output neural network ensemble for prediction of coronary heart disease
The occurrence of coronary heart disease (CHD) is still hard to predict, but assessing CHD risk over the next ten years is possible. The prediction of coronary heart disease can be modelled using a multi-layer perceptron neural network (MLP-ANN). A prediction model with MLP-ANN has either a positive or a negative CHD output, which makes it a binary classification. A binary classification model requires a threshold value to be determined before classification, which increases the uncertainty of the classification process. Another weakness of the MLP-ANN model is overfitting. This study proposes a prediction model for coronary heart disease using the duo output artificial neural network ensemble (DOANNE) method to overcome the problems of overfitting and classification uncertainty in MLP-ANN. The research was divided into several stages: data acquisition, pre-processing, modelling into DOANNE, neural network ensemble training with the Levenberg-Marquardt (LM) algorithm, system performance testing, and evaluation. The results showed that the DOANNE-LM method provided a significant improvement over the MLP-ANN method, as indicated by statistical tests with p-value <0.05
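The threshold issue raised in the abstract above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the scores are invented, and the two-output decision rule (pick the larger output) is one common way to avoid an explicit threshold, not a description of the DOANNE method itself.

```python
def classify_with_threshold(scores, threshold):
    """Label each single-output score 1 (positive) if it reaches the threshold, else 0."""
    return [1 if s >= threshold else 0 for s in scores]


def classify_with_two_outputs(output_pairs):
    """Label each (negative_score, positive_score) pair by its larger output.

    Assumed decision rule for illustration only; no threshold is needed.
    """
    return [1 if pos > neg else 0 for neg, pos in output_pairs]


# Hypothetical single-output scores for five patients (not study data).
scores = [0.15, 0.45, 0.55, 0.62, 0.90]
labels_at_04 = classify_with_threshold(scores, 0.4)
labels_at_06 = classify_with_threshold(scores, 0.6)

# Hypothetical two-output scores for five patients (not study data).
pairs = [(0.8, 0.1), (0.5, 0.4), (0.3, 0.6), (0.2, 0.7), (0.1, 0.9)]
labels_two_outputs = classify_with_two_outputs(pairs)

print(labels_at_04)        # labels with threshold 0.4
print(labels_at_06)        # labels with threshold 0.6
print(labels_two_outputs)  # labels from the two-output rule
```

With a single output, moving the threshold from 0.4 to 0.6 changes the label of two of the five hypothetical patients, which is the kind of uncertainty the abstract refers to.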
The Analysis of Performance Model Tiered Artificial Neural Network for Assessment of Coronary Heart Disease
Assessment models for coronary heart disease have developed considerably alongside information technology, particularly in the field of artificial intelligence. Unfortunately, most of these models do not follow the tiered approach used by clinicians. This study aims to analyze the performance of a tiered assessment model. At each level, the method comprises preprocessing, building the artificial neural network (ANN) architecture, training with the Levenberg-Marquardt and one-step secant algorithms, and testing the system. The study is divided according to the stages of the examination procedure. The test results showed the influence of each level: cases were tested again at the next level whether the output of the previous level was positive or negative. The performance evaluation indicates that the higher level improves on and/or reinforces the performance of the previous level.
A Predictive Model for Assessment of Successful Outcome in Posterior Spinal Fusion Surgery
Background: Low back pain is a common problem. Neurosurgeons recommend posterior spinal fusion (PSF) surgery as one of the therapeutic strategies for patients with low back pain. Because of the high risk of this type of surgery and the critical importance of making the right decision, accurate prediction of the surgical outcome is one of the main concerns of neurosurgeons. Methods: In this study, 12 types of multi-layer perceptron (MLP) networks and 66 radial basis function (RBF) networks, as artificial neural network methods, and a logistic regression (LR) model were created and compared to predict satisfaction with PSF surgery, one of the most well-known spinal surgeries. Results: The most important clinical and radiologic features, twenty-seven factors for 480 patients (150 males, 330 females; mean age 52.32 ± 8.39 years), were used as the model inputs. They included age, sex, type of disorder, duration of symptoms, job, walking distance without pain (WDP), walking distance without sensory (WDS) disorders, visual analog scale (VAS) scores, Japanese Orthopaedic Association (JOA) score, diabetes, smoking, knee pain (KP), pelvic pain (PP), osteoporosis, and spinal deformity, among others. Indexes such as the receiver operating characteristic area under the curve (ROC-AUC), positive predictive value, negative predictive value, and accuracy were calculated to determine the best model. Postsurgical satisfaction was 77.5% at 6 months follow-up. The patients were divided into training, testing, and validation data sets. Conclusion: The findings showed that the MLP model performed better than the RBF and LR models for prediction of PSF surgery outcome. Keywords: Posterior spinal fusion surgery (PSF); Prediction; Surgical satisfaction; Multi-layer perceptron (MLP); Logistic regression (LR)
Available from: https://www.researchgate.net/publication/325679954_A_Predictive_Model_for_Assessment_of_Successful_Outcome_in_Posterior_Spinal_Fusion_Surgery [accessed Jul 11 2019]. Peer reviewed
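The ROC-AUC index named in the abstract above can be computed directly from its pairwise definition. The following is a minimal Python sketch under stated assumptions: the function name `roc_auc`, the labels, and the scores are all hypothetical and do not come from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A pair counts 1 when the positive case scores higher than the negative
    case and 0.5 when the two scores are tied.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = satisfied) and model scores, for illustration only.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.6]
auc = roc_auc(labels, scores)
print(auc)
```

The quadratic pairwise loop is chosen for clarity; for large data sets a rank-based computation gives the same value more efficiently.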
Enhancing coronary artery diseases screening: A comprehensive assessment of machine learning approaches using routine clinical and laboratory data
Introduction: Coronary artery disease (CAD) stands among the leading global causes of mortality, underscoring the critical necessity for early detection to facilitate effective treatment. Although Coronary Angiography (CA) serves as the gold standard for diagnosis, its limitations for screening, including side effects and cost, necessitate alternative approaches. This study focuses on the development and comparison of machine learning techniques as substitutes for CA in CAD screening, leveraging routine clinical and laboratory data. Material and Methods: Various machine learning classification algorithms (decision tree, k-nearest neighbor, artificial neural network, support vector machine, logistic regression, and stacked ensemble learning) were employed to differentiate CAD and healthy subjects. Feature selection algorithms, namely LASSO and ReliefF, were utilized to prioritize relevant features. A range of evaluation metrics, including accuracy, precision, sensitivity, specificity, AUC, F1 score, ROC curve, and NPV, were applied. The SHAP technique was employed to elucidate and interpret the artificial neural network model. Results: The artificial neural network, support vector machine, and stacked ensemble learning models demonstrated excellent results in a 10-fold cross-validation evaluation using features selected by LASSO and ReliefF. With the LASSO feature selection algorithm, these models achieved accuracies of 90.38%, 90.07%, and 90.39%, sensitivities of 94.43%, 93.03%, and 93.96%, and specificities of 80.27%, 82.77%, and 81.52%, respectively. Using ReliefF, the accuracies were 88.79%, 88.77%, and 90.06%, sensitivities were 92.12%, 91.66%, and 93.98%, and specificities were 80.13%, 81.38%, and 80.13%, respectively. The SHAP technique revealed that typical and atypical chest pain, hypertension, diabetes mellitus, T inversion, and age were the most influential features in the neural network model.
Conclusion: The machine learning models developed in this study exhibit high potential for non-invasive screening and diagnosis of CAD in the Z-Alizadeh Sani dataset. However, further studies are essential to validate and apply these models in real-world and clinical settings.
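Most of the evaluation metrics listed in the abstract above follow from the four cells of a binary confusion matrix. The sketch below shows those standard definitions in Python; the function name `screening_metrics` and the counts are hypothetical, chosen only for illustration, and are not results from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. the data contain both
    classes and the model predicts both classes at least once.
    """
    sensitivity = tp / (tp + fn)   # share of diseased subjects detected
    specificity = tn / (tn + fp)   # share of healthy subjects cleared
    precision = tp / (tp + fp)     # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical counts for 150 screened subjects (not study data).
metrics = screening_metrics(tp=90, fp=10, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters for screening because a data set with many more CAD than healthy subjects can yield high accuracy even when specificity is modest.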
Analysis for warning factors of type 2 diabetes mellitus complications with Markov blanket based on a Bayesian network model
Background and objective
Type 2 diabetes mellitus (T2DM) complications seriously affect quality of life and cannot be cured completely, so action should be taken for prevention and self-management. Analysis of warning factors is beneficial for patients and has been the focus of some previous studies. Those studies generally used professional medical test factors or the complete set of factors for prediction and prevention, which is inconvenient and impractical for patients managing their own condition. With this in mind, this study built a Bayesian network (BN) model, from the perspective of diabetic patients’ self-management and prevention, to predict six complications of T2DM using selected warning factors that patients can obtain from medical examinations. Furthermore, the model was analyzed to explore the relationships between physiological variables and T2DM complications, as well as among the complications themselves. The model aims to help patients with T2DM self-manage and prevent complications.
Methods
The dataset was collected from a well-known data center called the National Health Clinical Center between 1st January 2009 and 31st December 2009. After the data were preprocessed and imputed, a BN model incorporating expert knowledge was built with Bootstrap and the Tabu search algorithm. The Markov Blanket (MB) was used to select the warning factors and predict T2DM complications. Moreover, a Bayesian network without prior information (BN-wopi) model, with both structure and parameters learned using 10-fold cross-validation, was added to allow a fair comparison with other classifiers learned using 10-fold cross-validation. The warning factors were selected according to the structure learned in each fold and were used for prediction. Finally, the performance of the two BN models using warning factors was compared with a Naïve Bayes model, a Random Forest model, and a C5.0 Decision Tree model, all of which used all features for prediction. In addition, the validation parameters of the proposed model were compared with those of existing studies that used other variables from clinical or biomedical data to predict T2DM complications.
Results
Experimental results indicated that the BN models using warning factors performed statistically better than their counterparts using all other variables in predicting T2DM complications. In addition, with the selected warning factors, the proposed BN model was effective and significant in predicting diabetic nephropathy (DN) (AUC: 0.831), diabetic foot (DF) (AUC: 0.905), diabetic macrovascular complications (DMV) (AUC: 0.753), and diabetic ketoacidosis (DK) (AUC: 0.877) compared with other experiments.
Conclusions
The warning factors of DN, DF, DMV, and DK selected by MB in this research might be able to help predict certain T2DM complications effectively, and the proposed BN model might be used as a general tool for prevention, monitoring, and self-management
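The Markov blanket that the abstract above uses to select warning factors has a simple graph definition: for a target node in a Bayesian network, it consists of the node's parents, its children, and the other parents of those children. The Python sketch below applies that definition to a toy network. The variable names and edges are hypothetical and are not the structure learned in the study.

```python
def markov_blanket(parents, target):
    """Return the Markov blanket of `target` in a directed acyclic graph.

    `parents` maps each node to the set of its parent nodes. The blanket is
    the union of the target's parents, its children, and the children's
    other parents.
    """
    blanket = set(parents.get(target, set()))
    for node, node_parents in parents.items():
        if target in node_parents:        # `node` is a child of the target
            blanket.add(node)
            blanket.update(node_parents)  # co-parents of that child
    blanket.discard(target)               # the target is not in its own blanket
    return blanket


# Hypothetical toy network, for illustration only.
parents = {
    "age": set(),
    "bmi": set(),
    "hba1c": {"bmi"},
    "blood_pressure": {"age"},
    "nephropathy": {"hba1c", "age"},
    "edema": {"nephropathy", "blood_pressure"},
}

blanket = markov_blanket(parents, "nephropathy")
print(sorted(blanket))
```

In this toy graph, `bmi` falls outside the blanket of `nephropathy` because it influences the target only through `hba1c`. Given the blanket variables, the target is conditionally independent of every other node, which is why the blanket serves as a feature selector.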