
62 research outputs found

Feature Selection Method Based on Artificial Bee Colony Algorithm and Support Vector Machines for Medical Datasets Classification

Author: Mustafa Serter Uzer
Nihat Yilmaz
Onur Inan
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2013

This paper offers a hybrid approach that uses the artificial bee colony (ABC) algorithm for feature selection and support vector machines (SVM) for classification. Its purpose is to test how eliminating unimportant and obsolete features from the datasets affects classification success with the SVM classifier. The approach is developed for the diagnosis of liver diseases and diabetes, which are commonly observed and reduce quality of life. For the diagnosis of these diseases, the hepatitis, liver disorders, and diabetes datasets from the UCI database were used, and the proposed system reached classification accuracies of 94.92%, 74.81%, and 79.29%, respectively. These accuracies were obtained with 10-fold cross-validation. The results show that the method is highly successful compared to other reported results and seems very promising for pattern recognition applications.
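The 10-fold cross-validation mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the shuffling seed, and the `fit` and `predict` callbacks are all assumptions, and it expects at least `k` samples so that no fold is empty. In a wrapper-style feature selection, a score like this would typically be computed once per candidate feature subset.

```python
import random

def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def cross_validated_accuracy(X, y, fit, predict, k=10):
    """Average test accuracy over k folds.

    `fit(X_train, y_train)` returns a model and `predict(model, x)` returns
    a label; both are placeholders for the classifier under evaluation.
    """
    scores = []
    for train_idx, test_idx in k_fold_splits(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)
```

Every sample lands in exactly one test fold, so each contributes once to the averaged accuracy.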

Crossref

Directory of Open Access Journals

A Hybrid Classification System for Heart Disease Diagnosis Based on the RFRS Method

Author: Mo Zhang
Qian Wang
Qiang Su
Qiugen Wang
Xiao Liu
Xiaoli Wang
Yanhong Zhu
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2017

Heart disease is one of the most common diseases in the world. The objective of this study is to aid the diagnosis of heart disease using a hybrid classification system based on the ReliefF and Rough Set (RFRS) method. The proposed system contains two subsystems: the RFRS feature selection system and a classification system with an ensemble classifier. The first system includes three stages: (i) data discretization, (ii) feature extraction using the ReliefF algorithm, and (iii) feature reduction using the heuristic Rough Set reduction algorithm that we developed. In the second system, an ensemble classifier is proposed based on the C4.5 classifier. The Statlog (Heart) dataset, obtained from the UCI database, was used for the experiments. A maximum classification accuracy of 92.59% was achieved under a jackknife cross-validation scheme. The results demonstrate that the proposed system outperforms previously reported classification techniques.

Crossref

Directory of Open Access Journals

An Optimized Recursive General Regression Neural Network Oracle for the Prediction and Diagnosis of Diabetes

Author: Dana Bani-Hani
Pruthak Patel
Tasneem Alshaikh
Publication venue: Global Journals Inc. (US)
Publication date: 15/05/2019

Diabetes is a serious, chronic disease that has seen a rise in the number of cases and in prevalence over the past few decades. It can lead to serious complications and can increase the overall risk of dying prematurely. Data-oriented prediction models have become effective tools that support medical decision-making and diagnosis, and the use of machine learning in medicine has increased substantially. This research introduces the Recursive General Regression Neural Network Oracle (R-GRNN Oracle) and applies it to the Pima Indians Diabetes dataset for the prediction and diagnosis of diabetes. The R-GRNN Oracle (Bani-Hani, 2017) is an enhancement to the GRNN Oracle developed by Masters et al. in 1998, in which the recursive model is built from two oracles, one within the other. Several classifiers, along with the R-GRNN Oracle and the GRNN Oracle, are applied to the dataset: Support Vector Machine (SVM), Multilayer Perceptron (MLP), Probabilistic Neural Network (PNN), Gaussian Naïve Bayes (GNB), K-Nearest Neighbor (KNN), and Random Forest (RF). Genetic Algorithm (GA) was used for feature selection as well as for the hyperparameter optimization of SVM and MLP, and Grid Search (GS) was used to optimize the hyperparameters of KNN and RF. The performance metrics accuracy, AUC, sensitivity, and specificity were recorded for each classifier. The R-GRNN Oracle achieved the highest accuracy, AUC, and sensitivity (81.14%, 86.03%, and 63.80%, respectively), while the optimized MLP had the highest specificity (89.71%).
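Three of the metrics named in this abstract (accuracy, sensitivity, specificity) follow directly from the confusion-matrix counts. The sketch below uses the standard definitions; the function name and the convention that label `1` marks the positive (diabetic) class are assumptions made for illustration. AUC is left out because it needs ranked scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / len(pairs),
        # True positive rate: positives that were correctly detected.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # True negative rate: negatives that were correctly rejected.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }
```

A classifier can score well on specificity while missing many positives, which is why the abstract reports sensitivity and specificity separately from accuracy.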

Global Journal of Computer Science and Technology (GJCST)

Optimized Naïve Bayesian Algorithm for Efficient Performance

Author: Alhasan John
Georgina N. Obuandike
Isah Audu
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 27/03/2018

The naïve Bayesian algorithm is a data mining algorithm that describes relationships between data objects using a probabilistic method. Classification with the Bayesian algorithm is usually done by finding the class that has the highest probability value. Data mining is a popular research area that consists of algorithm development and pattern extraction from databases using different algorithms. Classification is one of the major tasks of data mining; it aims at building a model (classifier) that can be used to predict unknown class labels. There are many classification algorithms, such as decision tree classifiers, neural networks, rule induction, and naïve Bayesian. This paper focuses on the naïve Bayesian algorithm, a classical algorithm for classifying categorical data that converges easily to local optima. The Particle Swarm Optimization (PSO) algorithm has gained recognition in many fields of human endeavour and has been applied to enhance efficiency and accuracy in different problem domains. This paper proposes an optimized naïve Bayesian classifier that uses particle swarm optimization to overcome the problem of premature convergence and to improve the efficiency of the naïve Bayesian algorithm. The optimized naïve Bayesian classifier showed better classification performance than the traditional algorithm. Keywords: Data Mining, Classification, Particle Swarm Optimization, Naïve Bayesian
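The rule of picking the class with the highest probability can be illustrated with a plain categorical naïve Bayes classifier. This is a generic textbook sketch and does not include the paper's PSO optimization; the function names and the Laplace smoothing parameter `alpha` are assumptions added for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, class) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            feature_values[j].add(value)
    return class_counts, value_counts, feature_values

def predict_naive_bayes(model, row, alpha=1.0):
    """Return the class with the highest smoothed log-posterior."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for j, value in enumerate(row):
            numerator = value_counts[(j, label)][value] + alpha
            denominator = count + alpha * len(feature_values[j])
            score += math.log(numerator / denominator)  # log likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Working in log space avoids numerical underflow when many feature likelihoods are multiplied together.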

International Institute for Science, Technology and Education (IISTE): E-Journals

Computer–aided diagnosis of diabetes using least square support vector machine

Author
Publication venue: 'Science Publishing Corporation'
Publication date

Crossref

Artificial Neural Network Parameter Tuning Framework For Heart Disease Classification

Author: Abu Yazid Mohamad Haider
Azman Novi
Satria Haikal
Talib Shukor
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 18/09/2019

Heart disease is among the leading causes of death worldwide. Artificial neural networks have been applied as a decision support tool for heart disease detection. However, an artificial neural network requires many parameter settings to be explored in order to find the setting that produces the best performance. This paper proposes a parameter tuning framework for artificial neural networks. The Statlog and Cleveland heart disease datasets are used to evaluate the performance of the proposed framework. The results show that the proposed framework produces high classification accuracy: the overall classification accuracy is 90.9% for the Cleveland dataset and 90% for the Statlog dataset.
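The abstract does not say how the framework searches the parameter space. As a baseline for comparison, an exhaustive grid search over candidate settings can be sketched as below. This is a generic technique, not the paper's framework; the function name, the parameter names in the usage note, and the `score` callback (which would train a network and return its validation accuracy) are all hypothetical.

```python
from itertools import product

def grid_search(param_grid, score):
    """Return (best_params, best_score) over every combination in param_grid.

    `param_grid` maps a parameter name to a list of candidate values.
    `score(params)` is a placeholder that evaluates one setting and
    returns a number where higher is better.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score
```

A call such as `grid_search({"hidden_units": [4, 8, 16], "learning_rate": [0.01, 0.1]}, score)` would evaluate all six combinations. The cost grows multiplicatively with each added parameter, which is one motivation for more structured tuning approaches.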

Proceeding of the Electrical Engineering Computer Science and Informatics

Predicting Diabetes Mellitus With Machine Learning Techniques

Author: Dehui Yin
Hua Tang
Kaiyang Qu
Quan Zou
Yamei Luo
Ying Ju
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2018

Diabetes mellitus is a chronic disease characterized by hyperglycemia. It may cause many complications. Given the growing morbidity in recent years, the number of diabetic patients worldwide is projected to reach 642 million by 2040, which means that one in ten adults will have diabetes. There is no doubt that this alarming figure needs great attention. With its rapid development, machine learning has been applied to many aspects of medical health. In this study, we used decision tree, random forest, and neural network models to predict diabetes mellitus. The dataset is hospital physical examination data from Luzhou, China. It contains 14 attributes. Five-fold cross-validation was used to examine the models. To verify the universal applicability of the methods, we chose some of the better-performing methods to conduct independent test experiments. We randomly selected the data of 68994 healthy people and diabetic patients, respectively, as the training set. Because the data are imbalanced, we randomly extracted data five times, and the result is the average of these five experiments. We used principal component analysis (PCA) and minimum redundancy maximum relevance (mRMR) to reduce the dimensionality. The results showed that prediction with random forest reached the highest accuracy (ACC = 0.8084) when all the attributes were used.
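The practice of drawing random samples five times and averaging the results can be sketched as repeated balanced undersampling. The abstract does not specify the exact sampling procedure, so this is an assumption: each draw downsamples every class to the size of the smallest one. The function names and the `evaluate` callback (which would train and test a model on the chosen indices) are hypothetical.

```python
import random

def balanced_subsample(labels, rng):
    """Return indices with every class downsampled to the smallest class size."""
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    n = min(len(idx) for idx in by_class.values())
    chosen = []
    for idx in by_class.values():
        chosen.extend(rng.sample(idx, n))  # sample without replacement
    return chosen

def average_over_draws(labels, evaluate, draws=5, seed=0):
    """Average `evaluate(indices)` over several independent balanced draws."""
    rng = random.Random(seed)
    scores = [evaluate(balanced_subsample(labels, rng)) for _ in range(draws)]
    return sum(scores) / len(scores)
```

Averaging over several draws reduces the dependence of the reported score on which majority-class records happened to be sampled.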

Directory of Open Access Journals

Frontiers - Publisher Connector

DMP_MI: an effective diabetes mellitus classification algorithm on imbalanced data with missing values

Author: Cao Weijia
Cheng Yongqiang
Davis Darryl N.
Guo Jiawei
Ren Jiadong
Wang Qian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/07/2019

As a widely known chronic disease, diabetes mellitus is called a silent killer. It makes the body produce less insulin and causes increased blood sugar, which leads to many complications and affects the normal functioning of various organs, such as the eyes, kidneys, and nerves. Although diabetes has attracted considerable research attention, the overall performance of diabetes classification using machine learning is relatively low because of missing values and class imbalance in the data. In this paper, we propose an effective Prediction algorithm for Diabetes Mellitus classification on Imbalanced data with Missing values (DMP_MI). First, the missing values are compensated by the Naïve Bayes (NB) method for data normalization. Then, an adaptive synthetic sampling method (ADASYN) is adopted to reduce the influence of class imbalance on the prediction performance. Finally, a random forest (RF) classifier is used to generate predictions, which are evaluated using a comprehensive set of evaluation indicators. Experiments performed on the Pima Indians diabetes dataset from the University of California, Irvine (UCI) Repository have demonstrated the effectiveness and superiority of our proposed DMP_MI.

Repository@Hull - Worktribe