30 research outputs found

    Diabetes diagnostic prediction using vector support machines

    The most important factors for diagnosing diabetes mellitus (DM) are age, body mass index (BMI) and blood glucose concentration. Diagnosis of DM by a doctor is complicated because several factors are involved in the disease, and the diagnosis is subject to human error. A blood test alone does not provide enough information to make a correct diagnosis. A support vector machine (SVM) was implemented to predict the diagnosis of DM from these factors. The output variable has three classes: without diabetes, with a predisposition to diabetes, and with diabetes. The resulting SVM achieved an accuracy of 99.2% on Colombian patients and 65.6% on a data set of patients of a different ethnic background.

    Diabetes Prediction Using Artificial Neural Network

    Diabetes is one of the most common diseases worldwide, and no cure has yet been found. Caring for people with diabetes costs a great deal of money every year. It is therefore important that prediction be accurate and rely on a dependable method. One such approach is artificial intelligence, in particular Artificial Neural Networks (ANN). In this paper, we used an ANN to predict whether a person is diabetic or not. The training criterion was to minimize the network's error function. After training, the average error of the ANN model was 0.01 and the accuracy of predicting whether a person is diabetic was 87.3%.

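    Optimasi Parameter K pada Algoritma K-nearest Neighbour untuk Klasifikasi Penyakit Diabetes Mellitus [Optimizing the Parameter K of the K-Nearest Neighbour Algorithm for Diabetes Mellitus Classification]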

    Diabetes mellitus is a deadly chronic disease. Also known in Indonesian as "penyakit kencing manis", it occurs when the blood glucose level is too high. Diabetes mellitus is currently studied in many countries because the number of sufferers is rising at an alarming rate. According to the WHO, more than 246 million people currently have diabetes, and the number is expected to rise to 380 million by 2025 unless serious action is taken. Diabetes leads to other diseases and complications that cause up to 3.8 million deaths each year. Data mining is the activity of discovering new patterns, rules and knowledge in a dataset. One of the major functions of data mining is classification. KNN is one of the best and most widely used data mining classification algorithms. The KNN algorithm works by computing the proximity of each test record to all of the training data. K in KNN is the number of nearest neighbours taken into account during classification. K=1 makes the classification rigid, because only the single nearest neighbour, i.e. the one closest record, is considered. Conversely, a value of K that is too large produces a blurred classification. This study found the best result at K=13, with an accuracy of 75.14%. K=13 was the optimal value among KNN classification experiments using K=1 through K=49.
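
    The search over K described in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance, majority voting and leave-one-out evaluation, none of which the abstract specifies; the records in `toy` are hypothetical values invented for illustration, not data from the study.

```python
from collections import Counter
import math

def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training records."""
    # Rank training records by Euclidean distance to the query point.
    neighbours = sorted(train, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out accuracy: each record is classified using all the others."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(data)

# Hypothetical toy records: (glucose, BMI) -> label. Not data from the study.
toy = [
    ((85, 22.0), "neg"), ((90, 24.5), "neg"), ((99, 26.0), "neg"),
    ((105, 23.0), "neg"), ((110, 31.0), "neg"), ((150, 24.0), "neg"),
    ((140, 33.0), "pos"), ((155, 35.5), "pos"), ((165, 30.0), "pos"),
    ((170, 38.0), "pos"), ((180, 34.0), "pos"), ((100, 36.0), "pos"),
]

# Try odd values of K (odd values avoid ties between two classes).
scores = {k: loo_accuracy(toy, k) for k in range(1, 10, 2)}
best_k = max(scores, key=scores.get)
```

    In practice the features would normally be scaled before computing distances, since an unscaled glucose value dominates BMI in the Euclidean distance.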

    RB-Bayes algorithm for the prediction of diabetic in Pima Indian dataset

    Diabetes is a major concern all over the world, and its prevalence is increasing at a fast pace. People can avoid diabetes at an early stage without any test. The goal of this paper is to predict, at an early stage, the probability that a person is at risk of diabetes, which would have a great impact on quality of life. The datasets are Pima Indians diabetes and Cleveland coronary illness and consist of 768 records. Although a number of solutions exist for extracting information from huge datasets and predicting the possibility of diabetes, their accuracy is far from satisfactory. To achieve the highest accuracy, the zero-probability problem commonly faced by naïve Bayes analysis needs to be addressed suitably. The proposed framework, RB-Bayes, aims to extract the required information with high accuracy while withstanding the zero-probability problem, and its accuracy is compared with that of other methods such as Support Vector Machine, Naive Bayes, and K Nearest Neighbor. We used the mean to handle missing data and calculated the probability for yes (positive) and no (negative); the higher of the two values decides the class of the tuple. This approach is mostly used in text classification. The results on the Pima Indian diabetes dataset show that the proposed methodology improves precision compared with other supervised methods. The accuracy of the proposed methodology on the large dataset is 72.9%.
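
    The zero-probability problem mentioned in this abstract arises when a feature value never occurs with a class in the training data, so its estimated likelihood is 0 and the whole naive Bayes product collapses to 0. The sketch below shows the problem and one common remedy, Laplace (add-one) smoothing. This is an assumption for illustration only: the abstract does not say how RB-Bayes handles the issue, and the feature values are hypothetical.

```python
from collections import Counter

def likelihood(value, observed, n_values, alpha=0.0):
    """Estimate P(value | class) from the values observed for that class.

    alpha=0 gives the raw frequency; alpha=1 gives Laplace (add-one) smoothing.
    n_values is the number of distinct values the feature can take.
    """
    counts = Counter(observed)
    return (counts[value] + alpha) / (len(observed) + alpha * n_values)

# Hypothetical categorical feature (BMI band) for records of the positive class.
positive_bmi = ["high", "high", "normal", "high"]

# "low" was never seen with the positive class, so the raw estimate is zero.
raw = likelihood("low", positive_bmi, n_values=3)
# With smoothing, the unseen value keeps a small non-zero probability.
smoothed = likelihood("low", positive_bmi, n_values=3, alpha=1.0)
```

    With smoothing, the estimates over the three possible values still sum to 1, so they remain a valid probability distribution.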

    A survey on diagnosis of diabetes using various classification algorithm

    Diabetes is a worldwide problem. It occurs when the pancreas does not produce sufficient insulin, or when the body cannot effectively use the insulin it produces. A person with diabetes has an elevated blood glucose level. People with diabetes may develop serious problems such as heart disease, stroke, kidney failure, blindness, and premature death. The WHO reported that in 2013 over 382 million people throughout the world had diabetes, and that it occurred more often in women than in men owing to improper food habits or low-quality food. Early diagnosis of diabetes is an important challenge. This survey presents various classification methods used for the diagnosis of diabetes, such as artificial neural networks, support vector machines, naïve Bayes, and decision trees. The PIMA Indian dataset is chosen for the diagnosis of diabetes. The research hopes to propose a quicker and more efficient technique for diagnosing the disease, leading to timely treatment of patients.

    Type 2 Diabetes Mellitus Screening and Risk Factors Using Decision Tree: Results of Data Mining

    OBJECTIVES: The aim of this study was to examine a predictive model using features related to type 2 diabetes risk factors. METHODS: The data were obtained from a database in a diabetes control system in Tabriz, Iran. The data included all people referred for diabetes screening between 2009 and 2011. The features considered as "Inputs" were: age, sex, systolic and diastolic blood pressure, family history of diabetes, and body mass index (BMI). The diagnosis was used as the "Class". We applied the "Decision Tree" technique and the "J48" algorithm in the WEKA software (version 3.6.10) to develop the model. RESULTS: After data preprocessing and preparation, we used 22,398 records for data mining. The model's precision in identifying patients was 0.717. Age was placed in the root node of the tree because it had the highest information gain. The ROC curve shows how well the model distinguishes patients from healthy individuals. The curve indicates high capability of the model, especially in identifying healthy persons. CONCLUSIONS: We developed a decision tree model for screening T2DM that does not require laboratory tests for T2DM diagnosis.
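
    The statement that age sits at the root because of its information gain can be made concrete with a small sketch. It computes plain information gain (the reduction in class entropy after splitting on a feature); the rows are hypothetical and only illustrate the calculation, not the study's data. Note that C4.5, which J48 implements, normalizes this quantity into a gain ratio when choosing splits.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, feature, target):
    """Entropy of the target minus the weighted entropy after splitting on feature."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[feature] for r in rows}:
        subset = [r[target] for r in rows if r[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical screening records, invented for illustration.
rows = [
    {"age": "older", "sex": "f", "diabetic": "yes"},
    {"age": "older", "sex": "m", "diabetic": "yes"},
    {"age": "older", "sex": "f", "diabetic": "yes"},
    {"age": "younger", "sex": "m", "diabetic": "no"},
    {"age": "younger", "sex": "f", "diabetic": "no"},
    {"age": "younger", "sex": "m", "diabetic": "yes"},
]

gains = {f: information_gain(rows, f, "diabetic") for f in ("age", "sex")}
root_feature = max(gains, key=gains.get)  # feature a tree would split on first
```

    On these toy rows, splitting on age separates the classes better than splitting on sex, so age would be chosen for the root node.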

    A Comparative Analysis on the Evaluation of Classification Algorithms in the Prediction of Diabetes

    Data mining techniques are applied in many applications as a standard procedure for analyzing large volumes of available data and extracting useful information and knowledge to support major decision-making processes. Diabetes mellitus is a chronic, common and deadly syndrome occurring all around the world. It is characterized by hyperglycemia caused by abnormalities in insulin secretion, which in turn result in an irregular rise in glucose level. In recent years, the impact of diabetes mellitus has increased greatly, especially in developing countries such as India, mainly because of irregular food habits and lifestyle. Early diagnosis and classification of this deadly disease has therefore become an active area of research in the last decade. Numerous clustering and classification techniques are available in the literature to visualize temporal data and identify trends for controlling diabetes mellitus. This work presents an experimental study of several algorithms that classify diabetes mellitus data effectively. The existing algorithms are analyzed thoroughly to identify their advantages and limitations, and their performance is assessed to determine the best approach.

    Classifiers Evaluation: Comparison of Performance Classifiers Based on Tuples Amount

    The aim of this study is to compare the performance of several classifiers in relation to the number of tuples. Several performance metrics are considered: Accuracy, Mean Absolute Error (MAE), and Kappa Statistic. The dataset used in the experiments, on the readmission of diabetic patients, consists of 47 features and 49,736 tuples. The methodology starts with a preprocessing phase. The clean dataset is then randomly divided into 5 subsets, one for every multiple of 10,000 tuples. Each subset is validated with three traditional classifiers: Naive Bayes, K-Nearest Neighbor (k-NN), and Decision Tree. We also vary the parameter settings of each classifier except Naïve Bayes. The validation method is 10-Fold Cross-Validation. Finally, we compare the performance of the classifiers based on the number of tuples. Our study indicates that the larger the number of tuples, the lower and weaker the MAE and Accuracy performance, whereas the kappa statistic tends to fluctuate. Our study also found that Naïve Bayes outperforms k-NN and Decision Tree overall. The top classifier performances were reached in the 20,000-tuple evaluation.
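
    Accuracy and the kappa statistic can move differently because kappa discounts the agreement expected by chance. The following minimal sketch computes both from label lists, using the standard definition of Cohen's kappa; the labels are hypothetical and the study's own tooling is not stated in the abstract.

```python
from collections import Counter

def accuracy(actual, predicted):
    """Fraction of records whose predicted label matches the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def cohen_kappa(actual, predicted):
    """Agreement between predictions and truth, corrected for chance agreement.

    Undefined (division by zero) when chance agreement is exactly 1.
    """
    n = len(actual)
    observed = accuracy(actual, predicted)
    actual_freq, predicted_freq = Counter(actual), Counter(predicted)
    # Chance agreement: probability both pick the same class independently.
    expected = sum(actual_freq[c] * predicted_freq[c] for c in actual_freq) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels: a classifier that always predicts the majority class.
actual = ["yes", "yes"] + ["no"] * 8
predicted = ["no"] * 10

acc = accuracy(actual, predicted)
kappa = cohen_kappa(actual, predicted)
```

    Here the always-"no" classifier looks good on accuracy, yet its kappa shows no agreement beyond chance, which is why the two metrics are reported side by side.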

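    Analisis Komparatif Evaluasi Performa Algoritma Klasifikasi Pada Readmisi Pasien Diabetes [Comparative Analysis of the Performance Evaluation of Classification Algorithms on Diabetic Patient Readmission]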

    Readmission is associated with quality measures for patients in hospitals. The many attributes related to diabetic patients, such as medication, ethnicity, race, lifestyle, and age, make the calculation of quality of care complicated. Data mining classification techniques can address this problem. In this paper, three classifiers, i.e. Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes, are evaluated under various parameter settings using 10-Fold Cross Validation. Performance is evaluated in terms of Accuracy, Mean Absolute Error (MAE), and Kappa Statistic. The selected dataset consists of 47 attributes and 49,735 records. The results show that the k-NN classifier with k=100 performs better in terms of accuracy and Kappa Statistic, while Naive Bayes outperforms the other classifiers in terms of MAE.
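
    The 10-fold cross-validation procedure named in this abstract can be sketched as follows. The helper `k_fold_indices` is a hypothetical name introduced here, and the shuffling and round-robin fold assignment are assumptions; the abstract does not describe how the folds were formed.

```python
import random

def k_fold_indices(n_records, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each record is tested exactly once."""
    order = list(range(n_records))
    random.Random(seed).shuffle(order)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [order[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Small illustrative run: 25 records split into 10 folds.
splits = list(k_fold_indices(25, k=10))
```

    A classifier would be trained on each `train` list and scored on the matching `test` list, with the ten scores averaged into the reported metric.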