20 research outputs found

    Bagging model with cost sensitive analysis on diabetes data

    Diabetes patients may suffer from poor quality of life, long-term treatment, and chronic complications. Reducing the hospitalization rate is a crucial problem for health care centers. This study combines the bagging method, using a decision tree as the base classifier, with cost-sensitive analysis to classify diabetes patients. Real patient data collected from a regional hospital in Thailand were analyzed. The relevant factors were selected and used to construct base decision tree models that classify diabetes and non-diabetes patients. The bagging method was then applied to improve accuracy. Finally, asymmetric classification cost matrices were used to provide alternative models for diabetes data analysis.
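The combination described in this abstract (bootstrap-aggregated base classifiers followed by a decision under an asymmetric cost matrix) can be sketched as follows. This is a minimal illustration, not the authors' implementation: a one-feature decision stump stands in for the decision tree, and the glucose values, the cost matrices, and all function names are hypothetical.

```python
import random

def fit_stump(rows):
    """Return the threshold t minimizing training errors for the rule 'x >= t means diabetes'."""
    best_errors, best_t = None, None
    for t in sorted({x for x, _ in rows}):
        errors = sum((x >= t) != bool(y) for x, y in rows)
        if best_errors is None or errors < best_errors:
            best_errors, best_t = errors, t
    return best_t

def bagged_probability(rows, x, n_models=25, seed=0):
    """Fraction of bootstrap-trained stumps that vote 'diabetes' for value x."""
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]  # bootstrap: draw with replacement
        votes += x >= fit_stump(sample)
    return votes / n_models

def min_cost_class(p_pos, cost):
    """Pick the class with the lower expected cost; cost is indexed as cost[actual][predicted]."""
    cost_if_neg = p_pos * cost[1][0] + (1 - p_pos) * cost[0][0]
    cost_if_pos = p_pos * cost[1][1] + (1 - p_pos) * cost[0][1]
    return 1 if cost_if_pos < cost_if_neg else 0

# Hypothetical toy data: (glucose, label) pairs, label 1 = diabetes.
TOY_ROWS = [(90, 0), (100, 0), (110, 0), (150, 1), (160, 1), (170, 1)]
# Assumed asymmetric costs: missing a diabetes case is 5x worse than a false alarm.
ASYMMETRIC = [[0, 1], [5, 0]]
SYMMETRIC = [[0, 1], [1, 0]]
```

With a predicted probability of 0.3, `min_cost_class` returns class 0 under the symmetric matrix and class 1 under the asymmetric one, which is how different cost matrices produce alternative models from the same bagged ensemble.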

    DIABETES PREDICTION APPLICATION SYSTEM USING PEARSON CORRELATION FEATURE SELECTION AND NAÏVE BAYES CLASSIFICATION

    Diabetes is a serious disease known worldwide as a silent killer; recorded cases rose steadily between 1980 and 2014, reaching 422 million people. This demands serious attention because it places an excessive workload on medical resources and creates a corresponding financial burden. In the current era of advanced technology, data mining in the health sector can provide precise and accurate data analysis, particularly for diabetes data. Data mining classification combined with the research and development method can be used as an application system for predicting diabetes. The researchers therefore built a diabetes detection system using Pearson correlation feature selection and naïve Bayes classification, which is expected to help medical experts detect diabetes more quickly and reduce the costs arising from this problem. The web-based prediction system handles 768 rows of diabetes data and displays 10 features, namely Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction, Age, and Outcome, obtained from the Pima Indians Diabetes Kaggle dataset. The Pearson correlation algorithm is needed to improve the performance of the naïve Bayes algorithm, raising accuracy on the diabetes data from 68.2% to 79.13%.
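Pearson-correlation feature selection of the kind named in this abstract can be sketched as below: each feature is scored by its correlation with the outcome, and weakly correlated features are dropped before the classifier is trained. This is an illustrative sketch only; the 0.2 threshold, the toy values, and the function names are assumptions, since the abstract does not state how features were filtered.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has no defined correlation; treat it as uninformative.
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(columns, outcome, threshold=0.2):
    """Keep the features whose absolute correlation with the outcome meets the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, outcome)) >= threshold]

# Hypothetical toy values using two column names from the dataset.
TOY_COLUMNS = {
    "Glucose": [85, 90, 140, 150],
    "SkinThickness": [20, 30, 30, 20],
}
TOY_OUTCOME = [0, 0, 1, 1]
```

On these toy values, `select_features(TOY_COLUMNS, TOY_OUTCOME)` keeps only `Glucose`, because `SkinThickness` has zero correlation with the outcome.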

    Type 2 Diabetes Mellitus Screening and Risk Factors Using Decision Tree: Results of Data Mining

    OBJECTIVES: The aim of this study was to examine a predictive model using features related to type 2 diabetes risk factors. METHODS: The data were obtained from a database in a diabetes control system in Tabriz, Iran. The data included all people referred for diabetes screening between 2009 and 2011. The features considered as "Inputs" were: age, sex, systolic and diastolic blood pressure, family history of diabetes, and body mass index (BMI). Moreover, we used diagnosis as "Class". We applied the "Decision Tree" technique and the "J48" algorithm in the WEKA software (version 3.6.10) to develop the model. RESULTS: After data preprocessing and preparation, we used 22,398 records for data mining. The model precision to identify patients was 0.717. The age factor was placed in the root node of the tree as a result of its higher information gain. The ROC curve shows how well the model distinguishes patients from healthy individuals. The curve indicates high capability of the model, especially in identifying healthy persons. CONCLUSIONS: We developed a decision tree model for screening T2DM which did not require laboratory tests for T2DM diagnosis.
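The abstract says age ended up at the root node because of its higher information gain. The sketch below shows how information gain ranks candidate split attributes. The toy records and names are hypothetical, and the sketch covers plain information gain on categorical attributes only; C4.5, which J48 implements, normally ranks splits by gain ratio and also handles numeric attributes.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction obtained by splitting the labels on a categorical attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy records: age group separates the classes perfectly, sex does not.
TOY_CLASS = ["T2DM", "T2DM", "healthy", "healthy"]
TOY_AGE = ["old", "old", "young", "young"]
TOY_SEX = ["F", "M", "F", "M"]
```

Here `information_gain(TOY_AGE, TOY_CLASS)` is 1 bit and `information_gain(TOY_SEX, TOY_CLASS)` is 0, so a tree learner would place the age attribute at the root.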

    Pilot study on developing a decision support tool for guiding re-administration of chemotherapeutic agent after a serious adverse drug reaction

    BACKGROUND: Currently, there are no standard guidelines for recommending re-administration of a chemotherapeutic drug to a patient after a serious adverse drug reaction (ADR) incident. The decision on whether to rechallenge the patient is based on the experience of the clinician and is highly subjective. Thus the aim of this study is to develop a decision support tool to assist clinicians in this decision-making process. METHODS: The inclusion criteria for patients in this study are: (1) had chemotherapy at National Cancer Centre Singapore between 2004 and 2009, (2) suffered from serious ADRs, and (3) were rechallenged. A total of 46 patients fulfilled the inclusion criteria. A genetic algorithm attribute selection method was used to identify clinical predictors for patients' rechallenge status. A Naïve Bayes model was then developed using 35 patients and externally validated using 11 patients. RESULTS: Eight patient attributes (age, chemotherapeutic drug, albumin level, red blood cell level, platelet level, abnormal white blood cell level, abnormal alkaline phosphatase level and abnormal alanine aminotransferase level) were identified as clinical predictors for rechallenge status of patients. The Naïve Bayes model had an AUC of 0.767 and was found to be useful for assisting clinical decision making after clinicians had identified a group of patients for rechallenge. A platform-independent version and an online version of the model are available to facilitate independent validation of the model. CONCLUSION: Due to the limited size of the validation set, a more extensive validation of the model is necessary before it can be adopted for routine clinical use. Once validated, the model can be used to assist clinicians in deciding whether to rechallenge patients by determining if their initial assessment of rechallenge status of patients is accurate.
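The abstract reports model quality as an AUC. As a rough illustration of what that number measures, the sketch below computes AUC as the share of positive/negative pairs in which the positive case receives the higher score, with ties counted as half. The scores and labels are hypothetical and unrelated to the study's data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # A pair counts 1 if the positive outranks the negative, 0.5 on a tie, 0 otherwise.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model scores and true labels for four patients.
TOY_SCORES = [0.9, 0.8, 0.3, 0.2]
TOY_LABELS = [1, 0, 1, 0]
```

For these toy values the positive case wins three of the four pairs, so `auc(TOY_SCORES, TOY_LABELS)` is 0.75. With a small validation set there are few such pairs, which is one reason a more extensive validation is called for.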

    An Early Detection Method of Type-2 Diabetes Mellitus in Public Hospital

    Diabetes is a chronic disease and a major cause of morbidity and mortality in developing countries. The International Diabetes Federation estimates that 285 million people around the world have diabetes. This total is expected to rise to 438 million within 20 years. Type-2 diabetes mellitus (T2DM) is the most common type of diabetes and accounts for 90-95% of all diabetes. Detection of T2DM from various factors or symptoms became an issue which was not free from false presumptions accompanied by unpredictable effects. In this context, data mining and machine learning could be used as an alternative way to support knowledge discovery from data. We applied several learning methods, such as instance-based learners, naive Bayes, decision trees, support vector machines, and boosted algorithms, to acquire information from historical patient medical records of Mohammad Hoesin public hospital in Southern Sumatera. Rules are extracted from the decision tree to offer clinicians decision-making support through early detection of T2DM.

    Data Mining: A Novel Outlook to Explore Knowledge in Health and Medical Sciences

    Today the medical and healthcare industries generate large volumes of diverse data about patients, disease diagnosis, prognosis, management, hospital resources, electronic patient health records, medical devices, etc. Using the most efficient processing and analysis methods for knowledge extraction is key to cost saving in clinical decision making. Data mining, sometimes called data or knowledge discovery, is the process of analyzing data from different perspectives and summarizing it into useful information. In medicine, this process is distinct from that in other fields because of the heterogeneity and volume of the data. Herein we review some published articles about the application of data mining in several fields of medicine and healthcare.

    Model Development for Prediction of Diabetic Retinopathy

    This research focuses on presenting an empirical method to gather necessary data and then developing several models to predict the chance of diabetic retinopathy (proliferative and non-proliferative) by observing HbA1c, duration of disease and albumin excretion rate of diabetic patients. We gathered the required knowledge from other studies that have investigated the relation between different risk factors and complications in diabetes. In order to create 1-1 models, curve fitting was performed using two different software applications: Tiberius (Brierley 2011) and SPSS (IBM 2010), which are based on ANN and least squares regression, respectively. To start producing the model, seven different patterns, i.e. linear, logarithmic, quadratic, cubic, power, S and exponential, were chosen as the regression options. Judged by R-squared, the best selected regression model fits the data in every dataset table better than both the ANN and the other six regression patterns.
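Comparing regression patterns by R-squared, as this abstract describes, can be sketched as below for two of the seven patterns (linear and logarithmic). This is a minimal illustration with hypothetical data and names; it does not reproduce the Tiberius or SPSS procedures.

```python
from math import log

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def score_patterns(xs, ys):
    """R-squared of the linear (y = a + b*x) and logarithmic (y = a + b*ln x) patterns."""
    a, b = fit_line(xs, ys)
    linear = r_squared(ys, [a + b * x for x in xs])
    log_xs = [log(x) for x in xs]  # the logarithmic pattern is linear in ln(x)
    a, b = fit_line(log_xs, ys)
    logarithmic = r_squared(ys, [a + b * lx for lx in log_xs])
    return {"linear": linear, "logarithmic": logarithmic}

# Hypothetical data that follows a logarithmic trend exactly (y = log2 x).
TOY_X = [1, 2, 4, 8]
TOY_Y = [0.0, 1.0, 2.0, 3.0]
```

On this toy data the logarithmic pattern reaches an R-squared of 1 (up to rounding) while the linear pattern scores 0.92, so the logarithmic fit would be selected.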