4 research outputs found

    Extracting Rules for Diagnosis of Diabetes Using Genetic Programming

    Background: Diabetes is a global health challenge that causes a high incidence of major social and economic consequences. Early prevention, or early identification of people at risk, is therefore crucial for reducing the problems it causes. The aim of this study was to extract rules for diagnosing diabetes using genetic programming. Methods: This study used the PIMA dataset from the University of California, Irvine. The dataset contains information on 768 women of Pima heritage, 500 healthy and 268 with diabetes. Missing values and outliers in the dataset were handled with the K-nearest neighbor and k-means methods, respectively. A genetic programming (GP) model was then built to diagnose diabetes and to determine the most important factors affecting it. The accuracy, sensitivity and specificity of the proposed model on the PIMA dataset were 79.32%, 58.96% and 90.74%, respectively. Results: The experimental results of our model on PIMA revealed that age, PG concentration, BMI, Tri Fold Thick and Serum Ins were influential factors in diabetes mellitus and increased the risk of diabetes. The results also show the good performance of the model, coupled with the simplicity and comprehensiveness of the extracted rules. Conclusions: GP can effectively implement rules for diagnosing diabetes. BMI and PG concentration are the most important factors increasing the risk of diabetes. Keywords: Diabetes, PIMA, Genetic programming, KNNi, K-means, Missing value, Outlier detection, Rule extraction
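
The accuracy, sensitivity and specificity figures quoted in the abstract above are standard confusion-matrix ratios. The following is a minimal sketch of how such figures are usually computed. The function name and the counts are hypothetical illustrations; they are not taken from the study, which does not report its confusion matrix here.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity as fractions.

    tp/fn: diabetic cases classified correctly/incorrectly.
    tn/fp: healthy cases classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,       # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),       # share of diabetic cases detected
        "specificity": tn / (tn + fp),       # share of healthy cases recognised
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = diagnostic_metrics(tp=60, fn=40, tn=180, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

A model can combine high specificity with much lower sensitivity, as in the figures reported above, when the healthy class is larger than the diabetic class.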

    Predicting the status of COVID-19 active cases using a neural network time series

    The design of intelligent systems for analyzing information and predicting the epidemiological trends of the disease is rapidly expanding because of the coronavirus disease (COVID-19) pandemic. The COVID-19 datasets provided by Johns Hopkins University were included in the analysis. These datasets contain some missing data, which were imputed using the multi-objective particle swarm optimization method. A time series model based on a nonlinear autoregressive exogenous (NARX) neural network is proposed to predict recovered cases and deaths from COVID-19. The model is trained and evaluated in two modes: predicting the situation of the affected areas for the next day and for the next month. After the model was trained on data from January 22 to February 27, 2020, its performance was evaluated in predicting the situation of the areas over the following two weeks. The error rate was less than 5%. The model's prediction for April 9, 2020, was compared with the actual data for that day. The absolute percentage error (AE) worldwide was 12%. The lowest mean absolute errors (MAE) of the model were for South America and Australia, at 3 and 3.3, respectively. In this paper, we have shown that mortality and recovery of COVID-19 cases in geographical areas can be predicted using a neural network-based model.
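
The abstract above reports an absolute percentage error and a mean absolute error. The following is a minimal sketch of the usual definitions of these two measures, assuming they are the ones the authors used. The function names and the numbers are hypothetical and are not data from the paper.

```python
def absolute_percentage_error(actual, predicted):
    """Absolute error of one prediction, as a percentage of the actual value."""
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return 100 * abs(actual - predicted) / abs(actual)


def mean_absolute_error(actuals, predictions):
    """Average absolute difference between paired actual and predicted values."""
    if len(actuals) != len(predictions) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)


# Hypothetical values, used only to show the calculation.
ape = absolute_percentage_error(actual=200, predicted=176)
mae = mean_absolute_error([10, 20, 30], [12, 18, 33])
print(f"APE: {ape:.1f}%  MAE: {mae:.2f}")
```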

    Predicting the incidence of COVID-19 using data mining

    Background: The high prevalence of COVID-19 has made it a new pandemic. Predicting both its prevalence and incidence throughout the world is crucial to help health professionals make key decisions. In this study, we aim to predict the incidence of COVID-19 within a two-week period to better manage the disease. Methods: The COVID-19 datasets provided by Johns Hopkins University contain information on COVID-19 cases in different geographic regions since January 22, 2020, and are updated daily. Data from 252 such regions were analyzed as of March 29, 2020, with 17,136 records and 4 variables, namely latitude, longitude, date, and records. To design the incidence pattern for each geographic region, we used information on the region and its neighboring areas gathered over the 2 weeks before the design. A model was then developed to predict the incidence rate for the coming 2 weeks via a Least-Square Boosting Classification algorithm. Results: The model was presented for three groups based on the incidence rate: less than 200, between 200 and 1000, and above 1000. The mean absolute errors of model evaluation were 4.71, 8.54, and 6.13%, respectively. Comparing the forecast results with the actual values for the period in question also showed that the proposed model predicted the number of globally confirmed cases of COVID-19 with a very high accuracy of 98.45%. Conclusion: Using data from different geographical regions within a country and discovering the pattern of prevalence in a region and its neighboring areas, our boosting-based model was able to accurately predict the incidence of COVID-19 within a two-week period.