71 research outputs found

    Hybrid Modeling KMeans – Genetic Algorithms in the Health Care Data

    K-Means is one of the most widely used clustering algorithms because of its good computational performance. However, K-Means is very sensitive to its initial points, which are selected at random, and therefore it does not always produce optimal solutions. A genetic algorithm approach can be applied to solve this problem. In this research we examine the potential of a hybrid GA-KMeans approach, with a focus on health care data. We propose a new technique that combines K-Means clustering and genetic algorithms, called the “Hybrid K-Means Genetic Algorithms” (HKGA). HKGA combines the power of genetic algorithms with the efficiency of K-Means clustering. We compare our results with conventional algorithms and with other published research. Our results demonstrate that HKGA achieves very good results, in some cases superior to other methods. Keywords: Machine Learning, K-Means, Genetic Algorithms, Hybrid K-Means Genetic Algorithm (HKGA)
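The abstract does not spell out how HKGA encodes or evolves its solutions, so the following is only a minimal sketch of the general idea: a genetic algorithm searches over sets of initial centroids while K-Means (Lloyd) steps refine each candidate. The function names (`sse`, `lloyd_step`, `ga_kmeans`), the truncation selection, the one-point crossover, and every parameter default are assumptions made for illustration, not the authors' method.

```python
import random

def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(pt, cen)) for cen in centroids)
        for pt in points
    )

def lloyd_step(points, centroids):
    """One K-Means iteration: assign points, then recompute centroids."""
    clusters = [[] for _ in centroids]
    for pt in points:
        idx = min(
            range(len(centroids)),
            key=lambda i: sum((p - c) ** 2 for p, c in zip(pt, centroids[i])),
        )
        clusters[idx].append(pt)
    updated = []
    for cen, members in zip(centroids, clusters):
        if members:
            updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            updated.append(cen)  # an empty cluster keeps its old centroid
    return updated

def ga_kmeans(points, k, pop_size=10, generations=20, mutation_rate=0.1, seed=0):
    """Hypothetical GA over centroid sets; each individual is a list of k centroids."""
    rng = random.Random(seed)
    # Initial population: each individual is k distinct data points.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        # Local refinement: one K-Means step per individual.
        population = [lloyd_step(points, ind) for ind in population]
        # Truncation selection: keep the half with the lowest SSE.
        population.sort(key=lambda ind: sse(points, ind))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, k) if k > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover on the centroid list
            if rng.random() < mutation_rate:
                # Mutation: replace one centroid with a random data point.
                child[rng.randrange(k)] = rng.choice(points)
            children.append(child)
        population = survivors + children
    return min(population, key=lambda ind: sse(points, ind))

# Toy data: two well-separated groups of three points each.
toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
best_centroids = ga_kmeans(toy, k=2)
print(best_centroids, sse(toy, best_centroids))
```

Keeping several candidate centroid sets alive at once is what reduces the dependence on any single random initialization; a production version would also need a convergence criterion and a policy for empty clusters.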

    Breast Cancer Diagnosis and Survival Prediction Using JNN

    Abstract: Breast cancer is reported to be the most common cancer type among women worldwide, and it has the second-highest fatality rate among women of all cancer types. Notwithstanding the progress made in prevention and early intervention, early prognosis and survival prediction rates are still not sufficient. In this paper, we propose an ANN model which outperforms all the previous supervised learning methods by reaching 99.57% accuracy on the Wisconsin Breast Cancer dataset. Experimental results on Haberman’s Breast Cancer Survival dataset show the superiority of the proposed method, which reaches 88.24% accuracy. These are the best reported results obtained from an Artificial Neural Network using the JNN environment without any preprocessing of the dataset.

    Comparing random forest and support vector machines for breast cancer classification

    There are more than 100 types of cancer around the world, with different symptoms, and the appearance of cancer in a person is difficult to predict because its onset is random and sudden. However, the appearance of cancer is generally marked by the growth of abnormal cells. A person might be diagnosed early and treated quickly, but cancerous cells often hide in the body and reappear, with fatal results. One of the most common cancers is breast cancer. According to the Ministry of Health, in 2018 breast cancer affected 42 out of every 100,000 people in Indonesia, with approximately 17 deaths. In addition, the Ministry recorded a yearly increase in cancer patients. Therefore, there is a clear need to be able to identify those affected by this disease. This study applied Boruta feature selection to determine the most important features for building a machine learning model. Random Forest (RF) and Support Vector Machines (SVM) were the machine learning models used, with highest accuracies of 90% and 95% respectively. From the results obtained, the SVM is a better model than random forest in terms of accuracy.

    Improving breast cancer prediction using a pattern recognition network with optimal feature subsets

    Aim: To predict the presence of breast cancer by using a pattern recognition network with optimal features based on routine blood analysis parameters and anthropometric data. Methods: Sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC), and Fowlkes-Mallows (FM) index of each model were calculated. Glucose, insulin, age, homeostatic model assessment, leptin, body mass index (BMI), resistin, adiponectin, and monocyte chemoattractant protein-1 were used as predictors. Results: The pattern recognition network distinguished patients with breast cancer from healthy people. The best classification performance was obtained by using BMI, age, glucose, resistin, and adiponectin, in a neural network model with two hidden layers of 11 and 100 neurons. The accuracy, sensitivity, specificity, FM index, and MCC values of the best model were 94.1%, 100%, 88.9%, 94.3%, and 88.9%, respectively. Conclusion: Breast cancer diagnosis was successfully predicted using only five features. A model using a pattern recognition network with optimal feature subsets proposed in this study could be used to improve the early detection of breast cancer.
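The five metrics named in this abstract all follow from a binary confusion matrix. The sketch below computes them with the standard definitions; `binary_metrics` is a hypothetical helper name. The counts in the usage lines (TP=8, FN=0, TN=8, FP=1) are an assumption chosen only because they reproduce the reported percentages; the abstract does not state the size of the test set or the actual counts.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)           # true positive rate (recall)
    specificity = tn / (tn + fp)           # true negative rate
    precision = tp / (tp + fp)             # positive predictive value
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Fowlkes-Mallows index: geometric mean of precision and recall.
    fm_index = math.sqrt(precision * sensitivity)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fm_index": fm_index,
        "mcc": mcc,
    }

# Hypothetical counts (not taken from the paper) that match the reported figures.
metrics = binary_metrics(tp=8, fn=0, tn=8, fp=1)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
# accuracy 94.1%, sensitivity 100.0%, specificity 88.9%, fm_index 94.3%, mcc 88.9%
```

With these counts, specificity and MCC coincide at 88.9% only because there are no false negatives and TP equals TN; in general the two metrics differ.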

    Comparative Study of Classification Techniques on Breast Cancer FNA Biopsy Data

    Accurate diagnostic detection of cancerous cells in a patient is critical: it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems because of their accurate prediction performance. Different techniques may achieve different accuracies, so it is imperative to use the method that gives the best results. This research provides a comparative analysis of Support Vector Machine, Bayesian classifier, and other artificial neural network classifiers (backpropagation, linear programming, learning vector quantization, and k-nearest neighbors) on the Wisconsin breast cancer classification problem.

    Books’ Rating Prediction Using Just Neural Network

    Abstract: The aim of analyzing the Goodreads dataset is to get a fair idea of the relationships between the multiple attributes a book might have, such as the aggregate rating of each book, the trend of the authors over the years, and books available in numerous languages. With over a hundred thousand ratings, there are books which tend to become more popular with each passing day. We propose an Artificial Neural Network (ANN) model for predicting the overall rating of books. The prediction is based on the features (bookID, title, authors, isbn, language_code, isbn13, # num_pages, ratings_count, text_reviews_count), which were used as input variables, with (average_rating) as the output variable for our ANN model. Our model was created, trained, and validated in the JNN environment using the dataset titled “Goodreads-books”. Model evaluation showed that the ANN model is able to correctly predict 99.78% of the validation samples.