71 research outputs found
Hybrid Modeling KMeans – Genetic Algorithms in the Health Care Data
K-Means is one of the most widely used clustering algorithms because of its good computational performance. However, K-Means is very sensitive to its initial points, which are selected randomly, and therefore it does not always produce optimal solutions. A genetic algorithm approach can be applied to address this problem. In this research we examine the potential of a hybrid GA-KMeans approach, with a focus on health care data. We propose a new technique that combines K-Means clustering and genetic algorithms, called the “Hybrid K-Means Genetic Algorithms” (HKGA). HKGA combines the power of genetic algorithms with the efficiency of K-Means clustering. We compare our results with those of conventional algorithms and with other published research. Our results demonstrate that HKGA achieves very good results and in some cases is superior to other methods.
Keywords: Machine Learning, K-Means, Genetic Algorithms, Hybrid KMeans Genetic Algorithm (HKGA)
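The abstract does not describe how HKGA encodes solutions or which genetic operators it uses, so the following is only a minimal sketch of the general idea of letting a genetic algorithm search over K-Means starting points. Everything in it is an assumption made for illustration: a chromosome is a set of k centroids, fitness is the within-cluster sum of squared errors (SSE), selection is elitist, crossover is one-point, mutation replaces one centroid with a random data point, and all function names are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_step(points, centroids):
    """One Lloyd iteration: assign each point, then move centroids to cluster means."""
    labels = [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
              for p in points]
    updated = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
        else:
            updated.append(old)  # an empty cluster keeps its previous centroid
    return updated

def sse(points, centroids):
    """Within-cluster sum of squared errors; lower means a better clustering."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def ga_kmeans(points, k, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Assumed hybrid scheme (not the authors' HKGA): a GA over centroid sets,
    with one K-Means step applied to every chromosome in each generation."""
    rng = random.Random(seed)
    # Initial population: each chromosome is k distinct data points.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        population = [kmeans_step(points, ch) for ch in population]
        population.sort(key=lambda ch: sse(points, ch))
        survivors = population[: pop_size // 2]  # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, k) if k > 1 else 0
            child = list(a[:cut]) + list(b[cut:])  # one-point crossover
            if rng.random() < mutation_rate:
                child[rng.randrange(k)] = rng.choice(points)  # mutation
            children.append(child)
        population = survivors + children
    best = min(population, key=lambda ch: sse(points, ch))
    return best, sse(points, best)
```

Calling `ga_kmeans(points, k=2)` on a list of coordinate tuples returns the best centroid set found and its SSE. Because the best chromosomes survive each generation and a Lloyd step never increases SSE, the best score cannot get worse over time, which is the property that makes the search less dependent on any single random start.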
Breast Cancer Diagnosis and Survival Prediction Using JNN
Abstract: Breast cancer is reported to be the most common cancer among women worldwide, and it has the second-highest fatality rate among women of all cancer types. Notwithstanding the progress made in prevention and early intervention, early prognosis and survival prediction rates are still not sufficient. In this paper, we propose an ANN model that outperforms all previous supervised learning methods, reaching 99.57% accuracy on the Wisconsin Breast Cancer dataset. Experimental results on Haberman’s Breast Cancer Survival dataset show the superiority of the proposed method, which reaches 88.24% accuracy. These are the best reported results obtained from an Artificial Neural Network using the JNN environment without any preprocessing of the dataset.
Comparing random forest and support vector machines for breast cancer classification
There are more than 100 types of cancer worldwide, each with different symptoms, and the onset of cancer in a person is difficult to predict because it appears randomly and suddenly. However, cancer is generally marked by the growth of abnormal cells. A patient may be diagnosed early and treated quickly, but cancerous cells often remain hidden in the body and later reappear, with fatal results. One of the most common cancers is breast cancer. According to the Ministry of Health, in 2018 breast cancer affected 42 out of every 100,000 people in Indonesia, with approximately 17 deaths. In addition, the Ministry recorded a yearly increase in cancer patients. Therefore, there is a clear need to identify those affected by this disease. This study applied Boruta feature selection to determine the most important features for building a machine learning model. Random Forest (RF) and Support Vector Machines (SVM) were the machine learning models used, with highest accuracies of 90% and 95%, respectively. Based on these results, SVM is a better model than Random Forest in terms of accuracy.
Improving breast cancer prediction using a pattern recognition network with optimal feature subsets
Aim: To predict the presence of breast cancer by using a pattern recognition network with optimal features based on routine blood analysis parameters and anthropometric data.
Methods: Sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC), and Fowlkes-Mallows (FM) index of each model were calculated. Glucose, insulin, age, homeostatic model assessment, leptin, body mass index (BMI), resistin, adiponectin, and monocyte chemoattractant protein-1 were used as predictors.
Results: The pattern recognition network distinguished patients with breast cancer from healthy people. The best classification performance was obtained by using BMI, age, glucose, resistin, and adiponectin in a neural network with two hidden layers of 11 and 100 neurons. The accuracy, sensitivity, specificity, FM index, and MCC values of the best model were 94.1%, 100%, 88.9%, 94.3%, and 88.9%, respectively.
Conclusion: Breast cancer was successfully predicted using only five features. A model using a pattern recognition network with optimal feature subsets, as proposed in this study, could be used to improve the early detection of breast cancer.
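The five metrics named in the Methods can all be computed from a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are hypothetical. The abstract does not report the confusion matrix or the test-set size, so the counts TP = 8, TN = 8, FP = 1, FN = 0 are only one assumed set that happens to reproduce the reported percentages after rounding.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes each class has at least one sample and at least one positive
    prediction was made; otherwise some ratios below are undefined.
    """
    sensitivity = tp / (tp + fn)               # true positive rate (recall)
    specificity = tn / (tn + fp)               # true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    # Fowlkes-Mallows index: geometric mean of precision and recall.
    fm_index = math.sqrt(precision * sensitivity)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fm_index": fm_index,
        "mcc": mcc,
    }

# Hypothetical counts, not taken from the paper.
example = binary_metrics(tp=8, tn=8, fp=1, fn=0)
for name, value in example.items():
    print(f"{name}: {value * 100:.1f}%")
```

With these assumed counts, accuracy is 16/17, specificity is 8/9, the FM index is the square root of 8/9, and MCC is 64/72, which round to the 94.1%, 88.9%, 94.3%, and 88.9% quoted in the Results, with sensitivity at 100%.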
Comparative Study of Classification Techniques on Breast Cancer FNA Biopsy Data
Accurate diagnostic detection of cancerous cells in a patient is critical and may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems because of their accurate prediction performance. Different techniques may yield different accuracies, and it is therefore imperative to use the most suitable method, the one that provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Bayesian classifier, and other artificial neural network classifiers (backpropagation, linear programming, learning vector quantization, and k-nearest neighbors) on the Wisconsin breast cancer classification problem.
Books’ Rating Prediction Using Just Neural Network
Abstract: The aim of analyzing the Goodreads dataset is to get a fair idea of the relationships between the multiple attributes a book might have, such as the aggregate rating of each book, the trend of the authors over the years, and books available in numerous languages. With over a hundred thousand ratings, some books simply become more popular with each passing day. We propose an Artificial Neural Network (ANN) model for predicting the overall rating of books. The prediction is based on the features (bookID, title, authors, isbn, language_code, isbn13, # num_pages, ratings_count, text_reviews_count), which were used as input variables, with (average_rating) as the output variable of our ANN model. Our model was created, trained, and validated in the JNN environment using the “Goodreads-books” dataset. Model evaluation showed that the ANN model correctly predicts 99.78% of the validation samples.