4 research outputs found

    Enhancing Breast Cancer Prediction through Deep Learning and Comparative Analysis of Gene Expression and DNA Methylation Data using Convolutional Neural Networks

    Recent advances in data generation have produced an exponential increase in data volume, ushering in an era dominated by big data. Conventional machine-learning algorithms struggle to handle data at this scale. To predict breast cancer accurately, researchers employ and evaluate three classifiers: Support Vector Machine (SVM), Random Forest (RF), and Decision Tree (DT). This study explores how breast cancer can be predicted in the big-data setting, using both gene expression (GE) and DNA methylation (DM) data. The aim is to improve the Deep Learning algorithms currently used for classification by applying each dataset individually and in combination. MATLAB is the platform used. A Convolutional Neural Network (CNN) performs the breast cancer prediction and is used to compare GE alone, DM alone, and GE and DM combined. The CNN results are then compared with those of the RF algorithm. According to the experiments, the proposed scaled system outperforms the other classifiers: on the GE dataset it achieved the best accuracy at the lowest cost.

    Optimized Machine Learning Performance with Feature Selection for Breast Cancer Disease Classification

    The prevalence of breast cancer is relatively high among adults worldwide. In Indonesia in particular, according to the latest data from the World Health Organization (WHO), breast cancer accounts for 1.41% of all deaths, and the figure continues to rise. Addressing this growing problem requires a proactive approach. The objective of this study is therefore to classify breast cancer diagnoses into two categories: Benign and Malignant. This classification pattern can serve as a benchmark for early detection and is expected to reduce mortality and cancer rates in breast cancer cases. The dataset used in this study was obtained from Kaggle and consists of 569 rows with 32 attributes. Several machine learning algorithms are employed for the classification analysis: Artificial Neural Network (ANN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbors (KNN), and Naïve Bayes (NB). Principal Component Analysis (PCA) is applied to the dataset before modeling for feature selection and dimensionality reduction. The highest-accuracy model is the SVM with an RBF kernel and C-value selection. The LR model achieves an accuracy of 97.3%. The precision and recall of the SVM model, however, are both 100%. The Receiver Operating Characteristic (ROC) curve of the SVM also lies above that of LR, which is consistent with the confusion matrix, where the SVM's False Positive Rate is 0. Overall, the SVM model with an RBF kernel and C-value selection performs best, primarily because it produces no false positives: it never classifies a case as positive when it is actually negative.
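The link the abstract draws between the confusion matrix, precision, recall, and the False Positive Rate can be illustrated with a minimal sketch. The function name `confusion_metrics` and all counts below are hypothetical and are not taken from the study; they only show how the three quantities follow from the four confusion-matrix cells.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute precision, recall, and false positive rate from confusion-matrix counts."""
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # False positive rate: share of actual negatives wrongly flagged as positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"precision": precision, "recall": recall, "fpr": fpr}


# Hypothetical counts (assumption, not the paper's data).
# With no false positives and no false negatives, precision and recall
# are both 1.0 and the false positive rate is 0.
perfect = confusion_metrics(tp=40, fp=0, tn=70, fn=0)

# A classifier with a few errors, for contrast.
with_errors = confusion_metrics(tp=38, fp=3, tn=67, fn=2)

print(perfect)
print(with_errors)
```

A False Positive Rate of 0 only requires that no negative case is flagged as positive; precision and recall of 100% additionally require that no positive case is missed.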

    Bibliometric analysis of decision tree research for breast cancer prediction (Analisis bibliometrik penelitian pohon keputusan untuk prediksi kanker payudara)

    The aim of this paper is to conduct a bibliometric analysis of scientific publications on the use of the decision tree method for breast cancer prediction. A total of 322 documents were collected from Scopus and analyzed using bibliometric indicators such as productivity and citations. The analysis produced science maps based on keyword co-occurrence, co-authorship, and co-citation analysis, reflecting the conceptual, social, and intellectual structure of the research. The analysis of influential articles found an exponential increase in citations and in the number of authors over the period 2005-2023, with China as the dominant country conducting this research. The thematic map analysis yielded three research topics: medicine, computer science, and bioinformatics. Research on the use of the decision tree method for breast cancer prediction falls under computer science. The study suggests that the use of the decision tree method for breast cancer prediction is a research topic that should continue to be developed.
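Keyword co-occurrence, one of the mapping techniques named in the abstract, amounts to counting how often two keywords appear on the same document. The sketch below is a generic illustration using only the Python standard library; the function name `keyword_cooccurrence` and the sample records are hypothetical, and the paper does not state which tool or procedure it used.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records):
    """Count how many documents each unordered keyword pair appears in together."""
    counts = Counter()
    for keywords in records:
        # Normalize case and whitespace, and drop duplicates within one document.
        unique = sorted({k.strip().lower() for k in keywords})
        # Every unordered pair of keywords on the same document co-occurs once.
        counts.update(combinations(unique, 2))
    return counts


# Hypothetical keyword lists for three documents (assumption, not Scopus data).
records = [
    ["Decision Tree", "Breast Cancer", "Machine Learning"],
    ["breast cancer", "decision tree"],
    ["machine learning", "breast cancer"],
]

pairs = keyword_cooccurrence(records)
for pair, n in pairs.most_common():
    print(pair, n)
```

The resulting pair counts are the edge weights of a co-occurrence network; co-authorship and co-citation maps follow the same pattern with authors or cited references in place of keywords.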