3 research outputs found

    Comparison of Support Vector Machine and K-Nearest Neighbors in Breast Cancer Classification

    Cancer is one of the leading causes of death, and breast cancer is the second leading cause of cancer death in women. One way to assess the malignancy of breast cancer early on is to classify it using data mining. Two widely used data mining methods with good accuracy are the Support Vector Machine (SVM) and K-Nearest Neighbors (KNN). Percentage split and cross-validation were used to evaluate and compare the SVM and KNN classification models. With cross-validation, SVM was more accurate than KNN, reaching 95.7081%. With percentage split, KNN was more accurate than SVM, reaching 95.4220%. The comparison shows that both KNN and SVM work well for breast cancer classification.
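
The two evaluation techniques named in this abstract differ in how they partition the data: a percentage split holds out one fixed portion for testing, while cross-validation rotates the test portion so that every sample is tested once. The sketch below illustrates that difference with a small k-NN classifier in plain Python. The toy data, the 66% split, the 10 folds, and `k=3` are all assumptions chosen for illustration; the abstract does not state the settings or dataset details the authors used.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Squared Euclidean distance is enough for ranking neighbours.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def accuracy(X, y, train_idx, test_idx, k=3):
    """Fraction of test points that k-NN, fit on train_idx, labels correctly."""
    train_X = [X[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)

def percentage_split(n, train_fraction=0.66):
    """One train/test partition: the first part trains, the rest tests."""
    cut = int(n * train_fraction)
    idx = list(range(n))
    return idx[:cut], idx[cut:]

def k_fold(n, folds=10):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    idx = list(range(n))
    for f in range(folds):
        test = idx[f::folds]          # every folds-th sample, offset by f
        test_set = set(test)
        train = [i for i in idx if i not in test_set]
        yield train, test

# Hypothetical toy data: two well-separated 2-D clusters, shuffled.
rng = random.Random(0)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)] + \
    [(rng.gauss(6, 1), rng.gauss(6, 1)) for _ in range(50)]
y = ["benign"] * 50 + ["malignant"] * 50
order = list(range(100))
rng.shuffle(order)
X = [X[i] for i in order]
y = [y[i] for i in order]

split_acc = accuracy(X, y, *percentage_split(len(X)))
cv_scores = [accuracy(X, y, tr, te) for tr, te in k_fold(len(X))]
cv_acc = sum(cv_scores) / len(cv_scores)
print(f"percentage split: {split_acc:.3f}, 10-fold CV: {cv_acc:.3f}")
```

Because a percentage split scores a single held-out portion while cross-validation averages over all folds, the two techniques can rank the same pair of classifiers differently, which is consistent with the mixed result the abstract reports.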

    Bibliometric analysis of decision tree research for breast cancer prediction

    The purpose of this paper is to conduct a bibliometric analysis of scientific publications that discuss the use of the decision tree method for breast cancer prediction. A total of 322 documents from Scopus were collected and analyzed using bibliometric indicators such as productivity and citations. The bibliometric analysis produced science mapping based on keyword co-occurrence, co-authorship, and co-citation analysis to reflect the conceptual, social, and intellectual structure of the research. The analysis of influential articles found an exponential increase in citations and in the number of authors in this research area over the period 2005-2023, with China as the dominant country conducting the research. The thematic map analysis produced three research topics: medicine, computer science, and bioinformatics. Research on the use of the decision tree method for breast cancer prediction falls within the computer science field. This study suggests that research on the use of the decision tree method for breast cancer prediction is a topic that needs continued development.

    Reliable water quality prediction and parametric analysis using explainable AI models

    Water consumption underpins the physical health of most living species, so managing its purity and quality is essential, as contaminated water has the potential to cause adverse health and environmental consequences. This creates a pressing need to measure, control and monitor the quality of water. The primary contaminant present in water is Total Dissolved Solids (TDS), which is hard to filter out. Beyond mere solids, water contains various substances such as potassium, sodium, chlorides, lead, nitrate, cadmium, arsenic and other pollutants. The proposed work aims to automate water quality estimation through Artificial Intelligence and uses Explainable Artificial Intelligence (XAI) to explain the most significant parameters contributing to the potability of water and the estimation of impurities. XAI provides the transparency and justifiability of a white-box model, whereas the Machine Learning (ML) model itself is a black box that cannot describe the reasoning behind its classification. The proposed work uses various ML models such as Logistic Regression, Support Vector Machine (SVM), Gaussian Naive Bayes, Decision Tree (DT) and Random Forest (RF) to classify whether the water is drinkable. The various XAI representations generated by the SHAP (Shapley) explainer, such as the force plot, test patch, summary plot, dependency plot and decision plot, explain the significant features, prediction score, feature importance and justification behind the water quality estimation. The RF classifier is selected for the explanation and yields optimum Accuracy and F1-Score of 0.9999, with Precision and Recall of 0.9997 and 0.998 respectively. Thus, the work is an exploratory analysis of the estimation and management of water quality, with indicators ranked by their significance. This is emerging research with a vision of addressing water quality for the future as well.
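
For readers unfamiliar with the four scores reported in this abstract, the sketch below shows how Accuracy, Precision, Recall and F1-Score are conventionally derived from a binary confusion matrix. The function name `classification_metrics` and the confusion counts are hypothetical, chosen only to make the arithmetic easy to follow; they are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification scores from confusion-matrix counts.

    tp/fp/fn/tn = true positives, false positives, false negatives,
    true negatives (here "positive" would mean "water is drinkable").
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts, for illustration only.
m = classification_metrics(tp=90, fp=10, fn=30, tn=870)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Since F1 is a harmonic mean, it always lies between the precision and recall values it is computed from.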