7 research outputs found

    Software engineering in disease data mining: A systematic literature review

    An outbreak of coronavirus disease, first detected in Wuhan, China, has spread across the world, and many Covid-19 databases are now available for disease data mining. This article presents a systematic literature review that gives an overview of data mining applied to diseases. The articles reviewed were published from 2015 to 2020 and drawn from three selected databases (IEEE, ACM, ScienceDirect). They were analyzed with a focus on software engineering for disease data mining. The method used in this study is a systematic literature review. The review found that a wide variety of diseases have been studied, with heart disease the most frequently studied. The most widely used data mining method is Naive Bayes, while the highest accuracy, 99.73%, was achieved by Artificial Neural Networks applied to thalassemia. The countries conducting the most research on disease data mining are India and Turkey

    Heart Disease Prediction using Different Machine Learning Algorithms

    Identifying a person's potential for developing heart disease is one of the most challenging tasks medical professionals face today. With nearly one death from heart disease every minute, it is the leading cause of death in the modern era [4]. The dataset is taken from Kaggle. The machine learning algorithms used here for heart disease prediction are Random Forest, XGBoost, K-Nearest Neighbors (KNN), Logistic Regression, and Support Vector Machines (SVM). All of these algorithms are implemented in Python using Google Colab. The performance evaluation metrics are accuracy, precision, recall, and F1-score. Training and testing are carried out for train:test ratios of 60:40, 70:30, and 80:20. Across these comparisons, XGBoost has the highest accuracy and recall, and KNN has the lowest. XGBoost reaches a training accuracy of 98.86, 98.74, and 97.68 and a testing accuracy of 95.85, 95.45, and 96.09 for the 60:40, 70:30, and 80:20 ratios, respectively. The XGBoost algorithm can therefore be used to obtain the best heart disease predictions. This type of prediction can serve doctors as a fast secondary diagnostic tool and can support early detection of heart disease, increasing the chances of saving the patient's life
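
    The four evaluation metrics named in this abstract can be computed from a binary confusion matrix. The following is a minimal sketch in plain Python, not the authors' code: the function names, the toy labels, and the `split_sizes` helper are assumptions made for illustration, and the study's actual models and Kaggle dataset are not reproduced.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1-score as fractions in [0, 1]."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def split_sizes(n_samples, train_fraction):
    """Hypothetical helper: sizes of the train and test parts for a given ratio."""
    n_train = round(n_samples * train_fraction)
    return n_train, n_samples - n_train


# Toy labels (1 = heart disease, 0 = no heart disease), invented for the example.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)

# The three train:test ratios from the abstract, applied to 1000 samples.
sizes = [split_sizes(1000, f) for f in (0.6, 0.7, 0.8)]
```

    Recall is the metric most tied to missed diagnoses, since it falls whenever a patient with heart disease is predicted as healthy.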

    A Comparison Analysis of Machine Learning Algorithms on Cardiovascular Disease Prediction

    People nowadays are engrossed in their daily routines, concentrating on their jobs and other responsibilities while ignoring their health. Because of their hurried lifestyles and disregard for their health, the number of people becoming ill grows daily. Furthermore, much of the population suffers from diseases such as cardiovascular disease. Cardiovascular disease kills 35% of the world's population, according to the WHO. A person's life can be saved if heart disease is diagnosed early enough, but it can also be lost if the diagnosis is made incorrectly. Predicting heart disease will therefore become increasingly relevant in the medical sector. The volume of data collected by the medical industry and hospitals, on the other hand, can be overwhelming at times. Time-series forecasting and processing using machine learning algorithms can help healthcare practitioners become more efficient. In this study, we discuss heart disease, its risk factors, and machine learning techniques, and we compare various heart disease prediction algorithms. The goal of this research is to predict and assess heart problems

    Real Coded Binary Artificial Bee Colony (RC-BABC) Based Feature Selection and Relieff Based Feature Extraction Techniques for Heart Disease Prediction

    Diagnosing heart disease is a challenging task, and several intelligent diagnostic systems have been developed to improve diagnostic performance. However, low prediction accuracy remains a problem in these systems. To improve accuracy in predicting heart risks, a novel feature selection approach is proposed that employs a Real Coded Binary Artificial Bee Colony (RC-BABC) optimization algorithm with adaptive size for feature elimination. This method reduces computational time, improves prediction accuracy, enhances data quality, and saves resources in subsequent data collection phases. Once the features are selected, the feature extraction phase uses a ReliefF-based method to extract the important features from the heart disease data set. Feature scores are computed by comparing the feature values and class values of neighboring samples. The proposed RC-BABC optimization algorithm is compared with three well-known methods, namely an artificial neural network (ANN), a K-means clustering approach, and Classification and Regression Algorithm (C&RT), using accuracy, precision, recall, and F1-score. The proposed method achieved 96.77% accuracy, 98.8% recall, 97.8% precision, and a 98.34% F1-score
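
    The scoring idea described in this abstract, comparing a sample's feature values with those of its neighbors in the same class and in the other class, can be sketched as a simplified two-class Relief-style weighting. This is an illustration under stated assumptions, not the paper's implementation: the function name `relief_scores`, the Manhattan distance on range-normalized features, and the toy data are all hypothetical, and the RC-BABC selection step is not shown.

```python
def relief_scores(X, y, k=1):
    """Simplified two-class Relief-style feature weights.

    For every sample, a feature is rewarded when it differs on the k nearest
    samples of the other class (misses) and penalized when it differs on the
    k nearest samples of the same class (hits).
    """
    n, d = len(X), len(X[0])
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    # Guard against constant features to avoid division by zero.
    span = [(hi[j] - lo[j]) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def distance(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        others = [m for m in range(n) if m != i]
        by_distance = sorted(others, key=lambda m: distance(X[i], X[m]))
        hits = [m for m in by_distance if y[m] == y[i]][:k]
        misses = [m for m in by_distance if y[m] != y[i]][:k]
        for j in range(d):
            if hits:
                weights[j] -= sum(diff(X[i], X[m], j) for m in hits) / (len(hits) * n)
            if misses:
                weights[j] += sum(diff(X[i], X[m], j) for m in misses) / (len(misses) * n)
    return weights


# Toy data: feature 0 separates the two classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1],
     [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
scores = relief_scores(X, y, k=1)
```

    In this toy example the class-separating feature receives the higher score, so a threshold on the scores would keep it and drop the noise feature.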

    A new and automated risk prediction of coronary artery disease using clinical endpoints and medical imaging-derived patient-specific insights: protocol for the retrospective GeoCAD cohort study

    INTRODUCTION: Coronary artery disease (CAD) is the leading cause of death worldwide. More than a quarter of cardiovascular events are unexplained by current absolute cardiovascular disease risk calculators, and individuals without clinical risk factors have been shown to have worse outcomes. The 'anatomy of risk' hypothesis recognises that adverse anatomical features of coronary arteries enhance atherogenic haemodynamics, which in turn mediate the localisation and progression of plaques. We propose a new risk prediction method predicated on CT coronary angiography (CTCA) data and state-of-the-art machine learning methods based on a better understanding of anatomical risk for CAD. This may open new pathways in the early implementation of personalised preventive therapies in susceptible individuals as a potential key in addressing the growing burden of CAD. METHODS AND ANALYSIS: GeoCAD is a retrospective cohort study in 1000 adult patients who have undergone CTCA for investigation of suspected CAD. It is a proof-of-concept study to test the hypothesis that advanced image-derived patient-specific data can accurately predict long-term cardiovascular events. The objectives are to (1) profile CTCA images with respect to variations in anatomical shape and associated haemodynamic risk expressing, at least in part, an individual's CAD risk, (2) develop a machine-learning algorithm for the rapid assessment of anatomical risk directly from unprocessed CTCA images and (3) to build a novel CAD risk model combining traditional risk factors with these novel anatomical biomarkers to provide a higher accuracy CAD risk prediction tool. ETHICS AND DISSEMINATION: The study protocol has been approved by the St Vincent's Hospital Human Research Ethics Committee, Sydney-2020/ETH02127 and the NSW Population and Health Service Research Ethics Committee-2021/ETH00990. The project outcomes will be published in peer-reviewed and biomedical journals, scientific conferences and as a higher degree research thesis

    HEART DISEASE RISK CLASSIFICATION SYSTEM USING THE C4.5 ALGORITHM WITH A SMOTE APPROACH

    Heart disease is the leading cause of death worldwide. The World Health Organization (WHO) states that more than 17.9 million people around the world die each year from heart and blood vessel disease. According to the 2018 Basic Health Research (Riskesdas) data, the prevalence of heart disease in Indonesia is 1.5%, meaning that 15 out of every 1,000 Indonesians suffer from heart disease. The C4.5 algorithm is a data mining method that can be applied to classify heart disease risk. The dataset in this study was obtained from the UCI Machine Learning repository and contains 918 records and 12 attributes: Age, Sex, Cp, Trestbps, Chol, Fbs, Restecg, Thalach, Exang, Oldpeak, Slope, and the class. Heart disease classification uses the Synthetic Minority Over-sampling Technique (SMOTE). SMOTE works by synthesizing new samples from the minority class, resampling minority-class samples to balance the dataset. The web-based heart disease risk classification system, implemented in PHP, is expected to help the public screen early for a high risk of heart disease, so that people can learn their risk and respond with preventive measures. The output of the system is a heart disease risk classification together with treatment recommendations. The system was tested with black-box testing, and accuracy testing with a confusion matrix gave the highest accuracy, 81.37%, at a 90:10 ratio. Using the SMOTE approach raised accuracy by 3.92 percentage points, to 85.29%
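
    The oversampling step described in this abstract can be sketched as follows. This is a minimal plain-Python illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority-class neighbors), assuming numeric features only. The function name `smote`, its parameters, and the toy samples are hypothetical, and neither the study's PHP system nor its C4.5 classifier is reproduced.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic new minority-class samples by interpolation.

    Each synthetic sample lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def squared_distance(a, b):
        return sum((x - z) ** 2 for x, z in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        candidates = [m for m in range(len(minority)) if m != i]
        candidates.sort(key=lambda m: squared_distance(base, minority[m]))
        neighbor = minority[rng.choice(candidates[:k])]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbor)])
    return synthetic


# Toy minority-class samples with two numeric features, invented for the example.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_samples = smote(minority, n_synthetic=6, k=2)
balanced_minority = minority + new_samples
```

    Oversampling of this kind is normally applied to the training portion only, so that the held-out test data still reflects the original class distribution.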