14 research outputs found

    Software engineering in disease data mining: A systematic literature review (Rekayasa perangkat lunak pada data mining penyakit: Suatu tinjauan literatur sistematis)

    An outbreak of coronavirus disease, first detected in Wuhan, China, has spread worldwide, and many databases on Covid-19 are now available for disease data mining. This article conducts a systematic literature review to give an overview of data mining applied to diseases. The articles reviewed were published between 2015 and 2020 and drawn from three selected databases (IEEE, ACM, ScienceDirect). They were analyzed with a focus on software engineering for disease data mining. The method used in this study is a systematic literature review. The review found that a wide variety of diseases have been studied, with heart disease studied most often. The most frequently used data mining method was Naive Bayes, while the highest accuracy, 99.73%, was achieved by Artificial Neural Networks applied to thalassemia. The countries producing the most research on disease data mining were India and Turkey.

    Structured and unstructured data integration with electronic medical records

    Medicine is a field of rapid change. Every day, new discoveries and procedures are tested with the sole goal of providing a better quality of life to patients. With the evolution of computer science, multiple fields saw increased productivity and new solutions that could be implemented. In medicine specifically, new techniques began to be tested in order to understand how the systems and practices in use can reach higher performance while maintaining the predefined high standards of quality. For many years, data generated in hospitals was collected and stored, yet few tools were implemented to extract knowledge or any other advantage from it. One area successfully adopted in medicine was the use of data processing tools and techniques to extract information from the abundance of data generated daily in this field. This data can be stored in different ways, which leads to multiple approaches to dealing with it. The purpose of this paper is to give an overview of some case studies where structured and unstructured data were used, jointly and separately, and of the value this brought.

    Improvement of Alzheimer disease diagnosis accuracy using ensemble methods

    Nowadays, there is a significant increase in medical data that we should take advantage of. Applying machine learning through data mining processes such as data classification relies either on a single classification algorithm or on several algorithms combined as ensemble models. The objective of this work is to improve on the classification accuracy of previous results for Alzheimer disease diagnosis. The Decision Tree algorithm was combined with three types of ensemble methods: Boosting, Bagging, and Stacking. The clinical dataset from the Open Access Series of Imaging Studies (OASIS) was used in the experiments. The experimental results of the proposed approach were better than those of previous work: Random Forest (Bagging) achieved the highest accuracy among all algorithms with 90.69%, while Stacking achieved the lowest with 79.07%. All results reported in this paper are higher in accuracy than those obtained previously.
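
To make the idea of bagging concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: it assumes a single numeric feature, uses one-split decision trees (stumps) as base learners, and the toy data, function names, and default of 25 estimators are all hypothetical choices for illustration. Each stump is trained on a bootstrap resample, and the ensemble predicts by majority vote.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-split decision tree on one numeric feature.

    Returns (threshold, left_label, right_label); threshold is None
    when the sample offers nothing to split on.
    """
    majority = Counter(ys).most_common(1)[0][0]
    values = sorted(set(xs))
    if len(values) < 2 or len(set(ys)) < 2:
        # Single value or single class: always predict the majority class.
        return (None, majority, majority)
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_label, right_label = stump
    if threshold is None or x <= threshold:
        return left_label
    return right_label


def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Train each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagging(stumps, x):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, x) for s in stumps)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: class 0 below 5, class 1 above 5.
xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_bagging(xs, ys)
print(predict_bagging(model, 2), predict_bagging(model, 8))
```

Boosting and stacking differ in how the base learners are trained and combined (sequential reweighting and a learned meta-model, respectively); they are not shown here.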

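    Automatic Prediction Model for Hypertension Type Using the Artificial Neural Network Machine Learning Algorithm (Model Prediksi Otomatis Jenis Penyakit Hipertensi dengan Pemanfaatan Algoritma Machine Learning Artificial Neural Network)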

    Hypertension is a major factor in the development of diseases such as stroke, heart failure, myocardial infarction, atrial fibrillation, peripheral artery disease, and aortic dissection. Early prediction of the type of hypertension from a patient's health history is important for identifying the diseases it may cause. Such predictions can be obtained by using machine learning to discover new knowledge from the underlying data, finding patterns that are valid, useful, and easy to learn. This study proposes a neural network classification model, which is its contribution. We observe that previous researchers pursued only high accuracy values. In contrast to that earlier work, we apply the hyperparameter optimization technique gridsearch cv to an artificial neural network classification model. The parameters used in the model are solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(5, 2), random_state=1. The prediction accuracy for determining the type of hypertension is 85%, which indicates that the model performs reasonably well at classification.

    Implementation of the Random Forest Algorithm to Determine Recipients of Raskin Assistance (Implementasi Algoritma Random Forest Untuk Menentukan Penerima Bantuan Raskin)

    Poverty is one of the fundamental concerns of every government. The Rice for Poor Families program (Program Beras Keluarga Miskin, Raskin) is one of the government's programs. The Raskin scheme aims to reduce the burden on poor households, as a form of assistance that improves food security through social protection. The purpose of this study is to find the most accurate of the proposed predictive classification algorithms for identifying Raskin beneficiaries, using Python machine learning tools, and to implement it through a website. Classification is a data mining method that assigns categories to groups of data to support more accurate prediction and analysis. Three machine learning classification algorithms, support vector machine (SVM), naive Bayes (NB), and random forest (RF), were used in this study to determine recipients of Raskin assistance. The experiment was carried out using the Raskin dataset of Gunungparang Village (Kelurahan Gunungparang), Sukabumi City, sourced from the village itself. The performance of the classification algorithms was evaluated with several metrics: Precision, Accuracy, F-Measure, and Recall. Accuracy is measured from the samples grouped correctly or incorrectly. The results show that the random forest classification algorithm has precision, recall, and F-measure values of 97%, an accuracy of 97.26%, and an ROC value of 0.970, better than the other classification algorithms: a difference of 5.11% from the support vector machine and 8.87% from naive Bayes. Accuracy is a very good reference for algorithm performance when the numbers of False Negatives and False Positives are close. These results were verified accurately and systematically using the Receiver Operating Characteristic (ROC).
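
The precision, recall, and F-measure named in the abstract can be computed from confusion-matrix counts as in the sketch below. The counts used are hypothetical and are not taken from the Raskin dataset; the function name is likewise only illustrative.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F-measure (F1) from confusion-matrix counts."""
    precision = tp / (tp + fp)   # share of predicted recipients that are correct
    recall = tp / (tp + fn)      # share of true recipients that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 90 true positives, 10 false positives, 10 false negatives.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=10)
print(round(precision, 2), round(recall, 2), round(f1, 2))
```

With equal numbers of false positives and false negatives, as in this toy case, precision and recall coincide, which is the situation in which the abstract considers accuracy a good summary.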

    LASSO Regression Modeling on Prediction of Medical Terms among Seafarers' Health Documents Using Tidy Text Mining

    Generally, seafarers face a higher risk of illnesses and accidents than land workers. In most cases, there are no medical professionals on board seagoing vessels, which makes disease diagnosis even more difficult. In such cases, onshore doctors may be able to provide medical advice through telemedicine when they receive better symptomatic and clinical details in seafarers' health abstracts. The adoption of text mining techniques can assist in extracting diagnostic information from clinical texts. We applied lexicon-based sentiment analysis to explore the automatic labeling of positive and negative healthcare terms in seafarers' textual healthcare documents. This was motivated by the lack of experimental evaluations using computational techniques. To classify diseases and their associated symptoms, the LASSO regression algorithm was applied to these text documents. Analyzing TF-IDF values makes it possible to visualize the frequency of symptomatic terms for each disease. The proposed approach classifies text documents with 93.8% accuracy using a LASSO regression model. It is possible to classify text documents effectively with tidy text mining libraries. In addition to delivering health assistance, this method can be used to classify diseases and establish health observatories. Knowledge developed in the present work will be applied to establish an Epidemiological Observatory of Seafarers' Pathologies and Injuries. This Observatory will be a collaborative initiative of the Italian Ministry of Health, University of Camerino, and International Radio Medical Centre (C.I.R.M.), the Italian TMAS.
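
As an illustration of the TF-IDF values mentioned above, the following sketch computes them in plain Python under one common definition: term frequency is a term's count divided by the document length, and inverse document frequency is the natural log of the number of documents divided by the number of documents containing the term. The abstract does not state which variant the authors used, and the three toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: tf-idf score} dict per document.

    tf  = count of term in document / total terms in document
    idf = ln(number of documents / number of documents containing term)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical symptom notes, one string per document.
docs = ["fever cough fever", "cough headache", "rash itching rash"]
scores = tf_idf(docs)
for term, value in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(term, round(value, 3))
```

A term that is frequent in one document but rare across the collection ("fever" here) scores higher than a term shared between documents ("cough"), which is what makes the measure useful for picking out terms characteristic of one disease.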

    Bayesian networks for disease diagnosis: What are they, who has used them and how?

    A Bayesian network (BN) is a probabilistic graph based on Bayes' theorem, used to show dependencies or cause-and-effect relationships between variables. BNs are widely applied in diagnostic processes since they allow medical knowledge to be incorporated into the model while expressing uncertainty in terms of probability. This systematic review presents the state of the art in the applications of BNs in medicine in general and in the diagnosis and prognosis of diseases in particular. Indexed articles from the last 40 years were included. The studies generally used the typical measures of diagnostic and prognostic accuracy: sensitivity, specificity, accuracy, precision, and the area under the ROC curve. Overall, we found that disease diagnosis and prognosis based on BNs can be successfully used to model complex medical problems that require reasoning under conditions of uncertainty.
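
The simplest possible Bayesian network for diagnosis has two nodes, Disease and Test, with one edge from Disease to Test. The sketch below applies Bayes' theorem to that network to obtain the probability of disease given a positive test. The prevalence, sensitivity, and specificity values are hypothetical and do not come from the review.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """P(disease | positive test) for a two-node network Disease -> Test.

    Bayes' theorem: P(D | +) = P(+ | D) P(D) / P(+), where
    P(+) = P(+ | D) P(D) + P(+ | not D) P(not D).
    """
    true_positive = sensitivity * prior
    false_positive = (1 - specificity) * (1 - prior)
    return true_positive / (true_positive + false_positive)


# Hypothetical values: 1% prevalence, 90% sensitivity, 95% specificity.
p = posterior_given_positive(prior=0.01, sensitivity=0.90, specificity=0.95)
print(round(p, 3))
```

Even with a fairly accurate test, the posterior stays low when the disease is rare, which is the kind of reasoning under uncertainty that larger networks extend to many interacting variables.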

    Outcome prediction of electroconvulsive therapy for depression

    Introduction: We developed and tested a Bayesian network (BN) model to predict ECT remission for depression, with non-response as a secondary outcome. Methods: We performed a systematic literature search on clinically available predictors. We combined these predictors with variables from a dataset of clinical ECT trajectories (performed in the University Medical Center Utrecht) to create priors and train the BN. Temporal validation was performed in an independent sample. Results: The systematic literature search yielded three meta-analyses, which provided prior knowledge on outcome predictors. The clinical dataset consisted of 248 treatment trajectories in the training set and 44 trajectories in the test set at the same medical center. The AUC for the primary outcome, remission, estimated on an independent validation set was 0.686 (95% CI 0.513–0.859); AUC values of 0.505–0.763 were observed in 5-fold cross-validation of the model within the training set. After temporal validation in the independent sample, accuracy was 0.73 (balanced accuracy 0.67), sensitivity 0.55, and specificity 0.79. Prior literature information marginally reduced CI width. Discussion: A BN model combining prior knowledge and clinical data can predict remission of depression after ECT with reasonable performance. This approach can be used to make outcome predictions in psychiatry, and offers a methodological framework to weigh additional information, such as patient characteristics, symptoms and biomarkers. In time, it may be used to improve shared decision-making in clinical practice.
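
Balanced accuracy is the mean of sensitivity and specificity, so the figures reported above can be checked against each other: (0.55 + 0.79) / 2 = 0.67. The sketch below shows the calculation from confusion-matrix counts. The abstract does not report the confusion matrix; the counts used here are hypothetical, chosen only as an example of a 44-case test set.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy, and balanced accuracy from counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


# Consistency check using only the values reported in the abstract.
reported_balanced = (0.55 + 0.79) / 2
print(round(reported_balanced, 2))

# Hypothetical counts for illustration (not taken from the paper).
metrics = diagnostic_metrics(tp=6, fn=5, tn=26, fp=7)
print({name: round(value, 2) for name, value in metrics.items()})
```

When the classes are imbalanced, plain accuracy is pulled toward the performance on the larger class, which is why balanced accuracy is reported alongside it.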
