26 research outputs found

    COMPARISON OF DATA MINING CLASSIFICATION METHODS TO DETECT HEART DISEASE

    Heart disease is deadly and must be treated as early as possible, because delayed treatment poses a serious risk to life. Its contributing factors include tobacco use, physical inactivity, and an unhealthy diet. Using existing data, this study compares three algorithms, Naive Bayes, Logistic Regression, and Support Vector Machine (SVM), to determine which achieves the best accuracy on the dataset used to predict heart disease. The best accuracy, 87%, was obtained with the Naive Bayes method.

    Rekayasa perangkat lunak pada data mining penyakit: Suatu tinjauan literatur sistematis [Software engineering in disease data mining: A systematic literature review]

    A coronavirus disease outbreak, first detected in Wuhan, China, has spread worldwide, and many databases on Covid-19 are now available for disease data mining. This article presents a systematic literature review that gives an overview of data mining applied to diseases. It covers articles published from 2015 to 2020 in three selected databases (IEEE, ACM, ScienceDirect), which were analyzed with a focus on software engineering for disease data mining. The review finds that a wide variety of diseases have been studied, with heart disease the most frequent. The most widely used data mining method is Naive Bayes, while the highest reported accuracy, 99.73%, was achieved by Artificial Neural Networks applied to thalassemia. The countries producing the most research on disease data mining are India and Turkey.

    Use of Data Mining for The Analysis of Consumer Purchase Patterns with The Fpgrowth Algorithm on Motor Spare Part Sales Transactions Data

    This study analyzes consumer purchasing patterns for motorcycle parts by applying data mining with the FP-Growth algorithm to sales transaction data, with the aim of giving companies information useful for planning marketing strategies and increasing sales. The data are one year of sales transactions from motorcycle parts stores, processed with FP-Growth to find significant purchasing patterns. The results show that FP-Growth can identify such patterns, including combinations of products that are often purchased together, the most active purchase times, and the most purchased product categories. These findings can help companies understand consumer purchasing behavior and so improve the effectiveness of their marketing strategies and increase sales of motorcycle parts. The novelty of this research lies in applying data mining and the FP-Growth algorithm to motorcycle parts sales transaction data.
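
The kind of pattern described in this abstract, products often purchased together, can be illustrated with a minimal sketch. Note that this is a brute-force enumeration of itemsets, not FP-Growth itself: FP-Growth reaches the same frequent itemsets far more efficiently by compressing the transactions into a prefix tree. The baskets, item names, and the `min_support` threshold below are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset found in at least
    min_support transactions.

    Brute force: enumerates all subsets of each basket, so it only
    suits tiny examples. FP-Growth avoids this enumeration.
    """
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # de-duplicate and fix item order
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical motorcycle-parts baskets (illustration only).
transactions = [
    ["spark plug", "engine oil", "oil filter"],
    ["engine oil", "oil filter"],
    ["spark plug", "brake pad"],
    ["engine oil", "oil filter", "brake pad"],
]

patterns = frequent_itemsets(transactions, min_support=3)
for itemset, count in sorted(patterns.items()):
    print(itemset, count)
```

With these baskets, engine oil and oil filter appear together in three of four transactions, so the pair is reported alongside each single item.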

    Hybrid of K-means clustering and naive Bayes classifier for predicting performance of an employee

    Predicting the future performance of employees is a requirement for companies to succeed. Employees are an organization's main component, and its success or failure depends on their performance, so identifying highly skilled employees has become an important concern for decision-makers and managers in almost all types of companies. Management is therefore involved in the success of these employees, particularly in ensuring that the right employee is assigned to a suitable job at the right time. Predictive analytics is a modern human resources trend in which data mining plays a useful role. To obtain a highly accurate model, the proposed framework combines K-Means clustering with Naïve Bayes (NB) classification for processing employee performance data. It is implemented in WEKA and enables personnel professionals and decision-makers to predict and optimize their employees' performance. Data taken from previous work were used as a test case to illustrate how combining the K-Means and Naïve Bayes algorithms increases the accuracy of employee performance prediction compared with using K-Means or Naïve Bayes alone.

    Random State Initialized Logistic Regression for Improved Heart Attack Prediction

    Heart attacks are one of the primary causes of death in Indonesia, so an effective method is needed to predict whether a patient is experiencing a heart attack. Machine learning models are one efficient approach, but models that perform well at predicting heart attacks are still rare. This study aims to develop a machine learning model based on the Logistic Regression algorithm for predicting heart attacks. Logistic Regression is a machine learning method that can be used to study the relationship between a binary response variable (0 or 1) and a set of predictor variables, and it can be used directly to calculate probabilities. In this study, a random state is initialized in the Logistic Regression model in order to stabilize training and increase the precision of the proposed method. The results show that the proposed model performs well in predicting heart attacks.
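
As a rough illustration of the idea in this abstract, the sketch below trains a small logistic regression by stochastic gradient descent and shows that fixing the seed makes training repeatable. It assumes that "random state" means a fixed seed for the pseudo-random number generator; the function names, the learning rate, and the toy data are hypothetical and do not reproduce the study's model or dataset.

```python
import math
import random


def train_logistic_regression(X, y, seed, epochs=200, lr=0.1):
    """Fit weights and bias by SGD; `seed` fixes the initial weights
    and the order in which samples are visited."""
    rng = random.Random(seed)  # all randomness flows from this seed
    weights = [rng.uniform(-0.01, 0.01) for _ in range(len(X[0]))]
    bias = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            z = bias + sum(w * x for w, x in zip(weights, X[i]))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            error = p - y[i]                # gradient of log loss w.r.t. z
            weights = [w - lr * error * x for w, x in zip(weights, X[i])]
            bias -= lr * error
    return weights, bias


def predict_proba(model, row):
    """Probability that `row` belongs to class 1."""
    weights, bias = model
    z = bias + sum(w * x for w, x in zip(weights, row))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical one-feature data: a normalized risk score and a 0/1 label.
X = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
y = [0, 0, 0, 1, 1, 1]

model_a = train_logistic_regression(X, y, seed=42)
model_b = train_logistic_regression(X, y, seed=42)
print(model_a == model_b)  # same seed, same fitted parameters
print(predict_proba(model_a, [0.9]), predict_proba(model_a, [0.1]))
```

A fixed seed makes runs reproducible and comparable. It does not by itself make a model more accurate, so any gain in precision would have to come from how the seed is chosen and evaluated.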

    A Novel Approach for Detecting Outliers by Using Isolation Forest with Reducing Under Fitting Issue

    The effectiveness of machine learning for a particular task depends on a variety of factors, first and foremost the representation and quality of the data. Extracting knowledge during training is more difficult when the data contain a lot of redundant, irrelevant, or incomplete information. It is well known that data preparation and filtering steps take a significant share of the running time of ML tasks. Data cleansing is essential to the accuracy of any model, and no predictive model can be accurate without it. This procedure is known as exploratory data analysis (EDA). This study discusses outlier identification, one of many EDA processes for producing clean data. The isolation forest approach is used to calculate an outlier factor, and an outlier-finding model is then built. Outlier detection is framed as a set of related supervised binary classification problems. In-depth tests on various datasets compare the new outlier-finding technique with the existing approach. The new approach yields better accuracy, precision, recall, and F1 score, and it also reduces the underfitting problem of the machine learning algorithms.
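
To make the isolation forest idea concrete, here is a minimal standard-library sketch of the standard anomaly score: points that random splits isolate after only a few steps receive scores close to 1. It is a simplified illustration rather than the authors' method. It builds every tree on the full dataset instead of a random subsample, and the function names, tree count, and toy points are assumptions.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup
    over n points; used to normalize path lengths."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, rng, depth, max_depth):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, rng, depth + 1, max_depth),
        "right": build_tree(right, rng, depth + 1, max_depth),
    }


def path_length(node, point, depth=0):
    """Depth at which `point` lands in a leaf, plus a leaf-size correction."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(child, point, depth + 1)


def anomaly_scores(points, n_trees=100, seed=0):
    """Score each point in (0, 1]; higher means easier to isolate."""
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(len(points)))
    trees = [build_tree(points, rng, 0, max_depth) for _ in range(n_trees)]
    norm = c_factor(len(points))
    return [
        2.0 ** (-(sum(path_length(t, p) for t in trees) / n_trees) / norm)
        for p in points
    ]


# Hypothetical data: 20 clustered points plus one far-away point.
inliers = [(i * 0.05, ((i * 7) % 20) * 0.05) for i in range(20)]
points = inliers + [(10.0, 10.0)]
scores = anomaly_scores(points)
print(scores.index(max(scores)))  # index of the most anomalous point
```

The far-away point is usually separated by the very first split, so its average path is short and its score is the highest in the set.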

    Novel Classification and Prediction of Heart Disease using CDMA Algorithm

    A chronic illness is a long-term condition that lasts a lifetime. In most cases, immunizations and medications cannot cure it. Heart disease is among the most common chronic illnesses. The first step in stopping the progression of these disorders is patient diagnosis and prognosis. Machine learning (ML) and deep learning (DL) may make it easier to identify individuals with heart disease. Identifying people who are at risk for these well-known illnesses often depends on a variety of factors. Deep learning provides high precision but needs a lot of data, whereas machine learning provides less precision but can be trained on less data, so the weakness of one technique can be offset by the other. To classify and forecast heart disease, this research developed the Combination of Machine Learning and Deep Learning Algorithm (CMDA). The data set was taken from the UCI data repository. The CMDA algorithm combines the Dl4jMlpClassifier and the Support Vector Machine (SVM) through a stacking classifier, with Naive Bayes as the meta-classifier for the final prediction. After prediction, the CMDA method uses Min-Max normalization to determine the risk factor. According to the experimental findings, the proposed CMDA algorithm effectively classifies and forecasts heart disease and produces strong results compared with existing methods.
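
The Min-Max normalization mentioned in this abstract rescales each value x to (x - min) / (max - min), so the smallest value maps to 0 and the largest to 1. The sketch below shows only that rescaling step. How CMDA derives a risk factor from it is not described in the abstract, and the input scores here are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values linearly so that min -> 0.0 and max -> 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all values equal: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw scores for four patients (illustration only).
raw_scores = [120, 150, 180, 240]
risk = min_max_normalize(raw_scores)
print(risk)  # [0.0, 0.25, 0.5, 1.0]
```

Because the result depends on the minimum and maximum of the sample, a single extreme value compresses all the others toward 0.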

    Role of Feature Selection in Building High Performance Heart Disease Prediction Systems

    In the last few years, there has been a tremendous rise in the number of deaths due to heart disease all over the world. In low- and middle-income countries, heart disease is usually not detected at an early stage, which makes treatment difficult. Early diagnosis can help significantly in preventing these diseases. Machine learning-based prediction systems offer a cost-effective and efficient way to diagnose them at an early stage, and research is being carried out to increase the performance of these systems. Redundant and irrelevant features in a medical dataset degrade the performance of prediction systems. In this paper, an exhaustive study has been done to improve the performance of prediction systems by applying four feature selection algorithms. Experimental results show that feature selection provides a substantial increase in the accuracy and execution speed of the prediction system. The prediction system proposed in this study should help prevent heart disease by enabling medical practitioners to detect it at an early stage.