320 research outputs found

    Comparison of Adaboost Application to C4.5 and C5.0 Algorithms in Student Graduation Classification

    Students serve as a benchmark for assessing quality and evaluating college learning plans, so students who do not graduate on time can affect accreditation assessments. The characteristics of students who graduate on time or late can be analyzed using classification techniques in data mining, namely the C4.5 and C5.0 algorithms. The purpose of this study is to compare the application of the AdaBoost algorithm to the C4.5 and C5.0 algorithms in classifying student graduation. The data are the graduation records of students of the Statistics Study Program at Tanjungpura University (Untan) from Period I of the 2017/2018 academic year to Period II of the 2022/2023 academic year. The analysis begins by calculating the entropy, gain and gain ratio values. Each data point is then given the same initial weight, and the procedure is iterated 100 times. In the C5.0 classification results, the attribute with the highest gain ratio is school accreditation, meaning that school accreditation has the most influence on the classification of student graduation. AdaBoost applied to C5.0 classifies the graduation of Untan Statistics Study Program students better than AdaBoost applied to C4.5. AdaBoost increased the accuracy of C5.0 by 12.14% and the accuracy of C4.5 by 10.71%.
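    The entropy, gain and gain ratio step named in the abstract above can be illustrated with a small sketch. It follows the standard C4.5 definitions for a categorical attribute; the function names and the toy accreditation and graduation values are assumptions made for illustration and are not taken from the study's data.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy that remains after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information is the entropy of the attribute values themselves.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: school accreditation versus graduation outcome.
accreditation = ["A", "A", "B", "B"]
graduation = ["on time", "on time", "late", "late"]
ratio = gain_ratio(accreditation, graduation)
```

    In this toy case the attribute separates the two classes perfectly, so the gain equals the class entropy. An attribute with a single value has zero split information, and the sketch returns 0.0 for it rather than dividing by zero.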

    The Utility of Decision Tree and Analytics Hierarchy Process in Prioritizing of Social Aid Distribution due to Covid-19 Pandemic in Indonesia

    The Indonesian government provided various social assistance programs to local governments during Covid-19. One of the difficulties local governments face in determining candidates for social aid is keeping the number of candidates in balance with the available quota, so they must select the most eligible candidates. This study proposes a priority model that recommends candidates who meet the criteria for social assistance. The six parameters used in this study were: number of dependents, occupation, income, age, Covid status, and citizen status. The model operates in two stages, namely classification followed by ranking. The classification stage uses a decision tree, while the ranking stage uses the Analytical Hierarchy Process (AHP) algorithm. The decision tree separates two classes, namely eligible and non-eligible. The classification process is also used to determine the dominant attributes that play a role in the modeling. The proposed model generates a list of the most eligible candidates, sorted by weight in descending order with the most eligible first, using five dominant parameters: number of dependents, income, age, Covid status, and citizen status.
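    As a rough illustration of the ranking stage described above, the sketch below derives AHP priority weights from a pairwise comparison matrix and ranks candidates by weighted score. It uses the row geometric mean approximation, which is one common way to compute AHP weights and is not necessarily the one used in the study. The comparison matrix, candidate names, and criterion scores are all hypothetical.

```python
from math import prod


def ahp_weights(matrix):
    """Approximate AHP priority weights using the row geometric mean method."""
    n = len(matrix)
    geo_means = [prod(row) ** (1 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def rank_candidates(candidates, weights):
    """Return candidate names sorted by weighted score, highest first."""
    scores = {
        name: sum(w * x for w, x in zip(weights, values))
        for name, values in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical pairwise comparisons for three criteria:
# entry [i][j] states how much more important criterion i is than criterion j.
comparisons = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(comparisons)

# Hypothetical candidates with criterion scores already normalized to [0, 1].
candidates = {
    "candidate_1": [1.0, 0.0, 0.0],
    "candidate_2": [0.0, 1.0, 1.0],
}
ranking = rank_candidates(candidates, weights)
```

    A full AHP implementation would also check the consistency ratio of the comparison matrix before trusting the weights; that step is omitted here to keep the sketch short.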

    Application of multiobjective genetic programming to the design of robot failure recognition systems

    We present an evolutionary approach using multiobjective genetic programming (MOGP) to derive optimal feature extraction preprocessing stages for robot failure detection. This data-driven machine learning method is compared both with conventional (nonevolutionary) classifiers and a set of domain-dependent feature extraction methods. We conclude MOGP is an effective and practical design method for failure recognition systems with enhanced recognition accuracy over conventional classifiers, independent of domain knowledge

    The Combination of C4.5 with Particle Swarm Optimization in Classification of Class for Mental Retardation Students

    Mental retardation, or brain weakness, is a condition in children who experience mental disorders. Several characteristics indicate that a child has mental retardation. When such students enter a school, teachers are expected to determine the right class for them according to their category. Data mining is the process of finding patterns in selected data using artificial intelligence and machine learning, and the C4.5 algorithm is one of its classification techniques. C4.5 can be used to build decision trees and classify data with numeric, continuous, and categorical attributes. However, C4.5 has difficulty handling large amounts of data and cannot rank every alternative. Particle Swarm Optimization (PSO) is an optimization algorithm that, when used for feature selection, can improve classification performance. This study therefore proposes overcoming the weaknesses of C4.5 by combining it with PSO. It aims to classify new mental retardation students into classes, using C4.5 as the classifier and PSO as the feature selector to determine the attributes that affect accuracy. The contribution of this research is to make it easier for the school to place new mental retardation students in a class that is appropriate to their needs. The model is validated with 10-fold cross-validation and evaluated with a confusion matrix. The accuracy of C4.5 was 91% before PSO and 93% with PSO. Of the 20 attributes, 6 affect the level of accuracy. This study shows that PSO can be used for feature selection and raises the accuracy of C4.5 by 2 percentage points.
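    The validation and evaluation steps mentioned above (10-fold cross-validation and a confusion matrix) can be sketched in a few lines. This is a minimal illustration only: the fold assignment is a simple round-robin split, the class labels are hypothetical, and no classifier is trained here.

```python
from collections import Counter


def k_fold_indices(n_samples, k=10):
    """Split sample indices 0..n_samples-1 into k round-robin folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs."""
    return Counter(zip(actual, predicted))


def accuracy(matrix):
    """Share of predictions on the diagonal of the confusion matrix."""
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / sum(matrix.values())


# Hypothetical labels for four students in one test fold.
actual = ["mild", "mild", "moderate", "moderate"]
predicted = ["mild", "moderate", "moderate", "moderate"]
matrix = confusion_matrix(actual, predicted)
fold_accuracy = accuracy(matrix)

# Each fold serves once as the test set while the other nine form the training set.
folds = k_fold_indices(20, k=10)
```

    In a full evaluation, the accuracy would be computed once per fold and then averaged over the ten folds.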

    Selection of Home Wifi Internet: Machine Learning Implementation With Decision Tree C4.5 Algorithm Method

    Internet service providers offer many bandwidth options, which makes choosing difficult, especially for ordinary people unfamiliar with the internet; as a result, most people choose based on a reasonable price. Many users also complain that their internet is slow and difficult to use. The blame then falls on internet service providers, who are thought to offer poor service. One factor contributing to slow internet is a bandwidth allocation that does not correspond to the user’s needs, so the appropriate bandwidth must be chosen based on the requirements of each user. According to the current literature, the C4.5 decision tree method delivers the best and most accurate decisions compared with other algorithms. This project therefore develops a web application based on the C4.5 decision tree algorithm that helps determine bandwidth and internet service in line with community needs. The C4.5 decision tree bases its decisions on patterns identified in previously collected data, from which predictions about various forms of internet use in the neighborhood can then be produced. The calculated accuracy is 0.54, or 54%. Black box testing indicated that the bandwidth determination application was functioning correctly.

    Analysis of the C4.5 Method to Identify Factors in Student Satisfaction with Online Learning (ANALISA METODE C4.5 UNTUK MENGETAHUI FAKTOR KEPUASAN MAHASISWA TERHADAP PEMBELAJARAN DARING)

    During the current Covid-19 pandemic, productivity has changed: public activities must be restricted to break the chain of transmission of the virus, and all non-essential activities are carried out online wherever possible, including teaching and learning at universities. Online teaching and learning needs to be evaluated so that students stay motivated and knowledge transfer proceeds well. The purpose of this study is to evaluate online learning at universities in order to determine the level of student satisfaction with online learning and to identify the factors behind that satisfaction. The method consists of problem analysis and identification, a literature review, data collection and interviews with relevant parties, followed by applying data mining techniques to the data, classifying the data with the C4.5 algorithm, and simulating with RapidMiner. The data are the results of a questionnaire completed by students, based on variables or criteria relating to their online learning, and are processed with data mining techniques using the C4.5 algorithm, a method that has high accuracy for classification. The results show that the variable with the greatest influence on student satisfaction with online learning is interaction facilities, followed by the facilities of the online learning objects, which support the completeness of the learning objects.

    Comparative Analysis of the C4.5 Algorithm and the Nearest Neighbor for the Number of Prospective New Student Registrants

    In 2015, the number of new student registrants at Muhammadiyah University of Gorontalo (UMG) increased by about 20% to 50% compared with 2014. In the 2017/2018 academic year, however, only around 4,713 candidates registered for bachelor’s and 1,256 for Bachelor’s Degree, and in the 2018/2019 academic year the figures fell to around 4,187 for bachelor’s and only 765 for Bachelor’s Degree, a decline from the previous year. This study aims to help predict the number of prospective new students who will enroll in the following academic year by comparing the C4.5 and Nearest Neighbor algorithms to find which gives the best results. Both algorithms need to identify patterns in the data on prospective students in order to predict their number, and these predictions can help increase enrollment in line with the targets set by UMG.

    Classification and Clustering of Internet Quota Sales Data Using C4.5 Algorithm and K-Means

    An internet quota is a limit on internet data use, and a quota can run out after only a short period of use. In the Candimas, South Abung area, many quotas are sold in numerous variants, and visitors to quota outlets can choose from a wide range of quota products. In addition to the guaranteed quality of the quotas sold, sales increase every year, especially across the various quota variants. This study analyzes internet quota sales statistics in the Candimas area, based on quota data from 2019 to 2022. For 2021-2022, classification reaches an accuracy of up to 100%, with best-selling items dominating the data, while clustering remains at the same figures: 19 items sell very well, 43 items sell poorly, and 178 items did not sell. We use the C4.5 classification algorithm and K-Means clustering to identify patterns in the data and provide insight into which brands' quotas are the most popular. Our findings can help the owners of the Xena Cell counter make informed decisions about which quotas to add or remove to optimize sales and minimize losses.

    Variational Autoencoder Based Estimation Of Distribution Algorithms And Applications To Individual Based Ecosystem Modeling Using EcoSim

    Individual-based modeling provides a bottom-up approach in which interactions give rise to high-level phenomena in patterns equivalent to those found in nature. This method generates an immense amount of data through artificial simulation, which can be made tractable by machine learning that optimizes and transforms multidimensional data. Using the individual-based modeling platform EcoSim, we modeled the abilities of elitist sexual selection and communication of fear. Data from these experiments were reduced in dimension using a novel algorithm that we propose: Variational Autoencoder based Estimation of Distribution Algorithms with Population Queue and Adaptive Variance Scaling (VAE-EDA-Q AVS). We constructed a novel Estimation of Distribution Algorithm (EDA) by extending the generative models known as variational autoencoders (VAE). Our VAE-EDA-Q smooths the data generation process using an iteratively updated queue (Q) of populations. Adaptive Variance Scaling (AVS) dynamically updates the variance at which models are sampled, based on fitness. The combination of VAE-EDA-Q with AVS demonstrates high computational efficiency and requires few fitness evaluations. We extended VAE-EDA-Q AVS to act as a feature-reducing wrapper method in conjunction with C4.5 decision trees to reduce the dimensionality of data. The relationship between sexual selection, random selection, and speciation is a contested topic. Supporting evidence suggests that sexual selection drives speciation, while opposing evidence contends that the correlation is either negative or absent. We used EcoSim to model elitist and random mate selection. Our results demonstrated a significantly lower speciation rate, a significantly lower extinction rate, and a significantly higher turnover rate for sexual selection groups. Species diversification showed no significant difference. The relationship between communication and foraging behavior similarly features opposing hypotheses, which claim both increases and decreases in foraging behavior in response to alarm communication. Through modeling with EcoSim, we found that alarm communication decreases foraging activity in most cases, yet gradually increases it in some others. Furthermore, we found that both outcomes of alarm communication increase fitness compared with non-communication.