8 research outputs found

    Visualisasi Secara Spasial Cluster Kerusakan Sarana dan Prasarana Sekolah (Spatial Visualization of School Facility and Infrastructure Damage Clusters)

    Abstract: School facilities and infrastructure are one indicator of education quality. Limited funding for improving them is a constraint the government must take into account. Grouping schools by the condition of damage to their facilities and infrastructure can therefore serve as a recommendation for prioritizing which schools receive improvement aid. This study groups the condition of school facilities and infrastructure using CRISP-DM as the data mining methodology and the k-means algorithm for clustering. The result is three clusters representing the condition of school facilities and infrastructure. The clusters were then visualized with an open source web GIS, which shows the clusters spread evenly across every district. Keywords: school facilities and infrastructure, CRISP-DM, K-means, WebGIS
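
The clustering step described in this abstract can be illustrated with a minimal k-means sketch. The two-column damage scores, the starting centroids, and the function name `kmeans` are made up for illustration; the abstract does not state which features or tools the study used.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical (classroom damage %, sanitation damage %) scores per school.
schools = [(5, 10), (8, 6), (10, 12), (45, 50), (50, 42),
           (55, 48), (90, 85), (85, 95), (92, 88)]
# Three starting centroids, one per expected cluster (an assumption).
labels, centroids = kmeans(schools, centroids=[schools[0], schools[3], schools[6]])
```

Each school ends up with a cluster label that could then be joined to its coordinates for display on a map.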

    Backward Elimination for Feature Selection on Breast Cancer Classification Using Logistic Regression and Support Vector Machine Algorithms

    Breast cancer is a prevalent cancer that affects women in every country. One way to help reduce the high fatality rate is a detection system that determines whether a tumor is benign or malignant. Logistic Regression and Support Vector Machine (SVM) classifiers are often used for this task, but they often give suboptimal results on datasets with many features, so Backward Elimination feature selection is added to improve classification performance. The two algorithms were compared by applying feature selection to breast cancer data to find the best model. The dataset has 30 features and two classes, Benign and Malignant. Backward Elimination reduced the 30 features to 13, increasing the performance of both classification models. The best classification was obtained with Backward Elimination and a linear-kernel SVM, which raised accuracy from 96.14% to 97.02%, precision from 98.06% to 99.49%, recall from 90.48% to 92.38%, and AUC from 0.95 to 0.96.
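
As a rough illustration of backward elimination, the sketch below starts from all features and repeatedly drops the one whose removal hurts a score the least. The abstract does not say which elimination criterion the study used, so this greedy, score-driven variant is an assumption, and the feature names and `toy_score` function are hypothetical stand-ins for a real model's cross-validated accuracy.

```python
def backward_elimination(features, score, n_keep):
    """Greedily drop one feature at a time until n_keep remain.

    `score` maps a list of feature names to a number (higher is better);
    in practice it would be something like cross-validated accuracy.
    """
    selected = list(features)
    while len(selected) > n_keep:
        # Try removing each feature and keep the removal that scores best.
        candidates = [[f for f in selected if f != drop] for drop in selected]
        selected = max(candidates, key=score)
    return selected

# Toy stand-in for a model score: only three hypothetical features carry
# signal, and every extra feature costs a small penalty.
USEFUL = {"radius": 0.5, "texture": 0.3, "concavity": 0.2}

def toy_score(subset):
    return sum(USEFUL.get(f, 0.0) for f in subset) - 0.01 * len(subset)

all_features = ["radius", "noise_a", "texture", "noise_b", "concavity", "noise_c"]
kept = backward_elimination(all_features, toy_score, n_keep=3)
```

With a real classifier, `score` would retrain and evaluate the model on each candidate subset, which is why this kind of selection becomes costly as the feature count grows.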

    Comparison of shallow and deep learning methods of ECG signals classification for arrhythmia detection

    This research compares the performance of arrhythmia classification models on ECG signals from the Massachusetts Institute of Technology–Beth Israel Hospital (MIT-BIH) database. The shallow learning methods used were Support Vector Machine, Naïve Bayes, and Random Forest. The deep learning methods were 1D Convolutional Neural Network (1D CNN), Long Short Term Memory (LSTM), and Gated Recurrent Unit (GRU). The models were tested on a dataset of 140 samples grouped into four class labels, each sample having 2160 features. Random Forest and 1D CNN showed the best performance.

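    Intrusion Detection System Berbasis Seleksi Fitur Dengan Kombinasi Filter Information Gain Ratio Dan Correlation (Intrusion Detection System Based on Feature Selection with a Combination of Information Gain Ratio and Correlation Filters)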

    Abstract: An Intrusion Detection System (IDS) is a system developed to monitor and filter network activity by identifying attacks. Because the volume of data an IDS must inspect is very large, and many extraneous features make it hard to detect suspicious behavior patterns, the IDS needs to reduce the data to be processed by reducing the number of features through feature selection. This study combines two feature ranking methods, Information Gain Ratio and Correlation, and classifies the data with the K-Nearest Neighbor algorithm. The rankings from both methods are divided into two groups: the median value is computed for the first group, and the second group is removed. K-Nearest Neighbor classification is then run with 10-fold cross-validation and k=5. The proposed model achieves a highest accuracy of 99.61%, compared with a highest accuracy of 99.59% without feature selection.

    Application of Extreme Learning Machine Method With Particle Swarm Optimization to Classify of Heart Disease

    Coronary heart disease is a blockage of the heart's blood supply. Heart disease is the leading cause of death worldwide. Various risk factors contribute to it, including smoking, an unhealthy lifestyle, high cholesterol, and hypertension. Disease prediction can therefore be used to identify at-risk individuals and help prevent a rise in deaths from heart disease. Data mining, in particular the Extreme Learning Machine (ELM) method, is commonly used for this purpose. ELM is a neural network method that trains quickly and does not require backpropagation, but determining the optimal number of hidden nodes and achieving accurate results remain a challenge. This study proposes ELM with Particle Swarm Optimization (PSO) to optimize heart disease classification, aiming for optimal results with fast learning. The study follows a systematic process of data collection, preprocessing, modeling, and evaluation using confusion matrix analysis. The results and discussion present the effectiveness of the proposed method by evaluating classification accuracy under various parameters, such as population size, number of hidden nodes, and iterations. The findings show that ELM with PSO optimization can provide accurate classification results for heart disease diagnosis, with a promising level of accuracy.

    Gender Classification Based on Electrocardiogram Signals Using Long Short Term Memory and Bidirectional Long Short Term Memory

    Gender classification by computer is essential for applications in many domains, such as human-computer interaction and biometric systems. It is generally done using a face photo, fingerprint, or voice. However, researchers have demonstrated the potential of the electrocardiogram (ECG) for biometric recognition and gender classification. This study classifies gender from ECG signals using Long Short-Term Memory (LSTM) and Bidirectional Long Short-Term Memory (Bi-LSTM), chosen for their ability to handle sequential data such as ECG signals. The inputs to both methods are generally one-dimensional data with a large number of signal features; the dataset used in this study has 10,000 features. The research changes the input shape to determine its effect on classification performance in the LSTM and Bi-LSTM methods, testing each method with 11 different input shapes. The best accuracy obtained is 79.03%, with an input shape of 100×100 in the LSTM method. The best accuracy in the Bi-LSTM method is 74.19%, with an input shape of 250×40. The main contribution of this study is to show the impact of various input shapes on the performance of gender classification based on ECG signals using LSTM and Bi-LSTM. The study also helps in selecting an appropriate method between LSTM and Bi-LSTM for gender classification on ECG signals.
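
To make the idea of changing the input shape concrete, the sketch below splits one flat 10,000-value record into rows, as a recurrent model would consume them one row per time step. Reading the first number of "100×100" and "250×40" as the number of time steps is an assumption, and `reshape_signal` is a hypothetical helper, not something named in the study.

```python
def reshape_signal(signal, timesteps, features):
    """Split a flat signal into `timesteps` rows of `features` values each."""
    if timesteps * features != len(signal):
        raise ValueError("timesteps * features must equal the signal length")
    return [signal[i * features:(i + 1) * features] for i in range(timesteps)]

# Placeholder for one ECG record with 10,000 values.
signal = list(range(10_000))

a = reshape_signal(signal, 100, 100)  # 100 time steps of 100 values each
b = reshape_signal(signal, 250, 40)   # 250 time steps of 40 values each
```

Both shapes carry exactly the same samples; only the length of the sequence and the width of each step change.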

    Effect of SMOTE Variants on Software Defect Prediction Classification Based on Boosting Algorithm

    Detecting software defects early is critical for avoiding significant financial losses. However, building accurate software defect prediction models is challenging because of class imbalance: there is far less data for defective modules than for non-defective ones. This research addresses the problem on the imbalanced NASA MDP dataset. It combines a data-level balancing approach, using 14 variations of the SMOTE algorithm to increase the amount of defective-module data, with an algorithm-level approach in which three boosting algorithms, Catboost, LightGBM, and Gradient Boosting, classify modules as defective or non-defective. The results show that this combination produces more accurate classification than previous studies. The DSMOTE and Gradient Boosting pair has the highest average accuracy (0.9161). The DSMOTE and Catboost pair achieved the highest average AUC (0.9637). The ADASYN kernel and Catboost pair achieved the best average G-mean (0.9154). The research contributes to software defect prediction by developing new techniques and evaluating their effectiveness in addressing class imbalance.
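
G-mean is reported alongside accuracy because accuracy alone can look good on imbalanced data. A minimal sketch, using the usual definition of G-mean as the geometric mean of sensitivity and specificity; the confusion counts below are hypothetical and are not taken from the study.

```python
import math

def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (recall on defective) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)

# Hypothetical data: 100 defective modules and 900 non-defective ones.
# A model that labels everything "non-defective" is 90% accurate but
# never finds a defect, so its G-mean is zero.
always_clean = g_mean(tp=0, fn=100, tn=900, fp=0)

# A model that finds 80% of defects and clears 90% of clean modules.
balanced = g_mean(tp=80, fn=20, tn=810, fp=90)
```

The first model's accuracy of 0.9 hides the fact that it is useless for defect detection, which the G-mean makes visible.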
