10 research outputs found
Analysis of the Use of Particle Swarm Optimization on Naïve Bayes for Classification of Credit Bank Applications
Selecting prospective customers who apply for bank credit is an important task for the marketing department in order to avoid non-performing loans. The website www.kaggle.com provides the South German Credit data as supervised learning data. Data mining techniques make it possible to find hidden patterns in large data sets, one approach being classification modeling. This study compares the classification of South German Credit data using the Naïve Bayes algorithm alone and using Naïve Bayes with particle swarm optimization (PSO). Testing used a confusion matrix to determine the accuracy, precision and recall of each model. The tests show that PSO increases the accuracy and recall of Naïve Bayes but does not increase its precision: accuracy increases by 0.46% and recall by 3.02%.
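As an illustration of how accuracy, precision and recall follow from a confusion matrix, here is a minimal Python sketch. The function names and the toy labels are hypothetical and are not taken from the study, which does not describe its tooling in this abstract.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall); undefined ratios become 0.0."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy labels (illustrative only): 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
accuracy, precision, recall = metrics(*counts)
```

Because precision and recall use different denominators, an optimizer can raise recall (fewer false negatives) while leaving precision unchanged or lower, which is consistent with the pattern reported above.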
Genetic Algorithm Optimization on Naïve Bayes for Airline Customer Satisfaction Classification
Airline companies need to provide satisfactory service quality so that customers do not switch to other airlines. Data mining techniques are one way to determine customer satisfaction. The website www.kaggle.com provides the Airline Passenger Satisfaction data, which consists of 22 attributes, 1 label and 25976 instances and falls into the supervised learning category. Several previous studies report that the Naïve Bayes algorithm can provide better classification performance than other classification algorithms, and that Naïve Bayes can be optimized with a Genetic Algorithm (GA) to obtain better performance. Using GA to optimize Naïve Bayes for classifying Airline Passenger Satisfaction data requires further research to confirm the resulting classification performance. This study compares the Naïve Bayes algorithm for the classification of Airline Passenger Satisfaction with and without GA optimization. The data were validated with split validation, dividing the dataset into 95% training data and 5% testing data. The test results show that GA on Naïve Bayes improves classification performance on the Airline Passenger Satisfaction data in terms of accuracy and recall, with an accuracy of 85.99% and a recall of 87.91%.
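The split validation described above can be sketched in Python as follows. This is a sketch under assumptions: the function name `split_validation`, the shuffling step and the fixed seed are illustrative choices, since the abstract does not say whether or how the data were shuffled.

```python
import random


def split_validation(rows, train_fraction=0.95, seed=42):
    """Shuffle rows reproducibly, then split into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # assumption: random split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in data: 100 row indices instead of the real dataset.
train, test = split_validation(range(100))
```

With a 5% test share the test set is small, so the reported metrics can vary with the particular split; a fixed seed at least makes the split repeatable.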
Grouping of Village Status in West Java Province Using the Manhattan, Euclidean and Chebyshev Methods on the K-Mean Algorithm
The Ministry of Villages, Development of Disadvantaged Areas and Transmigration (Ministry of Village PDTT) is a ministry within the Indonesian Government in charge of village and rural area development, empowerment of rural communities, accelerated development of disadvantaged areas, and transmigration. The Village Potential Data for 2014 (Podes 2014) for West Java Province is issued by the Central Statistics Agency in collaboration with the Ministry of Village PDTT; it is unlabeled (unsupervised) data covering 5319 villages. The Podes 2014 data for West Java Province were compiled according to the level of village development in Indonesia, with the village as the unit of analysis. Based on Regulation of the Minister of Villages, Disadvantaged Areas and Transmigration of the Republic of Indonesia number 2 of 2016 concerning the village development index, villages are classified into 5 statuses, namely Very Disadvantaged Village, Disadvantaged Village, Developing Village, Advanced Village and Independent Village, based on their ability to manage and increase the potential of social, economic and ecological resources. Village status is inseparable from village development supported by government funding. However, village development funds have not been distributed effectively and accurately according to the conditions and potential of each village because clear information about village status is lacking. As a result, information about which villages need more funding and attention from the government is still missing. Data mining can be used to group objects in a data set into classes that share the same criteria (clustering). One algorithm that can be used for clustering is the k-means algorithm, which groups data by calculating the distance from each data point to the closest centroid.
This study compares three distance calculations in the k-means algorithm: Manhattan, Euclidean and Chebyshev. Validation was carried out using execution time and the Davies-Bouldin index. The 2014 Village Potential data for West Java Province were grouped into all 5 village statuses, with the following number of villages per cluster: 694 Very Disadvantaged villages, 567 Disadvantaged villages, 1440 Developing villages, 1557 Advanced villages and 1061 Independent villages. For distance calculation, Chebyshev had the shortest execution time at 1 second, compared with 1.6 seconds for Euclidean and 2.4 seconds for Manhattan. Meanwhile, the Euclidean method had the best Davies-Bouldin index at 0.886, compared with 0.926 for Manhattan and 0.990 for Chebyshev.
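The three distance measures compared above, and the nearest-centroid assignment step of k-means, can be sketched in Python as follows. The function names and the sample points are hypothetical; the study's own implementation is not described in this abstract.

```python
import math


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def nearest_centroid(point, centroids, distance):
    """Index of the centroid closest to point under the given distance."""
    return min(range(len(centroids)),
               key=lambda i: distance(point, centroids[i]))


# Illustrative points only.
p, q = (0, 0), (3, 4)
distances = (manhattan(p, q), euclidean(p, q), chebyshev(p, q))
cluster = nearest_centroid((1, 1), [(0, 0), (5, 5)], euclidean)
```

Chebyshev needs only a running maximum per pair of points, with no squaring or square root. That may contribute to a shorter run time, although the number of iterations until convergence also differs between metrics.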
South German Credit Data Classification Using Random Forest Algorithm to Predict Bank Credit Receipts
Most of a bank's wealth normally comes from providing credit loans, so bank marketers must be able to reduce the risk of non-performing loans. The risk of providing loans can be minimized by studying patterns in existing lending data, and data mining is one technique for doing so. Data mining makes it possible to find hidden information in large data sets by way of classification. The Random Forest (RF) algorithm is a classification algorithm that can be used to deal with imbalanced data. This study discusses the use of the RF algorithm for classification of South German Credit data. The research is needed because no previous research applies the RF algorithm specifically to the South German Credit data. In the tests carried out, the RF algorithm performed best on the South German Credit data with 85% training data and 15% testing data, reaching an accuracy of 78.33%.
Implementation of the Affine Cipher and Vigenere Cipher Algorithms for Login Security in the TB Mita Jepara Inventory System
This study addresses the problem of how to implement the Affine cipher and Vigenere cipher algorithms for login security in the TB Mita Jepara inventory system so that it becomes more secure than when using MD5. The objectives of the study are to design login security for the TB Mita inventory system using Affine cipher and Vigenere cipher encryption, to make login password encryption in the TB Mita inventory system more secure, and to design login password encryption that can be applied to the TB Mita inventory system using PHP. The Affine cipher and the Vigenere cipher are both symmetric algorithms. Encryption and decryption with the Affine cipher require two keys, while encryption and decryption with the Vigenere cipher require one key. Combining the Affine cipher and the Vigenere cipher yields three keys and therefore a stronger scheme. The results indicate that password security using the Affine cipher and Vigenere cipher can be stronger than using the MD5 algorithm.
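The three-key combination described above can be sketched as follows. This is a sketch under assumptions, written in Python rather than the PHP used in the study: it assumes an uppercase A-Z alphabet, applies the Affine cipher first and the Vigenere cipher second (the abstract does not state the order), and all function names are hypothetical. The modular inverse via `pow(a, -1, M)` requires Python 3.8 or later.

```python
from math import gcd

M = 26  # assumption: uppercase A-Z only


def affine_encrypt(text, a, b):
    """E(x) = (a*x + b) mod 26; a must be coprime with 26."""
    if gcd(a, M) != 1:
        raise ValueError("a must be coprime with 26")
    return "".join(chr((a * (ord(c) - 65) + b) % M + 65) for c in text)


def affine_decrypt(text, a, b):
    """D(y) = a_inv * (y - b) mod 26."""
    a_inv = pow(a, -1, M)
    return "".join(chr((a_inv * (ord(c) - 65 - b)) % M + 65) for c in text)


def vigenere(text, key, sign=1):
    """Shift each letter by the matching key letter; sign=-1 decrypts."""
    return "".join(
        chr((ord(c) - 65 + sign * (ord(key[i % len(key)]) - 65)) % M + 65)
        for i, c in enumerate(text)
    )


def combined_encrypt(text, a, b, key):
    """Affine first, then Vigenere (order is an assumption)."""
    return vigenere(affine_encrypt(text, a, b), key)


def combined_decrypt(text, a, b, key):
    """Undo the layers in reverse order."""
    return affine_decrypt(vigenere(text, key, sign=-1), a, b)


# Illustrative keys and password only.
secret = combined_encrypt("PASSWORD", 5, 8, "MITA")
restored = combined_decrypt(secret, 5, 8, "MITA")
```

The two Affine keys are `a` and `b`, and the third key is the Vigenere keyword. Unlike a hash such as MD5, this scheme is reversible by anyone who holds all three keys.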
Comparison of Feature Selection Optimization on Naïve Bayes for Airline Passenger Satisfaction Classification
The quality of an airline's services cannot be measured from the company's point of view; it must be seen from the point of view of customer satisfaction. Data mining techniques make it possible to predict airline customer satisfaction with a classification model. The Naïve Bayes algorithm has demonstrated outstanding classification accuracy, but its independence assumption is currently rarely discussed. Some literature suggests attribute weighting to relax the independence assumption, which can be done using particle swarm optimization (PSO) and genetic algorithm (GA) through feature selection. This study compares PSO and GA optimization on Naïve Bayes for the classification of Airline Passenger Satisfaction data taken from www.kaggle.com. In testing, the best-performing model was Naïve Bayes with PSO optimization, with an accuracy of 86.13%, a precision of 87.90%, a recall of 87.29%, and an AUC of 0.923.
Analysis of Naive Bayes Classification Algorithm Optimization Using Genetic Algorithm and Bagging
The increasing demand for bank credit applications has motivated the banking world to switch to more sophisticated techniques for analyzing the level of credit risk. One such technique is the data mining approach. Data mining provides techniques for finding meaningful information in large amounts of data by way of classification. However, bank marketing data is imbalanced, so classification results are less than optimal. Naïve Bayes is a classification algorithm that can be used for imbalanced data and performs well in terms of classification, but optimization is needed to obtain better results. Optimization techniques for handling imbalanced data have been developed with several approaches; bagging and genetic algorithms can be used to overcome imbalanced data. This study compares the accuracy of the Naïve Bayes algorithm after optimization using bagging and a genetic algorithm. The results show that the combination of bagging and a genetic algorithm improves the performance of Naïve Bayes by 4.57%.
Comparative Analysis of Optimization Algorithms on Random Forest for Bank Marketing Data Classification
The banking world requires marketers to reduce lending risk by keeping their customers from falling into non-performing loans. One way to reduce this risk is to use data mining techniques. Data mining provides a powerful technique for finding meaningful and useful information in large amounts of data by way of classification. The Random Forest (RF) algorithm is a classification algorithm that can be used to handle imbalance problems. However, several references state that an optimization algorithm is needed to improve the classification results of the RF algorithm. The RF algorithm can be optimized using Bagging and Genetic Algorithm (GA). This study classifies Bank Marketing data on loan application acceptance, taken from the www.data.world site. Classification is carried out using the RF algorithm to obtain a predictive model for loan application acceptance with optimal accuracy. The study also compares the use of Bagging and Genetic Algorithm optimization with the RF algorithm. In the tests carried out, the best classification performance on the Bank Marketing data was obtained with the RF algorithm alone, with an accuracy of 88.30%, AUC (+) of 0.500 and AUC (-) of 0.000. Bagging and Genetic Algorithm optimization did not improve the performance of the RF algorithm for classification of Bank Marketing data.
 