    The problem addressed in this study is that the STMIK Musi Rawas Lubuklinggau campus does not yet have a mobile web-based career center information system that uses the breadth-first search method. Job vacancy information is currently shared only by posting printed notices on campus. This research uses the waterfall system development method, and data were collected through observation, interviews, documentation, and a literature study. The result is a mobile web-based career center information system built with the PHP programming language and a MySQL database. It can be concluded that the system helps STMIK Musi Rawas Lubuklinggau alumni find job vacancy information, and that the university can also use it to publish available job openings.

    Prediction of Student Decisions in Choosing the Type of Bank Using Support Vector Machine (SVM)

    A bank is an intermediary financial institution authorized to take deposits, lend money, and issue promissory notes or banknotes. Nowadays, every adult needs at least one bank account. Bank services range from regular and hajj savings to large-scale loans. Students, who are among a bank's customers, usually use only savings services, to keep pocket money received from their parents, and ordinary transactions such as transfers and payments. Several factors, including the atmosphere, administrative fees, and the accessibility of ATMs and branch offices, influence students' decisions about where to save money. This makes it difficult for a bank to predict which services must be improved to encourage customers, particularly students, to choose it. Prediction is therefore required to ascertain the students' choice of bank. This study employed data mining and the Support Vector Machine (LibSVM) algorithm. The quantity of data affected the outcomes of the SVM classification. In addition, the kernel type, the k-fold value, and the sampling technique also influenced classification accuracy. LibSVM with an RBF kernel, a k-fold of 8, and shuffled sampling classified 200 records with an accuracy of 68.40%.
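
    The evaluation setup named above (a k-fold of 8 with shuffled sampling over 200 records) can be sketched as follows. This is a minimal illustration of the splitting step only, not the authors' pipeline: the function name `shuffled_k_fold` and the seed are hypothetical, and no classifier is trained here.

```python
import random

def shuffled_k_fold(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation
    with shuffled sampling. Hypothetical helper for illustration."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once before splitting
    # Spread samples as evenly as possible across the k folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# 200 records and k = 8, as reported in the abstract.
folds = list(shuffled_k_fold(n_samples=200, k=8))
for fold_number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

    Each record lands in exactly one test fold, so the accuracy averaged over the eight folds uses every record once for testing.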

    Weighting Technique to Address Class Imbalance in Churn Prediction Using XGBoost, LightGBM, and CatBoost

    Churn is the condition in which a person moves from one service to another. Customer churn is a problem that has grown significantly and is a major challenge for many banking companies because it plays an important role in company profits. A way to predict churn behavior in time is therefore needed so that customer retention measures can be applied. However, churn prediction models face class imbalance, which causes classification models to perform poorly. The most commonly used solutions to class imbalance fall into three approaches: data-level, algorithm-level, and ensemble approaches. Each approach runs into problems that are hard to predict when it is used to handle class imbalance. In this study, the researchers experiment with boosting-based ensemble methods to predict customer churn and try to improve their performance on an imbalanced dataset by tuning the scale pos weight parameter. The classification algorithms used are XGBoost (extreme gradient boosting), LightGBM (light gradient boosting machine), and CatBoost. The experiment compares the performance of the three boosting-based algorithms while adjusting the weight parameter three times. In testing, the CatBoost model achieved the highest recall, 0.79, while the lowest recall, 0.47, came from the default CatBoost model. Based on the experimental results, it can be concluded that the models work reasonably well on imbalanced data when given the scale pos weight hyperparameter, which lets the model focus more on the minority class that is hard to detect.
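
    As a rough illustration of the weighting idea, the sketch below computes a starting value for the positive-class weight as the ratio of negative to positive examples, a common heuristic for this kind of parameter, and computes recall from predictions. The abstract does not state which weight values the authors tried, so the ratio, the helper names, and the label counts here are assumptions for illustration only.

```python
def scale_pos_weight(labels):
    """Ratio of negative to positive examples (1 = churn, 0 = no churn).
    A common starting heuristic; not necessarily the authors' values."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0:
        raise ValueError("no positive examples in labels")
    return n_neg / n_pos

def recall(y_true, y_pred):
    """Share of actual positives that the model predicted as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical imbalanced dataset: 20 churners among 100 customers.
labels = [1] * 20 + [0] * 80
weight = scale_pos_weight(labels)
print(weight)  # 4.0
```

    A weight above 1 makes errors on churners cost more during training, which tends to raise recall on the minority class, usually at some cost in precision.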

    Computational Efficiency Analysis of Customer Churn Prediction Using Spark and Caret Random Forest Classifier

    Today's businesses are buying into technological advancement for productivity, profit maximization, and better service delivery. Meanwhile, technology has also brought data arriving at an alarming rate, so businesses need to rethink how these data are handled if they are to retain the ability to turn them into value. Traditional data mining techniques have proved beyond doubt that data can be harnessed and turned into value for business growth. However, the era of large-scale data poses a challenge of computational efficiency for this traditional approach. This paper therefore addresses the issue by studying a big data analytics tool, Spark, alongside a data mining technique, Caret. A telecom churn dataset was used to analyse both the computational and performance metrics of the two approaches using their Random Forest (RF) classifiers. The classifiers were trained with the same training-set partitioning and tuning parameters. The results show that Spark-RF is computationally efficient, with an execution time of 50.25 secs compared with 847.20 secs for Caret-RF. The customer churn rate could be minimized if proper management attention and policy are directed at tenure (ShortTenure), Contract, InternetService, and PaymentMethod, as the variable importance plot and churn rate counts confirm. The classifier accuracy was approximately 80% for both implementations. Keywords: Spark, Caret, Random Forest, Churn, accuracy
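
    The reported execution times can be turned into a speedup ratio with a few lines of arithmetic. The sketch below does that and also shows one plausible way to time a call; the helper names `time_call` and `speedup` are hypothetical and are not taken from the paper.

```python
import time

def time_call(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds

# Execution times reported in the abstract.
caret_rf_seconds = 847.20
spark_rf_seconds = 50.25
ratio = speedup(caret_rf_seconds, spark_rf_seconds)
print(f"Spark-RF is about {ratio:.1f}x faster than Caret-RF")
```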

    Customer Churn Prediction Model Using Artificial Neural Networks (ANN): A Case Study in Banking

    Customer churn has a great impact on the banking industry because it accelerates the loss of revenue and customer loyalty. The focus of this research is to create a model for the banking sector, using Artificial Neural Networks (ANNs), that can predict whether a customer will churn. The prediction is based on the input features and the independent variable of the trained dataset. The hyperparameters are altered during model training using the forward propagation algorithm and cross-validation techniques, which enable the model to perform well with respect to accuracy and precision. The results show that the suggested model predicts customer attrition with an accuracy of 86%. Compared with the outcomes of the logistic regression model, ANN models are more effective at predicting customer churn in the banking industry. The study offers key insights into how to employ machine learning approaches to increase client retention and decrease customer churn. Banks can use this model to spot clients who are at risk of churning and take proactive measures to keep them.

    Customer Churn Prediction in Telecommunication Industry Using Classification and Regression Trees and Artificial Neural Network Algorithms

    Customer churn is a serious and critical problem for large businesses and organizations. Because of its direct impact on revenue, particularly in sectors such as telecommunications and banking, companies are working on ways to identify customers who are likely to churn. It is therefore vital to investigate the factors that influence customer churn in order to devise appropriate measures to reduce it. The major objective of this work is to develop a churn prediction model that helps telecom operators anticipate which clients are most likely to churn. The experimental approach applies machine learning procedures to a telecom churn dataset, using an improved Relief-F feature selection algorithm to pick relevant features from the large dataset. To quantify the model's performance, classification is carried out with CART and ANN; the accuracy results show that ANN has a high predictive capacity of 93.88%, compared with 91.60% for the CART classifier.

    Designing Efficient and Sustainable Predictions of Water Quality Indexes at the Regional Scale Using Machine Learning Algorithms

    Water quality and scarcity are key topics considered by the Sustainable Development Goals (SDGs), institutions, policymakers, and stakeholders, both to guarantee human safety and to protect natural ecosystems. However, conventional approaches to deciding the suitability of water for drinking purposes are often costly because multiple characteristics are required, notably in low-income countries. As a result, building accurate and trustworthy models is necessary to manage available groundwater resources correctly. In this research, we propose to evaluate multiple classification techniques, namely Decision Trees (DT), K-Nearest Neighbors (KNN), Discriminant Analysis (DA), Support Vector Machine (SVM), and Ensemble Trees (ET), to design the best strategy for forecasting a Water Quality Index (WQI). To achieve this goal, an extended dataset of water samples collected in a total of twelve municipalities of the Wilaya of Naâma in Algeria was considered. Of these, 151 samples were used as training samples, and 18 were used to test and confirm the prediction model. The data samples were then classified, based on the WQI, into four states: excellent water quality, good water quality, poor water quality, and very poor or unsafe water. The main results revealed that the SVM classifier obtained the highest forecast accuracy, with 95.4% prediction accuracy when the data are standardized and 88.9% accuracy on the test samples. The results confirmed that machine learning models are powerful tools for forecasting drinking water quality at larger scales, promoting the design of efficient and sustainable water quality control and supporting decision plans.
    This work is part of PRFU project N° E04N01CU450120220001. The authors gratefully acknowledge the support of the General Directorate of Scientific Research and Technological Development (DGRSDT) and of Taif University Researchers Supporting Project TURSP 2020/34, Taif University, Taif, Saudi Arabia.

    Recognition of lawsuits related to unregistered energy consumption using temporalized relationship characteristics engineering and LSTM network

    The large number of lawsuits against energy distribution companies highlights the difficult problem of identifying and resolving service failures in this sector. This work proposes a methodology for identifying new lawsuits in the energy sector based on information about the customer's relationship with the company, as well as identifying correlated factors. The methodology is divided into four steps: (a) data acquisition; (b) feature engineering; (c) feature selection; and (d) classification using LSTM. The method was applied to a database of more than fifty thousand consumers and proved robust at predicting lawsuits over unregistered energy consumption, with an accuracy of 92.89%, a specificity of 94.27%, a sensitivity of 88.79%, and a precision of 83.84%. This demonstrates the feasibility of using LSTM to predict lawsuits over unregistered energy consumption.
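
    The four reported measures are all derived from a binary confusion matrix. The sketch below shows how they relate; the function name and the counts are hypothetical and chosen only to make the arithmetic easy to follow, since the abstract does not give the underlying confusion matrix.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, specificity, sensitivity, and precision from the
    counts of a binary confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "specificity": tn / (tn + fp),   # true negative rate
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "precision": tp / (tp + fp),     # share of predicted positives that are correct
    }

# Hypothetical counts, not the study's data.
metrics = binary_metrics(tp=80, fp=20, tn=180, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

    Reporting specificity and sensitivity alongside accuracy matters when lawsuits are the rarer class, because accuracy alone can look high even if few positive cases are caught.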