Data Mining Students' performance in a Higher Learning Environment

Abstract

Student performance in higher education has become one of the most widely studied areas. Data plays a pivotal role in modelling and forecasting students' performance, and this is where data mining applications are now becoming widely used. Various factors determine student performance. In this study, eight attributes considered most influential in determining students' performance in the Pacific are used as inputs. Statistical analysis is carried out to find which attribute has the greatest influence on student performance. Different algorithms are used to build the classification models, each based on a different classification technique. The classification techniques used are Artificial Neural Network, Decision Tree, Decision Table, and Naïve Bayes. The dataset of 651 records used in this research is imbalanced and is later transformed into a balanced set through undersampling. The Neural Network is one of the classification techniques that performed well on both the imbalanced and the balanced datasets, achieving the highest prediction accuracy of 96.8%. The analysis further shows that internal assessment has a weak positive relationship with student performance, while demographic data has no significant relationship.
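
The balancing step mentioned above can be illustrated with a minimal sketch of random undersampling. The abstract does not say which undersampling variant was applied, so the random variant shown here is an assumption, and the function name `undersample`, the `label_of` parameter, and the toy records are hypothetical and not taken from the study.

```python
import random
from collections import defaultdict


def undersample(records, label_of, seed=0):
    """Randomly drop records from the larger classes until every class
    has as many records as the smallest class (random undersampling)."""
    rng = random.Random(seed)

    # Group the records by their class label.
    by_class = defaultdict(list)
    for rec in records:
        by_class[label_of(rec)].append(rec)

    # The smallest class sets the size every class is reduced to.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))

    # Shuffle so the classes are not stored in contiguous runs.
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: (internal_assessment_mark, outcome) pairs,
# with 40 "pass" records and 10 "fail" records.
toy = [(m, "pass") for m in range(50, 90)] + [(m, "fail") for m in range(20, 30)]
balanced = undersample(toy, label_of=lambda rec: rec[1])
```

Undersampling discards majority-class records, so the balanced set is smaller than the original; in this toy case 50 records shrink to 20, ten per class.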
