4,414 research outputs found

    Prediction of student success: A smart data-driven approach

    Predicting students' academic performance is a core topic in Educational Data Mining, which aims to extract useful information and new patterns from educational data. Understanding the drivers of student success can help educators develop pedagogical methods and provides a tool for personalized feedback and advice. To improve students' academic performance and create a decision support solution for higher education institutions, this dissertation proposes a methodology that uses educational data mining to compare prediction models of student success. The data cover master's students at ISCTE, a Portuguese university, for the academic years 2012 to 2022. The study also examines which factors are the strongest predictors of student success. The PyCaret library was used to compare the performance of several algorithms. Factors proposed to influence success include, for example, the student's gender, previous educational background, the existence of a special statute, and the parents' educational level. The analysis showed that the Light Gradient Boosting Machine Classifier (LGBMC) performed best, with an accuracy of 87.37%, followed by the Gradient Boosting Classifier (accuracy = 85.11%) and the Adaptive Boosting Classifier (accuracy = 83.37%). Hyperparameter tuning improved the performance of all the algorithms. Feature importance analysis showed that the factors with the greatest impact on student success were the average grade, the duration of the master's, and the gap between degrees, i.e., the number of years between the last degree and the start of the master's.

    Investigating the Performance of Selected Weka Classifiers for Knowledge Discovery in Mining Educational Data

    In the analyzed student educational data, several measures, such as True Positive Rate, False Positive Rate, and Classification Error, were used as yardsticks for measuring the performance of the KStar and BayesNet algorithms in mining the educational data. The performance investigation of the applied classifiers revealed hidden knowledge in the data set, which helped in recalibrating the model to yield higher precision for each classifier with minimal classification error. Keywords: Data Mining, Educational Data Mining, Knowledge Discovery, Student, Classifiers, Performance, Investigation
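
The three measures named in this abstract can be derived from confusion-matrix counts. The following is a minimal sketch, not the authors' Weka workflow; the function name `classification_rates` and the example counts are assumptions made for illustration.

```python
def classification_rates(tp, fp, tn, fn):
    """Return (true positive rate, false positive rate, classification error).

    tp, fp, tn, fn are the confusion-matrix counts for a binary classifier.
    This helper is hypothetical and not part of Weka.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one instance is required")
    # TPR: share of actual positives that were predicted positive.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    # FPR: share of actual negatives that were predicted positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # Classification error: share of all instances that were misclassified.
    error = (fp + fn) / total
    return tpr, fpr, error


# Illustrative counts only; they do not come from the paper.
tpr, fpr, error = classification_rates(tp=40, fp=10, tn=45, fn=5)
```

A classifier is generally preferred when it has a higher true positive rate together with a lower false positive rate and error.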

    Credit Risk Analysis in Peer to Peer Lending Data set: Lending Club

    This project studies the classification variable ‘default’ in the peer-to-peer lending dataset known as Lending Club. The project improved on existing work in terms of accuracy, F1 measure, precision, recall, and root mean squared error. We explored balancing techniques such as oversampling the minority class, undersampling the majority class, and random forests with balanced bootstraps. We also analyzed and proposed new features that improve learner performance
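
Two of the balancing techniques mentioned here, random oversampling of the minority class and random undersampling of the majority class, can be sketched with the standard library alone. This is an illustrative sketch under assumptions: the function name `resample`, its parameters, and the toy data are hypothetical and do not reproduce the project's actual pipeline.

```python
import random
from collections import defaultdict


def resample(rows, labels, strategy="over", seed=0):
    """Balance classes by random over- or undersampling (hypothetical helper).

    strategy="over": duplicate random minority rows up to the largest class size.
    strategy="under": keep a random subset of each class, sized to the smallest class.
    """
    if strategy not in ("over", "under"):
        raise ValueError("strategy must be 'over' or 'under'")
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    sizes = [len(members) for members in by_class.values()]
    target = max(sizes) if strategy == "over" else min(sizes)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if strategy == "over":
            # Sample with replacement to fill the gap up to the target size.
            extra = [rng.choice(members) for _ in range(target - len(members))]
            picked = members + extra
        else:
            # Sample without replacement down to the target size.
            picked = rng.sample(members, target)
        out_rows.extend(picked)
        out_labels.extend([label] * len(picked))
    return out_rows, out_labels


# Toy data: 2 defaults (label 1) against 8 non-defaults (label 0).
rows = list(range(10))
labels = [1, 1] + [0] * 8
over_rows, over_labels = resample(rows, labels, strategy="over")
under_rows, under_labels = resample(rows, labels, strategy="under")
```

Resampling is normally applied to the training split only, so that the evaluation data keep the original class distribution.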

    Predictive Model for Taking Decision to Prevent University Dropout

    Dropout is an educational phenomenon that has been studied for decades because of the diversity of its causes, whose effects fall on society's development. This document presents an experimental study to obtain a predictive model for anticipating university dropout. The study uses 51,497 instances with 26 attributes obtained from social sciences, administrative sciences, and engineering, collected from 2010 to 2019. Artificial neural networks and decision trees were implemented as classification algorithms; attribute selection algorithms and resampling methods were also used to balance the main class. The results show that the best-performing model was Random Forest, with a Matthews correlation coefficient of 87.43% (against 53.39% for artificial neural networks) and an accuracy of 94.34%. The model has made it possible to predict an approximate number of possible dropouts per period, helping the bodies involved to prevent or reduce dropout in higher education
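
The Matthews correlation coefficient reported above can be computed from confusion-matrix counts. The sketch below uses the standard definition; the function name and the example counts are assumptions, and the percentages in the abstract are presumably the coefficient multiplied by 100.

```python
import math


def matthews_corrcoef_from_counts(tp, fp, tn, fn):
    """Matthews correlation coefficient in [-1, 1] from binary confusion counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Illustrative counts only: a model that never predicts the minority class
# reaches 95% accuracy, yet its coefficient is 0.
accuracy = (0 + 95) / 100
mcc = matthews_corrcoef_from_counts(tp=0, fp=0, tn=95, fn=5)
```

Because the coefficient uses all four cells of the confusion matrix, it is often reported alongside accuracy for imbalanced problems such as dropout prediction.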

    Development of a system architecture for the prediction of student success using machine learning techniques

    “The goals of higher education have evolved through time based on the impact that technology development and industry have on productivity. Nowadays, jobs demand increased technical skills, and the supply of prepared personnel to assume those jobs is insufficient. The system of higher education needs to evaluate their practices to realize the potential of cultivating an educated and technically skilled workforce. Currently, completion rates at universities are too low to accomplish the aim of closing the workforce gap. Recent reports indicate that 40 percent of freshman at four-year public colleges will not graduate, and rates of completion are even lower for community colleges. Some efforts have been made to adjust admission requirements and develop systems of support for different segments of students; however, completion rates are still considered low. Therefore, new strategies need to consider student success as part of the institutional culture based on the information technology support. Also, it is key that the models that evaluate student success can be scalable to other higher education institutions. In recent years machine learning techniques have proven to be effective for such purpose. Then, the primary objective of this research is to develop an integrated system that allows for the application of machine learning for student success prediction. The proposed system was evaluated to determine the accuracy of student success predictions using several machine learning techniques such as decision trees, neural networks, support vector machines, and random forest. The research outcomes offer an important understanding about how to develop a more efficient and responsive system to support students to complete their educational goals”--Abstract, page iv

    Jointly Modeling Heterogeneous Student Behaviors and Interactions Among Multiple Prediction Tasks

    Prediction tasks about students have practical significance for both students and colleges. Making multiple predictions about students is an important part of a smart campus. For instance, predicting whether a student will fail to graduate can alert the student affairs office to take predictive measures to help the student improve his/her academic performance. With the development of information technology in colleges, we can collect digital footprints which continuously encode heterogeneous behaviors. In this paper, we focus on modeling heterogeneous behaviors and making multiple predictions together, since some prediction tasks are related and learning the model for a specific task may suffer from the data sparsity problem. To this end, we propose a variant of LSTM and a soft-attention mechanism. The proposed LSTM is able to learn the student profile-aware representation from heterogeneous behavior sequences. The proposed soft-attention mechanism can dynamically learn the different degrees of importance of different days for every student. In this way, heterogeneous behaviors can be well modeled. In order to model interactions among multiple prediction tasks, we propose a unit based on a co-attention mechanism. With the help of the stacked units, we can explicitly control the knowledge transfer among multiple tasks. We design three motivating behavior prediction tasks based on a real-world dataset collected from a college. Qualitative and quantitative experiments on the three prediction tasks have demonstrated the effectiveness of our model

    A Hybrid Machine Learning Framework for Predicting Students’ Performance in Virtual Learning Environment

    Virtual Learning Environments (VLE), such as Moodle and Blackboard, store vast data to help identify students' performance and engagement. As a result, researchers have been focusing their efforts on assisting educational institutions in providing machine learning models to predict at-risk students and improve their performance. However, it requires an efficient approach to construct a model that can ultimately provide accurate predictions. Consequently, this study proposes a hybrid machine learning framework to predict students' performance using eight classification algorithms and three ensemble methods (Bagging, Boosting, Voting) to determine the best-performing predictive model. In addition, this study used filter-based and wrapper-based feature selection techniques to select the best features of the dataset related to students' performance. The obtained results reveal that the ensemble methods recorded higher predictive accuracy when compared to single classifiers. Furthermore, the accuracy of the models improved due to the feature selection techniques utilized in this study
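
Of the three ensemble methods named in this abstract, voting is the simplest to illustrate. The sketch below shows hard (majority) voting over the label predictions of several classifiers. It is an assumption-laden illustration, not the study's framework: the function name `majority_vote` and the toy predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists by hard (majority) voting.

    predictions: a list with one list of predicted labels per classifier,
    all of equal length. Ties go to the label that appears first among
    the classifiers for that instance.
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all prediction lists must have the same length")
    # zip(*predictions) groups the votes cast for each instance.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy predictions from three hypothetical classifiers (1 = at risk, 0 = not).
clf_a = [1, 0, 1, 1]
clf_b = [1, 1, 0, 1]
clf_c = [0, 0, 1, 1]
combined = majority_vote([clf_a, clf_b, clf_c])
```

Using an odd number of classifiers avoids ties in the binary case.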