Prediction of student success: A smart data-driven approach

Author: Pinto Ana Rosa Almeida
Publication date: 16/12/2022

Predicting students' academic performance is one of the topics of Educational Data Mining, which aims to extract useful information and new patterns from educational data. Understanding the drivers of student success can help educators develop pedagogical methods and gives them a tool for personalized feedback and advice. To improve students' academic performance and create a decision-support solution for higher education institutions, this dissertation proposes a methodology that uses educational data mining to compare prediction models of student success. The data cover master's students at ISCTE, a Portuguese university, for the academic years 2012 to 2022. The study also examines which factors are the strongest predictors of student success. The PyCaret library was used to compare the performance of several algorithms. Factors proposed as influencing success include the student's gender, previous educational background, the existence of a special statute, and the parents' educational level. The analysis showed that the Light Gradient Boosting Machine classifier performed best, with an accuracy of 87.37%, followed by the Gradient Boosting classifier (accuracy = 85.11%) and the Adaptive Boosting classifier (accuracy = 83.37%). Hyperparameter tuning improved the performance of all the algorithms. Feature importance analysis showed that the factors with the greatest impact on student success were the average grade, the duration of the master's, and the gap between degrees, i.e., the number of years between the last degree and the start of the master's.

Repositório Institucional do ISCTE-IUL

Investigating the Performance of Selected Weka Classifiers for Knowledge Discovery in Mining Educational Data

Author: A.B Adetunji
A.Q Ayinde
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 28/05/2015

In the analyzed students' educational data, several parameters, such as True Positive Rate, False Positive Rate, and Classification Error, were used as yardsticks for measuring the performance of the KStar and BayesNet algorithms in mining the educational data. The performance investigation of the applied classifiers revealed hidden knowledge in the data set, which helped in recalibrating the model to yield higher precision for each classifier with minimal classification error. Keywords: Data Mining, Educational Data Mining, Knowledge Discovery, Student, Classifiers, Performance, Investigation

International Institute for Science, Technology and Education (IISTE): E-Journals
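The three yardsticks named in the abstract above can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those definitions in plain Python; the function names and the toy labels are illustrative assumptions and are not taken from the paper, which used Weka.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    tn: int  # predicted negative, actually negative
    fn: int  # predicted negative, actually positive


def count_outcomes(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells for a binary problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def true_positive_rate(c):
    """Share of actual positives that were predicted positive."""
    positives = c.tp + c.fn
    return c.tp / positives if positives else 0.0


def false_positive_rate(c):
    """Share of actual negatives that were predicted positive."""
    negatives = c.fp + c.tn
    return c.fp / negatives if negatives else 0.0


def classification_error(c):
    """Share of all instances that were misclassified."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.fp + c.fn) / total if total else 0.0


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
counts = count_outcomes(y_true, y_pred)
print(true_positive_rate(counts), false_positive_rate(counts), classification_error(counts))
```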

Credit Risk Analysis in Peer to Peer Lending Data set: Lending Club

Author: Bokhari Mohammad Mubasil
Publication venue: Bard Digital Commons
Publication date: 01/01/2019

This project studies the classification variable ‘default’ in the peer-to-peer lending dataset known as Lending Club. The project improved on existing work in terms of accuracy, F1 measure, precision, recall, and root mean squared error. We explored balancing techniques such as oversampling the minority class, undersampling the majority class, and random forests with balanced bootstraps. We also analyzed and proposed new features that improve learner performance

Bard College
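Two of the balancing techniques mentioned in the abstract above, oversampling the minority class and undersampling the majority class, can be sketched with random resampling. This is a hedged illustration only: the helper names, the fixed seed, and the toy data are assumptions, and the project itself may have used different resampling variants.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))  # sampling with replacement
            out_labels.append(label)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for label in counts:
        pool = [r for r, l in zip(rows, labels) if l == label]
        for row in rng.sample(pool, target):  # sampling without replacement
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical imbalanced data: 2 defaults (label 1) among 10 loans.
rows = list(range(10))
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(Counter(random_oversample(rows, labels)[1]))
print(Counter(random_undersample(rows, labels)[1]))
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keep the original class proportions.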

Predictive Model for Taking Decision to Prevent University Dropout

Author: Méndez-Ortega Luis A.
Urbina-Nájera Argelia B.
Publication venue: Universidad Internacional de La Rioja
Publication date: 01/01/2022

Dropout is an educational phenomenon that has been studied for decades because of the diversity of its causes and its effects on society's development. This document presents an experimental study to obtain a predictive model for anticipating university dropout. The study uses 51,497 instances with 26 attributes, drawn from social sciences, administrative sciences, and engineering and collected from 2010 to 2019. Artificial neural networks and decision trees were implemented as classification algorithms; attribute selection algorithms and resampling methods were also used to balance the main class. The results show that the best-performing model was Random Forest, with a Matthews correlation coefficient of 87.43% (against 53.39% for artificial neural networks) and an accuracy of 94.34%. The model made it possible to predict the approximate number of possible dropouts per period, helping those involved to prevent or reduce dropout in higher education

Re-UNIR

DIALNET
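The dropout study above reports a Matthews correlation coefficient alongside accuracy. The sketch below shows the standard binary definition of that coefficient and why it can differ sharply from accuracy on imbalanced data. The confusion-matrix counts are made up for illustration and are not results from the paper; returning 0 when the denominator vanishes is a common convention assumed here.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of all instances classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def matthews_corrcoef(tp, fp, tn, fn):
    """Binary Matthews correlation coefficient, in the range [-1, 1]."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Balanced toy case: both measures tell a similar story.
print(accuracy(3, 1, 3, 1), matthews_corrcoef(3, 1, 3, 1))

# Imbalanced toy case: a model that never predicts "dropout" still scores
# 95% accuracy, while the coefficient shows it has no predictive value.
print(accuracy(0, 0, 95, 5), matthews_corrcoef(0, 0, 95, 5))
```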

Development of a system architecture for the prediction of student success using machine learning techniques

Author: Cardona Tatiana A.
Publication venue: Scholars' Mine
Publication date: 01/01/2020

“The goals of higher education have evolved through time based on the impact that technology development and industry have on productivity. Nowadays, jobs demand increased technical skills, and the supply of prepared personnel to assume those jobs is insufficient. The system of higher education needs to evaluate their practices to realize the potential of cultivating an educated and technically skilled workforce. Currently, completion rates at universities are too low to accomplish the aim of closing the workforce gap. Recent reports indicate that 40 percent of freshmen at four-year public colleges will not graduate, and rates of completion are even lower for community colleges. Some efforts have been made to adjust admission requirements and develop systems of support for different segments of students; however, completion rates are still considered low. Therefore, new strategies need to consider student success as part of the institutional culture based on the information technology support. Also, it is key that the models that evaluate student success can be scalable to other higher education institutions. In recent years machine learning techniques have proven to be effective for such purpose. Then, the primary objective of this research is to develop an integrated system that allows for the application of machine learning for student success prediction. The proposed system was evaluated to determine the accuracy of student success predictions using several machine learning techniques such as decision trees, neural networks, support vector machines, and random forest. The research outcomes offer an important understanding about how to develop a more efficient and responsive system to support students to complete their educational goals”--Abstract, page iv

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Random Forest as a Predictive Analytics Alternative to Regression in Institutional Research

Author: Beemer Joshua
Fan Juanjuan
Levine Richard A.
Lingjun He
Stronach Jeanne
Publication venue: ScholarWorks@UMass Amherst
Publication date: 25/11/2019

In institutional research, modern data mining approaches are seldom considered to address predictive analytics problems. The goal of this paper is to highlight the advantages of tree-based machine learning algorithms over classic (logistic) regression methods for data-informed decision making in higher education problems, and stress the success of random forest in circumstances where the regression assumptions are often violated in big data applications. Random forest is a model averaging procedure where each tree is constructed based on a bootstrap sample of the data set. In particular, we emphasize the ease of application, low computational cost, high predictive accuracy, flexibility, and interpretability of random forest machinery. Our overall recommendation is that institutional researchers look beyond classical regression and single decision tree analytics tools, and consider random forest as the predominant method for prediction tasks. The proposed points of view are detailed and illustrated through a simulation experiment and analyses of data from real institutional research projects

ScholarWorks@UMass Amherst
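The abstract above describes random forest as model averaging over trees built on bootstrap samples. The sketch below illustrates only that bootstrap-and-average idea, under loud simplifying assumptions: each "tree" is a one-feature threshold stump, there is no random feature selection, and the data are made up. It is not a full random forest and not the authors' implementation.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-feature threshold classifier that predicts 1 when x > threshold."""
    best_threshold, best_errors = None, None
    for threshold in sorted(set(xs)):
        errors = sum((x > threshold) != bool(y) for x, y in zip(xs, ys))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def fit_bagged_stumps(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap sample (drawn with replacement, same size as the data)."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        indices = [rng.randrange(n) for _ in range(n)]
        sample_x = [xs[i] for i in indices]
        sample_y = [ys[i] for i in indices]
        thresholds.append(fit_stump(sample_x, sample_y))
    return thresholds


def predict(thresholds, x):
    """Average the stumps by majority vote."""
    votes = sum(x > t for t in thresholds)
    return 1 if votes * 2 > len(thresholds) else 0


# Hypothetical one-dimensional data: larger x tends to mean class 1.
xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_bagged_stumps(xs, ys)
print(predict(model, 0.5), predict(model, 10))
```

Because each stump sees a slightly different resample, the individual thresholds vary, and the vote smooths out that variation.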

Jointly Modeling Heterogeneous Student Behaviors and Interactions Among Multiple Prediction Tasks

Author: Liu Haobing
Tang Feilong
Xu Yanan
Yu Jiadi
Zang Tianzi
Zhu Yanmin
Publication date: 24/03/2021

Prediction tasks about students have practical significance for both students and colleges. Making multiple predictions about students is an important part of a smart campus. For instance, predicting whether a student will fail to graduate can alert the student affairs office to take predictive measures to help the student improve his/her academic performance. With the development of information technology in colleges, we can continuously collect digital footprints that encode heterogeneous behaviors. In this paper, we focus on modeling heterogeneous behaviors and making multiple predictions together, since some prediction tasks are related and learning the model for a specific task may suffer from data sparsity. To this end, we propose a variant of LSTM and a soft-attention mechanism. The proposed LSTM is able to learn the student profile-aware representation from heterogeneous behavior sequences. The proposed soft-attention mechanism can dynamically learn the different degrees of importance of different days for every student. In this way, heterogeneous behaviors can be well modeled. In order to model interactions among multiple prediction tasks, we propose a unit based on a co-attention mechanism. With the help of the stacked units, we can explicitly control the knowledge transfer among multiple tasks. We design three motivating behavior prediction tasks based on a real-world dataset collected from a college. Qualitative and quantitative experiments on the three prediction tasks have demonstrated the effectiveness of our model

arXiv.org e-Print Archive
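The soft-attention idea in the abstract above, assigning a different importance to each day, is commonly realized as a softmax over per-day scores followed by a weighted sum. The sketch below shows that generic mechanism only. In the paper the scores are learned jointly with the LSTM variant; here they are supplied by hand, and all names and numbers are illustrative assumptions.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(day_vectors, scores):
    """Return per-day weights and the weighted sum of the day representations."""
    weights = softmax(scores)
    dim = len(day_vectors[0])
    summary = [
        sum(w * vec[d] for w, vec in zip(weights, day_vectors))
        for d in range(dim)
    ]
    return weights, summary


# Hypothetical two-dimensional behavior representations for three days.
days = [[3.0, 0.0], [0.0, 3.0], [3.0, 3.0]]
weights, summary = attend(days, scores=[0.1, 0.1, 2.0])
print(weights, summary)
```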

A Hybrid Machine Learning Framework for Predicting Students’ Performance in Virtual Learning Environment

Author: Evangelista Edmund
Publication venue: ZU Scholars
Publication date: 21/12/2021

Virtual Learning Environments (VLE), such as Moodle and Blackboard, store vast data to help identify students' performance and engagement. As a result, researchers have been focusing their efforts on assisting educational institutions in providing machine learning models to predict at-risk students and improve their performance. However, it requires an efficient approach to construct a model that can ultimately provide accurate predictions. Consequently, this study proposes a hybrid machine learning framework to predict students' performance using eight classification algorithms and three ensemble methods (Bagging, Boosting, Voting) to determine the best-performing predictive model. In addition, this study used filter-based and wrapper-based feature selection techniques to select the best features of the dataset related to students' performance. The obtained results reveal that the ensemble methods recorded higher predictive accuracy when compared to single classifiers. Furthermore, the accuracy of the models improved due to the feature selection techniques utilized in this study

ZU Scholars (Zayed University)
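Of the three ensemble methods named in the abstract above, voting is the simplest to illustrate: each classifier predicts a label and the most common label wins. The sketch below uses hand-written prediction lists in place of trained classifiers. The data are invented to show how a vote can outperform each individual member and do not reproduce the study's results.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by taking the most common label per sample."""
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples where the prediction matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical outputs of three classifiers on five students (1 = at risk).
y_true = [1, 0, 1, 1, 0]
model_a = [1, 0, 1, 0, 0]
model_b = [1, 1, 1, 1, 0]
model_c = [0, 0, 1, 1, 0]

ensemble = majority_vote([model_a, model_b, model_c])
for name, preds in [("a", model_a), ("b", model_b), ("c", model_c), ("vote", ensemble)]:
    print(name, accuracy(y_true, preds))
```

An odd number of voters is used here so that a binary vote cannot tie. With an even number, a tie-breaking rule would need to be chosen.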