15 research outputs found

    A Comparative Analysis of the Capabilities of Nature-inspired Feature Selection Algorithms in Predicting Student Performance

    Predicting student performance is key to enabling effective pre-failure interventions for at-risk students. In this paper, I analyze the relative performance of a suite of 12 nature-inspired algorithms (NIAs) when used to predict student performance across 3 datasets consisting of instance-based clickstream data, intra-course single-course performance, and performance when taking multiple courses simultaneously. I found that, for all datasets, an ensemble approach using NIAs for feature selection and traditional ML algorithms for classification increased predictive accuracy while also reducing feature set size by two-thirds.

    Forecasting model with machine learning in higher education ICFES exams

    In this paper, we propose several forecasting models for university education built with the K-means, K-nearest neighbor, neural network, and naïve Bayes algorithms, applied to the specific exams in engineering, teaching, and scientific and mathematical thinking of the Saber Pro exam in Colombia. ICFES Saber Pro is an exam that all students in undergraduate higher-education programs must take to obtain their degree. The Colombian government regulated this exam in 2009 through Decree 3963, with the aim of verifying the development of competencies, the level of knowledge, and the quality of programs and institutions. The objective is to convert data into information, search for patterns, select the best variables, and harness the potential of the data (on average 650,000 data points per semester). The study found the following combination of features: women have greater participation (68%) in Mathematics, Engineering, and Teaching careers; the urban area continues to be the preferred place to apply for higher studies (94%); Internet use increased by 50% in the last year; and family support remains relevant to students' education

    Improved Lion Optimization based Enhanced Computation Analysis and Prediction Strategy for Dropout and Placement Performance Using Big Data

    Background: Predicting undergraduates' placement performance is vital because it affects the credibility of educational institutions. Hence, it is important to predict placement performance early in the degree program. Objectives: The study aims to predict undergraduates' placement performance using the introduced ANN-R (Artificial Neural Network based Regression), chosen for its fault tolerance. Efficient prediction requires relevant feature selection, which is performed by the proposed ILO (Improved Lion Optimization) algorithm because of its ability to find a near-optimal solution. Methodology: First, the parameters and population are initialised. Next, the first best agent is identified according to the fitness function. Then the position of the current search agent is updated. This iteration continues until all features are selected and an optimized result is attained; the best score is computed using the proposed ILO for feature selection. Finally, dropout analysis and placement performance of students are predicted using the introduced ANN-R with a train and test split. Results/Conclusion: The performance of the proposed system is analysed using loss metrics. Additionally, an internal comparison is performed to determine how closely the actual and predicted values correlate for the existing and proposed systems. The outcomes show that the proposed system predicts students' placement performance, along with their domain of interest, with fewer errors than the traditional system. This makes the proposed system highly suitable for predicting student performance
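
    The methodology steps in this abstract (initialise a population, pick the best agent by fitness, update agent positions, repeat) follow the general pattern of population-based wrapper feature selection. The sketch below illustrates that generic pattern only; it is not the ILO algorithm, whose update rules the abstract does not give. The fitness function (leave-one-out 1-nearest-neighbour accuracy minus a per-feature penalty), the update rule, and all names and parameter values are assumptions made for illustration.

```python
import random


def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the features where mask is True."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        best_d, best_label = None, None
        for k, xk in enumerate(X):
            if k == i:
                continue
            d = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if best_d is None or d < best_d:
                best_d, best_label = d, y[k]
        correct += int(best_label == y[i])
    return correct / len(X)


def fitness(X, y, mask, penalty=0.01):
    """Hypothetical fitness: accuracy minus a small cost per selected feature."""
    return loo_1nn_accuracy(X, y, mask) - penalty * sum(mask)


def select_features(X, y, n_agents=8, n_iter=20, seed=0):
    """Generic population-based search over binary feature masks."""
    rng = random.Random(seed)
    n_features = len(X[0])
    # Step 1: initialise the population with random masks.
    agents = [[rng.random() < 0.5 for _ in range(n_features)]
              for _ in range(n_agents)]
    # Step 2: the first best agent is the one with the highest fitness.
    best = max(agents, key=lambda m: fitness(X, y, m))
    best_score = fitness(X, y, best)
    for _ in range(n_iter):
        for a, mask in enumerate(agents):
            # Step 3: update the agent's position by copying each bit from
            # the best agent with probability 0.5, then flipping each bit
            # with probability 0.1 to keep exploring.
            new = [b if rng.random() < 0.5 else m for m, b in zip(mask, best)]
            new = [(not bit) if rng.random() < 0.1 else bit for bit in new]
            agents[a] = new
            score = fitness(X, y, new)
            if score > best_score:
                best, best_score = list(new), score
    return best, best_score
```

    Calling select_features(X, y) on a list of numeric feature rows X and labels y returns the best mask found and its fitness; the selected columns would then be passed to the downstream predictor.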

    Early-warning prediction of student performance and engagement in open book assessment by reading behavior analysis

    Digitized learning materials are a core part of modern education, and analysis of their use can offer insight into the learning behavior of high- and low-performing students. The topic of predicting student characteristics has gained a lot of attention in recent years, with applications ranging from affect to performance and at-risk student prediction. In this paper, we examine students' reading behavior in a digital textbook system while taking an open-book test, from the perspective of engagement and performance, to identify the strategies that are used. We create models to predict the performance and engagement of learners before the start of the assessment, and extract reading behavior characteristics employed before and after the start of the assessment in a higher education setting. We found that strategies such as revising and previewing are indicators of how a learner will perform in an open ebook assessment. Low-performing students take advantage of the open ebook policy of the assessment and employ a strategy of searching for information during the assessment. In addition, the prediction of overall engagement has a higher accuracy than the prediction of performance, and therefore could be more appropriate for identifying intervention candidates in an early-warning intervention system

    Early dropout prediction using machine learning in professional online courses

    Despite the advantages of e-learning, this mode of learning is prone to dropout. Previous studies have shown that machine-learning techniques can be applied to records of interactions between students and the platform to predict abandonment. Along these lines, this work seeks predictive dropout models for virtual courses lasting between six and sixteen weeks, using Moodle logs from the first two weeks. The models' sensitivity, specificity and precision were evaluated, but priority was given to the extent to which the models helped avoid attrition through cost-effective retention actions. Specifically, data were used from several cohorts of four courses with different themes and durations, all taught by the Secretariat of Extension of the National Technological University of the Argentine Republic, Regional Buenos Aires, between February 2018 and October 2019. Different algorithms were used to generate predictive models and to optimize them to mitigate the economic losses caused by attrition. We analyzed whether any one algorithm generated the best models for all courses, and whether it was better to build separate models per course or a single model for the entire data set of the four courses. We found that it is possible to build successful predictive models and that the algorithm that produced the best models was a neural network in three of the four courses. Fitting a separate model to each course also gave better results
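
    For reference, the three evaluation measures named in this abstract can be computed from a binary confusion matrix as sketched below. This is a minimal illustration using the standard definitions, not the authors' evaluation code; the function names and the convention that label 1 means "dropped out" are assumptions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem.
    Assumption: `positive` marks the dropout class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluation_rates(tp, fp, tn, fn):
    """Return (sensitivity, specificity, precision); 0.0 when undefined."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # dropouts caught
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # completers cleared
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # flags that were right
    return sensitivity, specificity, precision
```

    Precision matters for the cost framing in the abstract because each false positive is a retention action spent on a student who would not have dropped out.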

    A Data Mining Framework for Improving Student Outcomes on Step 1 of the United States Medical Licensing Examination

    Identifying the factors associated with medical students who fail Step 1 of the United States Medical Licensing Examination (USMLE) has been a focus of investigation for many years. Some researchers believe lower scores on the Medical College Admission Test (MCAT) are the sole factor used to identify failure. Other researchers believe lower course outcomes during the first two years of medical training are better indicators of failure. Yet there are medical students who enter medical school with high MCAT scores and fail Step 1 of the USMLE, and conversely medical students with lower academic credentials who are expected to have difficulty passing Step 1 but pass on the first attempt. Researchers have attempted to find the factors associated with Step 1 outcomes; however, there are two problems with the methods used. The first is the small sample size, due to the high national pass rate of Step 1. The second is that research using multivariate regression models indicates correlates of Step 1 outcomes but does not predict individual student performance. This study used data mining methods to create models which predict medical students at risk of failing Step 1 of the USMLE. Predictor variables include those available to admissions committees at application time, and final grades in courses taken during the preclinical years of medical education. Models were trained, tested, and validated using a stepwise approach, adding predictor variables in the order of courses taken to identify the point during the medical education continuum which best predicts students who will fail Step 1. Oversampling techniques were employed to resolve the problem of small sample sizes. Results of this study suggest at-risk medical students can be identified as early as the end of the first term of the first year. The approach used in this study can serve as a framework which, if implemented at other U.S. allopathic medical schools, can identify students in time for appropriate interventions to impact Step 1 outcomes
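
    The abstract does not say which oversampling technique was used. As one simple possibility, random oversampling duplicates minority-class examples (here, the few students who fail) until the classes are balanced. The sketch below shows that technique only; the function name and data layout are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many examples as the largest class.
    Assumption: applied to the training split only, so that duplicated
    rows never leak into the test or validation data."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels
```

    For example, a training set with 8 "pass" rows and 2 "fail" rows would come back with 8 of each, the extra 6 "fail" rows being copies of the original 2.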