2 research outputs found

    Model to automate the dropout prediction process in university students in the first year of study

    No full text
    This research proposes a model for automating the prediction of university student dropout. It addresses an existing problem in the Peruvian educational sector: university student dropout, that is, students who partially or permanently abandon their studies. The aim is to provide a solution that helps reduce the university dropout rate by applying predictive analytics and data mining to detect, in advance, students who are likely to drop out, giving educational institutions greater visibility and more opportunities to act on the problem. A predictive analysis model was designed based on the analysis and definition of 15 prediction variables, 3 phases, and the application of prediction algorithms, grounded in the discipline of Educational Data Mining (EDM) and supported by the IBM SPSS Modeler platform. For validation, 4 prediction algorithms were evaluated in a study at a university in Lima: decision trees, Bayesian networks, linear regression, and neural networks. The results indicate that Bayesian networks perform better than the other algorithms when compared on precision, accuracy, specificity, and error rate. In particular, the precision of Bayesian networks reaches 67.10%, while that of decision trees (the second-best algorithm) is 61.92%, in the training sample for the iteration with an 8:2 ratio. In addition, the variables "sports person" (0.29%), "own home" (0.20%), and "high school grades" (0.15%) contribute the most to the prediction model.
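The abstract compares algorithms on precision, accuracy, specificity, and error rate. As a minimal sketch of how these four metrics are conventionally derived from a binary confusion matrix (the function name and the counts below are hypothetical and are not taken from the study, which used IBM SPSS Modeler):

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fp/tn/fn are true/false positives and negatives, where "positive"
    is assumed here to mean "predicted to drop out".
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total,
        # Share of predicted dropouts that actually dropped out.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of actual non-dropouts correctly identified.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # Share of all predictions that were wrong (1 - accuracy).
        "error_rate": (fp + fn) / total,
    }


# Hypothetical counts for illustration only, not the study's data.
metrics = classification_metrics(tp=30, fp=10, tn=50, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With an 8:2 ratio such as the one mentioned in the abstract, these metrics would typically be computed separately on the 80% training partition and the 20% held-out partition.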

    Predictive model to reduce the dropout rate of university students in Perú: Bayesian Networks vs. Decision Trees

    No full text
    The full text of this work is not available in the Repositorio Académico UPC because of restrictions imposed by the publisher where it was published.
    This research proposes a prediction model that might help reduce the dropout rate of university students in Peru. To this end, a three-phase predictive analysis model was designed and combined with the stages proposed by the IBM SPSS Modeler methodology. Bayesian network techniques were compared with decision trees for their level of accuracy over other algorithms in an Educational Data Mining (EDM) scenario. Data were collected from 500 undergraduate students at a private university in Lima. The results indicate that Bayesian networks perform better than decision trees on the metrics of precision, accuracy, specificity, and error rate. In particular, the accuracy of Bayesian networks reaches 67.10%, while the accuracy of decision trees is 61.92%, in the training sample for the iteration with an 8:2 ratio. In addition, the variables athletic person (0.30%), own house (0.21%), and high school grades (0.13%) contribute the most to the prediction model for both Bayesian networks and decision trees.