In recent years, the educational data mining and user modeling communities have been aggressively introducing models for predicting student performance on external measures, such as standardized tests, as well as within-tutor performance. While these models have brought statistically reliable improvements to performance prediction, the real-world significance of the differences in errors has been largely unexplored. In this paper, we take a deeper look at what reported errors actually mean in the context of high-stakes test score prediction as well as student mastery prediction. We report how differences in common error and accuracy metrics on prediction tasks translate to impact on students, and we show how standard validation methods can lead to overestimated accuracies in these prediction tasks. Two years of student tutor use and corresponding student state test scores are used for the analysis of test prediction, while a simulation study is conducted to investigate the correspondence between performance prediction error and latent knowledge prediction.