Many educational institutions use online judges in programming
classes to, among other things, provide faster feedback to
students and reduce the teacher’s workload. There is some evidence
that online judges also help reduce dropout. Nevertheless, dropout
rates in introductory programming classes remain high. The objective
of this work is therefore to develop and validate
a method for predicting student dropout using data from the first two
weeks of study, to allow for early intervention. Instead of the classical
questionnaire-based method, we opted for a non-subjective, data-driven
approach. However, such approaches are known to suffer from a potential
overload of factors, not all of which may be relevant to the prediction task.
Our method reached a promising accuracy of 80%, and we explicitly
extracted the main factors leading to student dropout.