Machine Learning for the Educational Sciences

Abstract

Machine learning (ML) provides a powerful framework for the analysis of high-dimensional datasets by modelling complex relationships, often encountered in modern data with many variables, cases and potentially non-linear effects. The impact of ML methods on research and practical applications in the educational sciences is still limited, but continuously grows, as larger and more complex datasets become available through massive open online courses (MOOCs) and large-scale investigations. The educational sciences are at a crucial pivot point, because of the anticipated impact ML methods hold for the field. To provide educational researchers with an elaborate introduction to the topic, we provide an instructional summary of the opportunities and challenges of ML for the educational sciences, show how a look at related disciplines can help learning from their experiences, and argue for a philosophical shift in model evaluation. We demonstrate how the overall quality of data analysis in educational research can benefit from these methods and show how ML can play a decisive role in the validation of empirical models. Specifically, we (1) provide an overview of the types of data suitable for ML and (2) give practical advice for the application of ML methods. In each section, we provide analytical examples and reproducible R code. Also, we provide an extensive Appendix on ML-based applications for education. This instructional summary will help educational scientists and practitioners to prepare for the promises and threats that come with the shift towards digitisation and large-scale assessment in education

    Similar works