Deep learning models for learning analytics have become increasingly popular
over the last few years; however, these approaches are still not widely adopted
in real-world settings, likely due to a lack of trust and transparency. In this
paper, we tackle this issue by implementing explainable AI methods for
black-box neural networks. This work focuses on the context of online and
blended learning and the use case of student success prediction models. We use
a pairwise study design, enabling us to investigate controlled differences
between pairs of courses. Our analyses cover five course pairs, each differing in
one educationally relevant aspect, and two popular instance-based explainable AI
methods (LIME and SHAP). We quantitatively compare the distances between the
explanations across courses and methods. We then validate the explanations of
LIME and SHAP through 26 semi-structured interviews with university-level educators
regarding which features they believe contribute most to student success, which
explanations they trust most, and how they could transform these insights into
actionable course design decisions. Our results show that quantitatively,
explainers significantly disagree with each other about what is important, and
qualitatively, experts themselves do not agree on which explanations are most
trustworthy. All code, extended results, and the interview protocol are
provided at https://github.com/epfl-ml4ed/trusting-explainers.

Accepted as a full paper at LAK 2023: The 13th International Learning Analytics and Knowledge Conference, March 13-17, 2023, Arlington, Texas, US.
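
As a rough illustration of the kind of quantitative comparison described above, the sketch below (not taken from the paper's repository) computes LIME and SHAP attribution vectors for one student instance and measures their disagreement with a cosine distance. It assumes a fitted scikit-learn-style binary classifier `model`, a training feature matrix `X_train`, a list of `feature_names`, and a single instance `x`; the paper's actual feature sets, models, and distance metric may differ.

```python
# Minimal sketch (assumptions noted above), not the authors' implementation.
import numpy as np
from scipy.spatial.distance import cosine
import shap
from lime.lime_tabular import LimeTabularExplainer


def shap_attribution(model, X_train, x):
    # Model-agnostic KernelExplainer; a small background sample summarises
    # the training distribution. The wrapper returns only the positive-class
    # probability so shap_values yields a single attribution vector.
    def predict_pos(data):
        return model.predict_proba(data)[:, 1]

    explainer = shap.KernelExplainer(predict_pos, shap.sample(X_train, 50))
    return np.abs(explainer.shap_values(x))


def lime_attribution(model, X_train, x, feature_names):
    explainer = LimeTabularExplainer(X_train, feature_names=feature_names,
                                     mode="classification")
    exp = explainer.explain_instance(x, model.predict_proba,
                                     num_features=len(feature_names))
    weights = dict(exp.as_map()[1])  # {feature index: local weight}
    return np.abs([weights.get(i, 0.0) for i in range(len(feature_names))])


def explanation_distance(a, b):
    # Normalise each attribution vector so only relative importances matter,
    # then measure disagreement with cosine distance (0 = identical ranking).
    a, b = a / (a.sum() + 1e-12), b / (b.sum() + 1e-12)
    return cosine(a, b)
```

A distance near 0 would indicate that LIME and SHAP rank features similarly for that student; the abstract's quantitative finding is that the two explainers often disagree substantially.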