Diagnostic reasoning is a key component of many professions. To improve
students' diagnostic reasoning skills, educational psychologists analyse and give feedback on the epistemic activities these students use while diagnosing, in particular hypothesis generation, evidence generation, evidence evaluation,
and drawing conclusions. However, this manual analysis is highly
time-consuming. We aim to enable the large-scale adoption of diagnostic reasoning analysis and feedback by automating the identification of epistemic activities. We create the first corpus for this task, comprising students' diagnostic reasoning self-explanations from two domains, annotated with epistemic activities. Based on insights from the corpus creation and the task's
characteristics, we discuss three challenges for the automatic identification
of epistemic activities using AI methods: the correct identification of
epistemic activity spans, the reliable distinction of similar epistemic
activities, and the detection of overlapping epistemic activities. We propose a
separate performance metric for each challenge and thus provide an evaluation
framework for future research. Indeed, our evaluation of various
state-of-the-art recurrent neural network architectures reveals that current
techniques fail to address some of these challenges.