Augmenting reality via head-mounted displays (HMD-AR) is an emerging technology in
education. The interactivity provided by HMD-AR devices is particularly promising for learning, but
presents a challenge to human activity recognition, especially with children. Recent advances
in the speech and gesture recognition of Microsoft’s HoloLens 2 may address
this issue. In a within-subjects study with 47 elementary school children (2nd to 6th grade),
we examined the usability of the HoloLens 2 using a standardized tutorial on multimodal interaction
in AR. The overall system usability was rated “good”. However, several behavioral metrics indicated
that specific interaction modes differed in their efficiency. These results matter
for the development of learning applications in HMD-AR because they partially deviate from previous
findings. In particular, the reliable recognition of children’s voice commands that we
observed is novel. Furthermore, we found different interaction preferences in HMD-AR
among the children. We also found the use of HMD-AR to have a positive effect on children’s
activity-related achievement emotions. Overall, our findings can serve as a basis for determining
general requirements, possibilities, and limitations of the implementation of educational HMD-AR
environments in elementary school classrooms.