An integral part of seamless human-human communication is engagement, the
process by which two or more participants establish, maintain, and end their
perceived connection. Therefore, to develop successful human-centered
human-machine interaction applications, automatic engagement inference is one
of the tasks required to achieve engaging interactions between humans and
machines, and to make machines attuned to their users, hence enhancing user
satisfaction and technology acceptance. Several factors contribute to
engagement state inference, which include the interaction context and
interactants' behaviours and identity. Indeed, engagement is a multi-faceted
and multi-modal construct that requires high accuracy in the analysis and
interpretation of contextual, verbal and non-verbal cues. Thus, the development
of an automated and intelligent system that accomplishes this task has been
proven to be challenging so far. This paper presents a comprehensive survey on
previous work in engagement inference for human-machine interaction, entailing
interdisciplinary definition, engagement components and factors, publicly
available datasets, ground truth assessment, and most commonly used features
and methods, serving as a guide for the development of future human-machine
interaction interfaces with reliable context-aware engagement inference
capability. An in-depth review across embodied and disembodied interaction
modes, and an emphasis on the interaction context of which engagement
perception modules are integrated sets apart the presented survey from existing
surveys