Abstract — This work studies the feasibility of using visual information to automatically measure the engagement level of TV viewers. Previous studies usually rely on expensive and invasive devices (e.g., eye trackers or physiological sensors) in controlled settings. Our work differs in that it uses only an RGB video camera in a naturalistic setting, where viewers move freely and respond naturally and spontaneously. In particular, we recorded 47 people watching a TV program and manually coded the engagement level of each viewer. From each video, we extracted several features characterizing facial and head gestures, and applied several aggregation methods over a short time window to capture the temporal dynamics of engagement. We report classification results using the proposed features and show improved performance over baseline methods that rely mostly on head-pose orientation.

Keywords — engagement; attention; market research; facial expression analysis; face and head features

I. INTRODUCTION

While looking away from the TV can be a good indicator of low engagement (top-left image of Fig. 1), it may also be the case that the viewer is commenting on the program with another person (top-right). Similarly, while looking at the TV can be a good indicator of high engagement (bottom-left), it may also be the case that the viewer is thinking about something else (bottom-right). The behaviors associated with engagement may therefore manifest differently in each situation and need to be interpreted within their specific context.
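The sliding-window aggregation mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the window size, step, feature names, and the choice of mean/std/max statistics are all assumptions for the example.

```python
from statistics import mean, stdev

def aggregate_windows(frames, window=30, step=15):
    """Turn per-frame face/head features into per-window feature vectors.

    frames: list of dicts of per-frame measurements, e.g.
            {"yaw": ..., "pitch": ...} (feature names are illustrative).
    Returns one vector per window, concatenating mean, std, and max of
    each feature over the window, to capture short-term temporal dynamics.
    """
    keys = sorted(frames[0])
    vectors = []
    for start in range(0, len(frames) - window + 1, step):
        chunk = frames[start:start + window]
        vec = []
        for k in keys:
            vals = [f[k] for f in chunk]
            vec += [mean(vals), stdev(vals), max(vals)]
        vectors.append(vec)
    return vectors
```

Each resulting vector could then be fed to an ordinary classifier to predict the coded engagement level for that window.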