Recognizing facial activity is a well-understood (but non-trivial) computer
vision problem. However, reliable solutions require a camera with a good view
of the face, which is often unavailable in wearable settings. Furthermore, in
wearable applications, where systems accompany users throughout their daily
activities, a permanently running camera can be problematic for privacy (and
legal) reasons. This work presents an alternative solution based on the fusion
of wearable inertial sensors, planar pressure sensors, and acoustic
mechanomyography (muscle sounds). The sensors were placed unobtrusively in a
sports cap to monitor facial muscle activities related to facial expressions.
We present our integrated wearable sensor system, describe data fusion and
analysis methods, and evaluate the system in an experiment with thirteen
subjects from different cultural backgrounds (eight countries) and both sexes
(six women and seven men). Using a one-model-per-user scheme and a late
fusion approach, the system yielded an average F1 score of 85.00% when all
sensing modalities were combined. With cross-user validation and a
one-model-for-all-users scheme, an F1 score of 79.00% was obtained across the
thirteen participants. Moreover, a hybrid fusion (cross-user) approach with
six classes yielded an average F1 score of 82.00% for eight users. The
results are competitive with state-of-the-art
non-camera-based solutions for a cross-user study. In addition, the diversity
of our participant group demonstrates the inclusiveness and generalizability of the
approach.
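
For illustration only, the following is a minimal Python sketch of the late-fusion idea mentioned above: one classifier per modality (inertial, pressure, mechanomyography), with per-class probabilities averaged before computing a macro F1 score. The random-forest classifiers, feature dimensions, and synthetic data are assumptions made for this sketch and are not the system or data described in the paper.

    # Late-fusion sketch: train one classifier per modality, average their
    # predicted class probabilities, then score the fused prediction.
    # All features and labels below are synthetic placeholders.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_samples, n_classes = 300, 6
    y = rng.integers(0, n_classes, n_samples)

    # Hypothetical per-modality feature matrices (dimensions are arbitrary).
    modalities = {
        "inertial": rng.normal(size=(n_samples, 24)) + y[:, None] * 0.3,
        "pressure": rng.normal(size=(n_samples, 16)) + y[:, None] * 0.3,
        "mechanomyography": rng.normal(size=(n_samples, 32)) + y[:, None] * 0.3,
    }

    idx_train, idx_test = train_test_split(
        np.arange(n_samples), test_size=0.3, stratify=y, random_state=0)

    # One classifier per modality; probabilities are averaged (late fusion).
    fused_proba = np.zeros((len(idx_test), n_classes))
    for name, X in modalities.items():
        clf = RandomForestClassifier(random_state=0)
        clf.fit(X[idx_train], y[idx_train])
        fused_proba += clf.predict_proba(X[idx_test])
    fused_proba /= len(modalities)

    y_pred = fused_proba.argmax(axis=1)
    print("macro F1:", f1_score(y[idx_test], y_pred, average="macro"))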