We introduce a wearable single-eye emotion recognition device and a real-time
approach to recognizing emotions from partial observations of the face (a
single eye region) that is robust to changes in lighting conditions. At the
heart of our method is a
bio-inspired event-based camera setup and a newly designed lightweight Spiking
Eye Emotion Network (SEEN). Compared to conventional cameras, event-based
cameras offer a higher dynamic range (up to 140 dB vs. 80 dB) and a higher
temporal resolution. Thus, the captured events can encode rich temporal cues
under challenging lighting conditions. However, these events lack texture
information, which makes the temporal information difficult to decode
effectively. SEEN tackles this issue from two different perspectives. First, we adopt
convolutional spiking layers to take advantage of the spiking neural network's
ability to decode pertinent temporal information. Second, SEEN learns to
extract essential spatial cues from corresponding intensity frames and
leverages a novel weight-copy scheme to convey spatial attention to the
convolutional spiking layers during training and inference. We extensively
validate and demonstrate the effectiveness of our approach on a specially
collected Single-eye Event-based Emotion (SEE) dataset. To the best of our
knowledge, our method is the first eye-based emotion recognition method that
leverages event-based cameras and spiking neural networks.
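
The interplay between the two branches described above can be illustrated with a minimal sketch. This is not the paper's SEEN architecture: the single linear layer, the leaky integrate-and-fire (LIF) dynamics, and all parameter values (`beta`, `threshold`, the layer sizes) are simplifying assumptions made purely for illustration. It shows the core idea: a conventional branch learns spatial weights from an intensity frame, and a spiking branch reuses (weight-copies) those same weights while decoding a sequence of event frames with LIF neurons.

```python
import numpy as np

rng = np.random.default_rng(0)

# Spatial weights, nominally learned from intensity frames
# (4 output neurons, 16-dimensional flattened input; sizes are illustrative).
W = rng.normal(size=(4, 16))

def conv_branch(intensity):
    """Conventional branch: plain linear layer + ReLU on an intensity frame."""
    return np.maximum(W @ intensity, 0.0)

def spiking_branch(event_seq, W_copy, beta=0.8, threshold=1.0):
    """Spiking branch with weight-copied spatial weights.

    LIF dynamics per time step: the membrane potential decays by `beta`,
    integrates the weighted event frame, and emits a spike (then resets)
    wherever it crosses `threshold`.
    """
    mem = np.zeros(W_copy.shape[0])
    spikes = []
    for x in event_seq:                    # one event frame per time step
        mem = beta * mem + W_copy @ x      # leaky integration of input current
        s = (mem >= threshold).astype(float)
        mem = mem * (1.0 - s)              # reset membrane where a spike fired
        spikes.append(s)
    return np.stack(spikes)

intensity = rng.random(16)                  # stand-in intensity frame
events = [rng.random(16) * 0.5 for _ in range(5)]  # stand-in event frames

feat = conv_branch(intensity)               # spatial features
out = spiking_branch(events, W)             # weight copy: spiking branch shares W
print(out.shape)                            # spike train: (time steps, neurons)
```

The point of the sketch is the weight sharing: the spiking branch receives no gradient-trained weights of its own here, only a copy of the spatial weights, so the temporal spike train is shaped by spatial cues learned from texture-rich intensity frames.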