Learning Grimaces by Watching TV
Unlike computer vision systems, which require explicit supervision, humans
can learn facial expressions by observing people in their environment.
In this paper, we look at how similar capabilities could be developed in
machine vision. As a starting point, we consider the problem of relating facial
expressions to objectively measurable events occurring in videos. In
particular, we consider a gameshow in which contestants play to win significant
sums of money. We extract events affecting the game and corresponding facial
expressions objectively and automatically from the videos, obtaining large
quantities of labelled data for our study. We also develop, using benchmarks
such as FER and SFEW 2.0, state-of-the-art deep neural networks for facial
expression recognition, showing that pre-training on face verification data can
be highly beneficial for this task. Then, we extend these models to use facial
expressions to predict events in videos and learn nameable expressions from
them. The dataset and emotion recognition models are available at
http://www.robots.ox.ac.uk/~vgg/data/facevalue
Comment: British Machine Vision Conference (BMVC) 2016
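The finding that pre-training on face verification data helps expression
recognition suggests a standard transfer-learning setup. Below is a minimal
PyTorch sketch of that idea; the backbone interface, the 7-class label set,
and the learning rates are illustrative assumptions, not the authors' exact
configuration.

```python
# Hedged sketch: reuse a CNN pretrained on face verification as the
# feature extractor for facial expression recognition.
import torch
import torch.nn as nn

NUM_EXPRESSIONS = 7  # assumed basic-emotion label set (e.g. FER-style)

class ExpressionNet(nn.Module):
    def __init__(self, face_backbone: nn.Module, feat_dim: int):
        super().__init__()
        self.backbone = face_backbone          # pretrained on verification
        self.classifier = nn.Linear(feat_dim, NUM_EXPRESSIONS)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)               # identity features reused as init
        return self.classifier(feats)

# Common transfer-learning recipe (assumed, not the paper's schedule):
# fine-tune the backbone at a lower learning rate than the new head.
def build_optimizer(model: ExpressionNet) -> torch.optim.Optimizer:
    return torch.optim.SGD([
        {"params": model.backbone.parameters(), "lr": 1e-4},
        {"params": model.classifier.parameters(), "lr": 1e-3},
    ], momentum=0.9)
```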
Emotion Recognition in the Wild using Deep Neural Networks and Bayesian Classifiers
Group emotion recognition in the wild is a challenging problem due to the
unstructured environments in which everyday pictures are taken. Obstacles to
effective classification include occlusions, variable lighting conditions,
and poor image quality. In this work we present a solution based on a novel
combination of deep neural networks and Bayesian classifiers. The neural
network follows a bottom-up approach, analyzing emotions expressed by
isolated faces. The Bayesian classifier estimates a global emotion by
integrating top-down features obtained from a scene descriptor. To validate
the system, we tested the framework on the dataset released for the Emotion
Recognition in the Wild Challenge 2017. Our method achieved an accuracy of
64.68% on the test set, significantly outperforming the 53.62% competition
baseline.
Comment: accepted by the Fifth Emotion Recognition in the Wild (EmotiW)
Challenge 2017
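As a rough illustration of how per-face CNN posteriors (bottom-up) could be
combined with a scene-based Bayesian term (top-down), here is a minimal
naive-Bayes-style sketch. The three-class label set and the
conditional-independence assumption across faces are illustrative choices,
not necessarily the paper's exact probabilistic model.

```python
# Hedged sketch of bottom-up / top-down fusion for group emotion.
import numpy as np

CLASSES = ["negative", "neutral", "positive"]  # assumed group-level labels

def fuse(face_probs: np.ndarray, scene_likelihood: np.ndarray,
         prior: np.ndarray) -> str:
    """face_probs: (n_faces, 3) CNN posteriors per detected face (bottom-up).
    scene_likelihood: (3,) p(scene descriptor | class) from the Bayesian
    classifier (top-down). prior: (3,) class prior."""
    # Work in log space to avoid underflow when multiplying many faces.
    log_post = np.log(prior) + np.log(scene_likelihood)
    # Assume faces are conditionally independent given the group emotion.
    log_post += np.log(face_probs).sum(axis=0)
    return CLASSES[int(np.argmax(log_post))]
```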
EmotiW 2018: Audio-Video, Student Engagement and Group-Level Affect Prediction
This paper details the sixth Emotion Recognition in the Wild (EmotiW)
challenge. EmotiW 2018 is a grand challenge in the ACM International Conference
on Multimodal Interaction 2018, Colorado, USA. The challenge aims to provide
a common platform for researchers in the affective computing community
to benchmark their algorithms on 'in the wild' data. This year EmotiW contains
three sub-challenges: a) Audio-video based emotion recognition; b) Student
engagement prediction; and c) Group-level emotion recognition. The databases,
protocols, and baselines are discussed in detail.
Exploring Emotion Features and Fusion Strategies for Audio-Video Emotion Recognition
Audio-video emotion recognition aims to classify a given video into basic
emotions. In this paper, we describe our approaches in EmotiW 2019, which
mainly explore emotion features and feature-fusion strategies for the audio
and visual modalities. For emotion features, we explore audio features based
on both speech spectrograms and log-Mel spectrograms, and evaluate several
facial features with different CNN models and different emotion-pretraining
strategies. For fusion strategies, we explore intra-modal and cross-modal
fusion methods, such as designing attention mechanisms to highlight
important emotion features and exploring feature concatenation and
factorized bilinear pooling (FBP) for cross-modal feature fusion. With
careful evaluation, we obtain 65.5% on the AFEW validation set and 62.48% on
the test set, ranking third in the challenge.
Comment: Accepted by ACM ICMI'19 (2019 International Conference on Multimodal
Interaction)
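Factorized bilinear pooling, named above as one of the cross-modal fusion
strategies, approximates the full bilinear interaction between an audio and
a visual feature with low-rank factors. The sketch below follows the common
MFB-style formulation with sum-pooling, power normalization, and l2
normalization; all dimensions and the factor count k are assumptions for
illustration, not the authors' settings.

```python
# Hedged sketch of factorized bilinear pooling (FBP) for audio-visual fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FBP(nn.Module):
    def __init__(self, audio_dim: int, visual_dim: int,
                 out_dim: int, k: int = 4):
        super().__init__()
        self.k = k
        # Rank-k factor matrices replacing a full (audio_dim x visual_dim
        # x out_dim) bilinear tensor.
        self.U = nn.Linear(audio_dim, out_dim * k, bias=False)
        self.V = nn.Linear(visual_dim, out_dim * k, bias=False)

    def forward(self, a: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
        joint = self.U(a) * self.V(v)                    # (B, out_dim * k)
        # Sum-pool over the k factors for each output dimension.
        joint = joint.view(-1, joint.size(1) // self.k, self.k).sum(dim=2)
        # Signed square-root (power norm), then l2 normalization.
        joint = torch.sign(joint) * torch.sqrt(joint.abs() + 1e-12)
        return F.normalize(joint, dim=1)                 # (B, out_dim)

# Usage with assumed feature sizes:
# fused = FBP(audio_dim=1582, visual_dim=512, out_dim=256)(audio_feat, vis_feat)
```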