Island Loss for Learning Discriminative Features in Facial Expression Recognition
Over the past few years, Convolutional Neural Networks (CNNs) have shown
promise on facial expression recognition. However, the performance degrades
dramatically under real-world settings due to variations introduced by subtle
facial appearance changes, head pose variations, illumination changes, and
occlusions.
In this paper, a novel island loss (IL) is proposed to enhance the discriminative
power of the deeply learned features. Specifically, the IL is designed to
reduce the intra-class variations while enlarging the inter-class differences
simultaneously. Experimental results on four benchmark expression databases
have demonstrated that the CNN with the proposed island loss (IL-CNN)
outperforms the baseline CNN models with either traditional softmax loss or the
center loss and achieves comparable or better performance compared with the
state-of-the-art methods for facial expression recognition.
Comment: 8 pages, 3 figures
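The idea of shrinking intra-class variation while pushing class centers apart can be pictured with a minimal NumPy sketch. The abstract does not state the formula, so this assumes a commonly cited form: a center-loss term plus, for every pair of distinct class centers, their cosine similarity plus one. The function name `island_loss`, the weight `lam`, and the exact form are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def island_loss(features, labels, centers, lam=1.0):
    """Sketch of an island-style loss (assumed form, see lead-in).

    features: (n, d) array of deep features
    labels:   (n,) integer class labels
    centers:  (k, d) array of class centers
    lam:      hypothetical weight on the inter-class term
    """
    features = np.asarray(features, dtype=float)
    centers = np.asarray(centers, dtype=float)
    labels = np.asarray(labels)

    # Intra-class term: half the squared distance of each feature to its class center.
    diffs = features - centers[labels]
    center_term = 0.5 * np.sum(diffs ** 2)

    # Inter-class term: (cosine similarity + 1) over all ordered pairs of distinct centers.
    norms = np.linalg.norm(centers, axis=1, keepdims=True)
    unit = centers / np.maximum(norms, 1e-12)  # guard against zero-length centers
    cos = unit @ unit.T
    off_diag = ~np.eye(centers.shape[0], dtype=bool)
    inter_term = np.sum(cos[off_diag] + 1.0)

    return center_term + lam * inter_term
```

Under this assumed form, the inter-class term is zero only when every pair of centers points in opposite directions and grows as centers align, so minimizing it spreads the class "islands" apart.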
Mode Variational LSTM Robust to Unseen Modes of Variation: Application to Facial Expression Recognition
Spatio-temporal feature encoding is essential for encoding the dynamics in
video sequences. Recurrent neural networks, particularly long short-term memory
(LSTM) units, have been popular as an efficient tool for encoding
spatio-temporal features in sequences. In this work, we investigate the effect
of mode variations on the encoded spatio-temporal features using LSTMs. We show
that the LSTM retains information related to the mode variation in the
sequence, which is irrelevant to the task at hand (e.g., classifying facial
expressions). The LSTM forget mechanism is not robust enough to mode
variations and preserves information that could negatively affect the encoded
spatio-temporal features. We propose the mode variational LSTM to encode
spatio-temporal features robust to unseen modes of variation. The mode
variational LSTM modifies the original LSTM structure by adding an extra
cell state that focuses on encoding the mode variation in the input sequence.
To efficiently regulate what features should be stored in the additional cell
state, additional gating functionality is also introduced. The effectiveness of
the proposed mode variational LSTM is verified using the facial expression
recognition task. Comparative experiments on publicly available datasets
verified that the proposed mode variational LSTM outperforms existing methods.
Moreover, a new dynamic facial expression dataset covering different modes of
variation, such as pose and illumination changes, was
collected to comprehensively evaluate the proposed mode variational LSTM.
Experimental results verified that the proposed mode variational LSTM encodes
spatio-temporal features robust to unseen modes of variation.
Comment: Accepted in AAAI-1
An Occam’s Razor View on Learning Audiovisual Emotion Recognition with Small Training Sets
This paper presents a lightweight and accurate deep neural model for audiovisual emotion recognition. To design this model, the authors followed a philosophy of simplicity, drastically limiting the number of parameters to learn from the target datasets and always choosing the simplest learning methods: i) transfer learning and low-dimensional space embedding reduce the dimensionality of the representations; ii) the visual temporal information is handled by a simple score-per-frame selection process, averaged across time; iii) a simple frame selection mechanism is also proposed to weight the images of a sequence; iv) the fusion of the different modalities is performed at prediction level (late fusion). We also highlight the inherent challenges of the AFEW dataset and the difficulty of model selection with as few as 383 validation sequences. The proposed real-time emotion classifier achieved a state-of-the-art accuracy of 60.64% on the test set of AFEW and ranked 4th at the Emotion in the Wild 2018 challenge.
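The per-frame score averaging and prediction-level (late) fusion described in this abstract can be illustrated with a small NumPy sketch. The abstract does not say how frame or modality weights are chosen, so the function names, the optional weights, and the plain weighted average are hypothetical choices, not the authors' method.

```python
import numpy as np

def video_scores(frame_scores, frame_weights=None):
    """Average per-frame class scores over time.

    frame_scores:  (n_frames, n_classes) array of class scores
    frame_weights: optional (n_frames,) weights (hypothetical frame weighting)
    """
    frame_scores = np.asarray(frame_scores, dtype=float)
    return np.average(frame_scores, axis=0, weights=frame_weights)

def late_fusion(modality_scores, modality_weights=None):
    """Fuse class scores from several modalities at prediction level.

    modality_scores:  list of (n_classes,) score vectors, one per modality
    modality_weights: optional per-modality weights (hypothetical)
    Returns the predicted class index and the fused score vector.
    """
    stacked = np.stack([np.asarray(s, dtype=float) for s in modality_scores])
    fused = np.average(stacked, axis=0, weights=modality_weights)
    return int(np.argmax(fused)), fused

# Usage: two video frames and one audio score vector over two classes.
visual = video_scores([[0.2, 0.8], [0.4, 0.6]])
label, fused = late_fusion([visual, [0.6, 0.4]])
```

Because fusion happens on the score vectors, each modality can be trained and tuned separately, which keeps the number of jointly learned parameters small.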
The Theoretical and Methodological Opportunities Afforded by Guided Play With Young Children
For infants and young children, learning takes place all the time and everywhere. How children learn best both in and out of school has been a long-standing topic of debate in education, cognitive development, and cognitive science. Recently, guided play has been proposed as an integrative approach for thinking about learning as a child-led, adult-assisted playful activity. The interactive and dynamic nature of guided play presents theoretical and methodological challenges and opportunities. Drawing upon research from multiple disciplines, we discuss the integration of cutting-edge computational modeling and data science tools to address some of these challenges, and highlight avenues toward an empirically grounded, computationally precise and ecologically valid framework of guided play in early education