Island Loss for Learning Discriminative Features in Facial Expression Recognition

Author: Cai Jie
Khan Ahmed Shehab
Li Zhiyuan
Meng Zibo
O'Reilly James
Tong Yan
Publication date: 23/10/2017

Over the past few years, Convolutional Neural Networks (CNNs) have shown promise on facial expression recognition. However, their performance degrades dramatically under real-world settings due to variations introduced by subtle facial appearance changes, head pose variations, illumination changes, and occlusions. In this paper, a novel island loss (IL) is proposed to enhance the discriminative power of the deeply learned features. Specifically, the IL is designed to reduce the intra-class variations while simultaneously enlarging the inter-class differences. Experimental results on four benchmark expression databases have demonstrated that the CNN with the proposed island loss (IL-CNN) outperforms the baseline CNN models with either traditional softmax loss or the center loss and achieves comparable or better performance compared with the state-of-the-art methods for facial expression recognition.

Comment: 8 pages, 3 figures
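The abstract states the goal of the island loss (smaller intra-class variation, larger inter-class differences) but gives no formula. The following is a minimal sketch under an assumed formulation: a center-loss-style term that pulls features toward their class center, plus a penalty on the pairwise cosine similarity between class centers. The function names and the weight `lam` are hypothetical and are not taken from the listing.

```python
import math


def center_term(features, labels, centers):
    """Half the summed squared distance from each feature to its class center."""
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return 0.5 * total


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def island_term(centers):
    """Sum of (cosine + 1) over ordered pairs of distinct class centers.

    The +1 keeps each pair's contribution non-negative; it is zero only
    when two centers point in opposite directions.
    """
    total = 0.0
    for j, cj in enumerate(centers):
        for k, ck in enumerate(centers):
            if j != k:
                total += cosine(cj, ck) + 1.0
    return total


def island_loss(features, labels, centers, lam=0.5):
    """Assumed combination: intra-class term + lam * inter-class term."""
    return center_term(features, labels, centers) + lam * island_term(centers)
```

In a training setup this quantity would typically be added to a softmax loss with its own weight; that combination is likewise an assumption here and is not described in the abstract.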

arXiv.org e-Print Archive

Crossref

Mode Variational LSTM Robust to Unseen Modes of Variation: Application to Facial Expression Recognition

Author: Baddar Wissam J.
Ro Yong Man
Publication date: 16/11/2018

Spatio-temporal feature encoding is essential for encoding the dynamics in video sequences. Recurrent neural networks, particularly long short-term memory (LSTM) units, have been popular as an efficient tool for encoding spatio-temporal features in sequences. In this work, we investigate the effect of mode variations on the encoded spatio-temporal features using LSTMs. We show that the LSTM retains information related to the mode variation in the sequence, which is irrelevant to the task at hand (e.g. classification of facial expressions). In fact, the LSTM forget mechanism is not robust enough to mode variations and preserves information that could negatively affect the encoded spatio-temporal features. We propose the mode variational LSTM to encode spatio-temporal features robust to unseen modes of variation. The mode variational LSTM modifies the original LSTM structure by adding a cell state that focuses on encoding the mode variation in the input sequence. To efficiently regulate what features should be stored in this additional cell state, additional gating functionality is also introduced. The effectiveness of the proposed mode variational LSTM is verified using the facial expression recognition task. Comparative experiments on publicly available datasets verified that the proposed mode variational LSTM outperforms existing methods. Moreover, a new dynamic facial expression dataset with different modes of variation, such as pose and illumination variations, was collected to comprehensively evaluate the proposed mode variational LSTM. Experimental results verified that the proposed mode variational LSTM encodes spatio-temporal features robust to unseen modes of variation.

Comment: Accepted in AAAI-1

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

An Occam’s Razor View on Learning Audiovisual Emotion Recognition with Small Training Sets

Author: Jurie Frédéric
Kervadec Corentin
Lechervy Alexis
Pateux Stéphane
Vielzeuf Valentin
Publication venue: HAL CCSD
Publication date: 08/08/2018

This paper presents a light-weight and accurate deep neural model for audiovisual emotion recognition. To design this model, the authors followed a philosophy of simplicity, drastically limiting the number of parameters to learn from the target datasets and always choosing the simplest learning methods: i) transfer learning and low-dimensional space embedding reduce the dimensionality of the representations; ii) the visual temporal information is handled by a simple score-per-frame selection process, averaged across time; iii) a simple frame selection mechanism is also proposed to weight the images of a sequence; iv) the fusion of the different modalities is performed at prediction level (late fusion). We also highlight the inherent challenges of the AFEW dataset and the difficulty of model selection with as few as 383 validation sequences. The proposed real-time emotion classifier achieved a state-of-the-art accuracy of 60.64% on the test set of AFEW, and ranked 4th at the Emotion in the Wild 2018 challenge.
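As a rough illustration of points ii) to iv), the sketch below averages per-frame class scores over time with per-frame weights and then fuses modalities at prediction level. The abstract does not specify how the frame weights are computed or how the modalities are weighted, so both are treated here as plain inputs; all function and parameter names are hypothetical.

```python
def weighted_video_score(frame_scores, frame_weights):
    """Weighted average over time of per-frame class scores.

    frame_scores: list of per-frame score lists, one score per class.
    frame_weights: one non-negative weight per frame (assumed given).
    """
    total_weight = sum(frame_weights)
    n_classes = len(frame_scores[0])
    return [
        sum(w * s[c] for s, w in zip(frame_scores, frame_weights)) / total_weight
        for c in range(n_classes)
    ]


def late_fusion(modality_scores, modality_weights=None):
    """Weighted average of class scores coming from different modalities."""
    if modality_weights is None:
        # Assumption: equal trust in every modality when no weights are given.
        modality_weights = [1.0] * len(modality_scores)
    total_weight = sum(modality_weights)
    n_classes = len(modality_scores[0])
    return [
        sum(w * s[c] for s, w in zip(modality_scores, modality_weights)) / total_weight
        for c in range(n_classes)
    ]


def predict(scores):
    """Index of the class with the highest fused score."""
    return max(range(len(scores)), key=lambda c: scores[c])
```

For example, the output of `weighted_video_score` for the visual stream can be passed to `late_fusion` together with a score list from an audio model, and `predict` then returns the index of the winning class.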

HAL - Normandie Université

arXiv.org e-Print Archive

Crossref

The Theoretical and Methodological Opportunities Afforded by Guided Play With Young Children

Author: Elizabeth Bonawitz
Fei Xu
Kathleen H. Corriveau
Kathy Hirsh-Pasek
Patrick Shafto
Roberta M. Golinkoff
Scott C.-H. Yang
Yue Yu
Publication venue: Frontiers Media SA
Publication date: 01/01/2018

For infants and young children, learning takes place all the time and everywhere. How children learn best both in and out of school has been a long-standing topic of debate in education, cognitive development, and cognitive science. Recently, guided play has been proposed as an integrative approach for thinking about learning as a child-led, adult-assisted playful activity. The interactive and dynamic nature of guided play presents theoretical and methodological challenges and opportunities. Drawing upon research from multiple disciplines, we discuss the integration of cutting-edge computational modeling and data science tools to address some of these challenges, and highlight avenues toward an empirically grounded, computationally precise and ecologically valid framework of guided play in early education

Boston University Institutional Repository (OpenBU)

Directory of Open Access Journals

Frontiers - Publisher Connector