Island Loss for Learning Discriminative Features in Facial Expression Recognition
Over the past few years, Convolutional Neural Networks (CNNs) have shown
promise on facial expression recognition. However, the performance degrades
dramatically under real-world settings due to variations introduced by subtle
facial appearance changes, head pose variations, illumination changes, and
occlusions.
In this paper, a novel island loss (IL) is proposed to enhance the discriminative
power of the deeply learned features. Specifically, the IL is designed to
reduce the intra-class variations while enlarging the inter-class differences
simultaneously. Experimental results on four benchmark expression databases
have demonstrated that the CNN with the proposed island loss (IL-CNN)
outperforms the baseline CNN models with either traditional softmax loss or the
center loss and achieves comparable or better performance compared with the
state-of-the-art methods for facial expression recognition.
Comment: 8 pages, 3 figures
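The abstract above does not give the loss formula, so the following is only a minimal sketch of the idea it describes: one term pulls features toward their class centers (reducing intra-class variation) and one term penalizes class centers that point in similar directions (enlarging inter-class differences). It assumes a center-loss term plus a pairwise cosine-similarity penalty between centers. The function names, the weight `lam`, and the plain-list data layout are hypothetical and are not taken from the paper.

```python
import math


def center_loss(features, labels, centers):
    """Half the summed squared distance from each feature to its class center."""
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return 0.5 * total


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def island_loss(features, labels, centers, lam=1.0):
    """Assumed form: center loss + lam * sum over ordered center pairs of (cosine + 1).

    The "+ 1" keeps every pairwise term non-negative; a term is zero only
    when two centers point in exactly opposite directions.
    """
    pairwise = 0.0
    for j, cj in enumerate(centers):
        for k, ck in enumerate(centers):
            if j != k:
                pairwise += cosine(cj, ck) + 1.0
    return center_loss(features, labels, centers) + lam * pairwise
```

Under these assumptions, two classes whose centers point in opposite directions and whose features sit on those centers incur no loss, while two classes that share the same center direction are penalized even if every feature matches its center.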
Multi-Conditional Latent Variable Model for Joint Facial Action Unit Detection
We propose a novel multi-conditional latent variable model for simultaneous facial feature fusion and detection of facial action units. In our approach, we exploit the structure-discovery capabilities of generative models such as Gaussian processes, and the discriminative power of classifiers such as the logistic function. This leads to superior performance compared to existing classifiers for the target task that exploit either the discriminative or generative property, but not both. The model learning is performed via an efficient, newly proposed Bayesian learning strategy based on Monte Carlo sampling. Consequently, the learned model is robust to data overfitting, regardless of the number of both input features and jointly estimated facial action units. Extensive qualitative and quantitative experimental evaluations are performed on three publicly available datasets (CK+, Shoulder-pain and DISFA). We show that the proposed model outperforms the state-of-the-art methods for the target task on (i) feature fusion, and (ii) multiple facial action unit detection.
Robust correlated and individual component analysis
Recovering correlated and individual components of two, possibly temporally misaligned, sets of data is a fundamental task in disciplines such as image, vision, and behavior computing, with application to problems such as multi-modal fusion (via correlated components), predictive analysis, and clustering (via the individual ones). Here, we study the extraction of correlated and individual components under real-world conditions, namely i) the presence of gross non-Gaussian noise and ii) temporally misaligned data. In this light, we propose a method for the Robust Correlated and Individual Component Analysis (RCICA) of two sets of data in the presence of gross, sparse errors. We furthermore extend RCICA in order to handle temporal incongruities arising in the data. To this end, two suitable optimization problems are solved. The generality of the proposed methods is demonstrated by applying them to four applications, namely i) heterogeneous face recognition, ii) multi-modal feature fusion for human behavior analysis (i.e., audio-visual prediction of interest and conflict), iii) face clustering, and iv) the temporal alignment of facial expressions. Experimental results on 2 synthetic and 7 real-world datasets indicate the robustness and effectiveness of the proposed methods on these application domains, outperforming other state-of-the-art methods in the field.
Automatic analysis of facial actions: a survey
As one of the most comprehensive and objective ways to describe facial expressions, the Facial Action Coding System (FACS) has recently received significant attention. Over the past 30 years, extensive research has been conducted by psychologists and neuroscientists on various aspects of facial expression analysis using FACS. Automating FACS coding would make this research faster and more widely applicable, opening up new avenues to understanding how we communicate through facial expressions. Such an automated process can also potentially increase the reliability, precision and temporal resolution of coding. This paper provides a comprehensive survey of research into machine analysis of facial actions. We systematically review all components of such systems: pre-processing, feature extraction and machine coding of facial actions. In addition, the existing FACS-coded facial expression databases are summarised. Finally, challenges that have to be addressed to make automatic facial action analysis applicable in real-life situations are extensively discussed. There are two underlying motivations for us to write this survey paper: the first is to provide an up-to-date review of the existing literature, and the second is to offer some insights into the future of machine recognition of facial actions: what are the challenges and opportunities that researchers in the field face
Facial Expression Analysis via Transfer Learning
Automated analysis of facial expressions has remained an interesting and challenging research topic in the field of computer vision and pattern recognition due to vast applications such as human-machine interface design, social robotics, and developmental psychology. This dissertation focuses on developing and applying transfer learning algorithms - multiple kernel learning (MKL) and multi-task learning (MTL) - to resolve the problems of facial feature fusion and the exploitation of relations among multiple facial action units (AUs) in designing robust facial expression recognition systems. MKL algorithms are employed to fuse multiple facial features with different kernel functions and tackle the domain adaptation problem at the kernel level within support vector machines (SVM). The lp-norm is adopted to enforce both sparse and nonsparse kernel combination in our methods. We further develop and apply MTL algorithms for simultaneous detection of multiple related AUs by exploiting their inter-relationships. Three variants of task structure models are designed and investigated to obtain a fine depiction of AU relations. lp-norm MTMKL and TD-MTMKL (Task-Dependent MTMKL) are group-sensitive MTL methods that model the co-occurrence relations among AUs. On the other hand, our proposed hierarchical multi-task structural learning (HMTSL) includes a latent layer to learn a hierarchical structure to exploit all possible AU interrelations for AU detection. Extensive experiments on public face databases show that our proposed transfer learning methods have produced encouraging results compared to several state-of-the-art methods for facial expression recognition and AU detection.