Facial Beauty Prediction and Analysis based on Deep Convolutional Neural Network: A Review
Abstract: Facial attractiveness or facial beauty prediction (FBP) is an active research topic with several potential applications. It remains a challenging problem in the computer vision domain because of the scarcity of public databases related to FBP and the small scale of the databases used in experiments. Moreover, the evaluation of facial beauty is subjective in nature, as people have individual preferences. Deep learning techniques have shown significant ability in analysis and feature representation. Previous studies focused on scattered aspects of facial beauty, with few comparisons between diverse techniques. This article therefore reviews recent research on computer-based prediction and analysis of facial beauty using deep convolutional neural networks (DCNNs). The possible lines of research and challenges identified in this article can help researchers advance the state of the art in future work.
Deep learning based face beauty prediction via dynamic robust losses and ensemble regression
In the last decade, several studies have shown that facial attractiveness can be learned by machines. In this paper, we address Facial Beauty Prediction from static images. The paper contains three main contributions. First, we propose a two-branch architecture (REX-INCEP) based on merging the architecture of two already trained networks to deal with the complicated high-level features associated with the FBP problem. Second, we introduce the use of a dynamic law to control the behaviour of the following robust loss functions during training: ParamSmoothL1, Huber and Tukey. Third, we propose an ensemble regression based on Convolutional Neural Networks (CNNs). In this ensemble, we use both the basic networks and our proposed network (REX-INCEP). The proposed individual CNN regressors are trained with different loss functions, namely MSE, dynamic ParamSmoothL1, dynamic Huber and dynamic Tukey. Our approach is evaluated on the SCUT-FBP5500 database using the two evaluation scenarios provided by the database creators: 60%-40% split and five-fold cross-validation. In both evaluation scenarios, our approach outperforms the state of the art on several metrics. These comparisons highlight the effectiveness of the proposed solutions for FBP. They also show that the proposed dynamic robust losses lead to more flexible and accurate estimators. This work was partially funded by the University of the Basque Country, GUI19/027
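The abstract above describes training regressors with robust losses whose parameters evolve under a dynamic law. As a minimal sketch of the general idea, the following shows a standard Huber loss paired with a hand-written linear schedule for its parameter; the schedule, start/end values, and function names are illustrative assumptions, not the paper's actual dynamic law, and the ParamSmoothL1 and Tukey variants are not reproduced here.

```python
import numpy as np

def huber(residual, delta):
    """Huber loss: quadratic near zero, linear in the tails,
    so large residuals (outliers) are penalised less than under MSE."""
    r = np.abs(residual)
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))

def dynamic_delta(epoch, n_epochs, delta_start=5.0, delta_end=1.0):
    """Illustrative linear schedule: shrink delta over training so the
    loss becomes progressively more robust to outlier beauty scores."""
    t = epoch / max(n_epochs - 1, 1)
    return delta_start + t * (delta_end - delta_start)

residuals = np.array([-0.2, 0.5, 3.0])
for epoch in (0, 9):
    d = dynamic_delta(epoch, 10)
    print(f"epoch {epoch}: delta={d}, loss={huber(residuals, d)}")
```

Early in training the wide quadratic region gives smooth gradients everywhere; as delta shrinks, outliers contribute only linearly.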
CNN based facial aesthetics analysis through dynamic robust losses and ensemble regression
In recent years, estimating the beauty of faces has attracted growing interest in the fields of computer vision and machine learning. This is due to the emergence of face beauty datasets (such as SCUT-FBP, SCUT-FBP5500 and KDEF-PT) and the prevalence of deep learning methods in many tasks. The goal of this work is to leverage the advances in deep learning architectures to provide stable and accurate face beauty estimation from static face images. To this end, our proposed approach has three main contributions. To deal with the complicated high-level features associated with the FBP problem by using more than one pre-trained Convolutional Neural Network (CNN) model, we propose an architecture with two backbones (2B-IncRex). In addition to 2B-IncRex, we introduce a parabolic dynamic law to control the behavior of the robust loss parameters during training. These robust losses are ParamSmoothL1, Huber, and Tukey. As a third contribution, we propose an ensemble regression based on five regressors, namely ResNeXt-50, Inception-v3 and three regressors based on our proposed 2B-IncRex architecture. These models are trained with the following dynamic loss functions, respectively: dynamic ParamSmoothL1, dynamic Tukey, dynamic ParamSmoothL1, dynamic Huber, and dynamic Tukey. To evaluate the performance of our approach, we used two datasets: SCUT-FBP5500 and KDEF-PT. The SCUT-FBP5500 dataset contains two evaluation scenarios provided by the database developers: 60%-40% split and five-fold cross-validation. Our approach outperforms state-of-the-art methods on several metrics in both evaluation scenarios of SCUT-FBP5500. Moreover, experiments on the KDEF-PT dataset demonstrate the efficiency of our approach for estimating facial beauty using transfer learning, despite the presence of facial expressions and limited data. These comparisons highlight the effectiveness of the proposed solutions for FBP. They also show that the proposed dynamic robust losses lead to more flexible and accurate estimators. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature
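Two ingredients of the abstract above can be sketched compactly: a parabolic decay for a robust-loss parameter, and decision-level averaging over an ensemble of regressors. Both functions below are illustrative assumptions about the general technique, with toy stand-ins for the five CNN backbones; the exact parabolic law and ensemble weighting used in the paper are not reproduced.

```python
import numpy as np

def parabolic_schedule(epoch, n_epochs, p_start=4.0, p_end=1.0):
    """Assumed parabolic decay of a robust-loss parameter:
    it falls quickly at first, then flattens toward p_end."""
    t = epoch / max(n_epochs - 1, 1)
    return p_end + (p_start - p_end) * (1.0 - t) ** 2

def ensemble_predict(models, x):
    """Ensemble regression: average the score predicted by each
    individual regressor for the same input."""
    return np.mean([m(x) for m in models], axis=0)

# Toy regressors standing in for ResNeXt-50, Inception-v3 and 2B-IncRex.
models = [lambda x: x + 0.1, lambda x: x - 0.1, lambda x: x]
x = np.array([3.0])
print(ensemble_predict(models, x))  # the individual biases cancel out
```

Averaging at the decision level lets regressors trained under different losses compensate for each other's systematic errors.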
Automatic Prediction of Impressions in Time and across Varying Context: Personality, Attractiveness and Likeability
© 2010-2012 IEEE. In this paper, we propose a novel multimodal framework for automatically predicting the impressions of extroversion, agreeableness, conscientiousness, neuroticism, openness, attractiveness and likeability continuously in time and across varying situational contexts. Unlike existing works, we obtain visual-only and audio-only annotations continuously in time for the same set of subjects, for the first time in the literature, and compare them to their audio-visual annotations. We propose a time-continuous prediction approach that learns the temporal relationships rather than treating each time instant separately. Our experiments show that the best prediction results are obtained when regression models are learned from audio-visual annotations and visual cues, and from audio-visual annotations and visual cues combined with audio cues at the decision level. Continuously generated annotations have the potential to provide insight into better understanding which impressions can be formed and predicted more dynamically, varying with situational context, and which ones appear to be more static and stable over time. This research work was supported by the EPSRC MAPTRAITS Project (Grant Ref: EP/K017500/1) and the EPSRC HARPS Project under its IDEAS Factory Sandpits call on Digital Personhood (Grant Ref: EP/L00416X/1)
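The decision-level combination mentioned in the abstract above can be sketched as a weighted average of per-frame trait predictions from separate modality-specific regressors. The function name, the fixed weight, and the toy scores below are hypothetical illustrations, not the paper's fusion scheme.

```python
import numpy as np

def fuse_decisions(visual_pred, audio_pred, w_visual=0.6):
    """Decision-level fusion: combine per-frame impression scores from a
    visual-cue regressor and an audio-cue regressor by weighted averaging.
    The weight is an assumed value, not taken from the paper."""
    return w_visual * visual_pred + (1.0 - w_visual) * audio_pred

visual = np.array([0.8, 0.6, 0.7])  # e.g. per-frame extroversion scores
audio = np.array([0.4, 0.6, 0.5])
print(fuse_decisions(visual, audio))
```

Fusing at the decision level keeps the two regressors independent, so each modality can be annotated, trained and evaluated on its own first.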
Face Emotion Recognition Based on Machine Learning: A Review
Computers can now detect, understand, and evaluate emotions thanks to recent developments in machine learning and information fusion. Researchers across various sectors are increasingly intrigued by emotion identification, utilizing facial expressions, words, body language, and posture as means of discerning an individual's emotions. Nevertheless, the effectiveness of the first three methods may be limited, as individuals can consciously or unconsciously suppress their true feelings. This article explores various feature extraction techniques, encompassing the development of machine learning classifiers such as k-nearest neighbour, naive Bayes, support vector machine, and random forest, in accordance with the established standard for emotion recognition. The paper has three primary objectives: firstly, to offer a comprehensive overview of affective computing by outlining essential theoretical concepts; secondly, to describe the current state of the art in emotion recognition in detail; and thirdly, to highlight important findings and conclusions from the literature, with an emphasis on important obstacles and possible future paths, especially in the creation of state-of-the-art machine learning algorithms for the identification of emotions.
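Of the classical classifiers listed in the abstract above, k-nearest neighbour is simple enough to sketch without a library. The toy 2-D "features" and the emotion labels below are invented for illustration; a real pipeline would feed in extracted facial-expression features.

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """k-nearest-neighbour classification on feature vectors: the k
    training samples closest to x in Euclidean distance vote on a label."""
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# Toy 2-D feature vectors for two emotion classes (hypothetical data).
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_y = ["neutral", "neutral", "happy", "happy"]
print(knn_predict(train_X, train_y, np.array([0.95, 1.0])))  # -> happy
```

The same interface generalises to the other classifiers the review covers (naive Bayes, SVM, random forest), which is what makes side-by-side comparison on one feature set straightforward.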
Beat-Event Detection in Action Movie Franchises
While important advances were recently made towards temporally localizing and recognizing specific human actions or activities in videos, efficient detection and classification of long video chunks belonging to semantically defined categories such as "pursuit" or "romance" remains challenging. We introduce a new dataset, Action Movie Franchises, consisting of a collection of Hollywood action movie franchises. We define 11 non-exclusive semantic categories, called beat-categories, that are broad enough to cover most of the movie footage. The corresponding beat-events are annotated as groups of video shots, possibly overlapping. We propose an approach for localizing beat-events based on classifying shots into beat-categories and learning the temporal constraints between shots. We show that temporal constraints significantly improve the classification performance. We set up an evaluation protocol for beat-event localization as well as for shot classification, depending on whether movies from the same franchise are present or not in the training data
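One common way to combine per-shot classifier scores with pairwise temporal constraints, as the abstract above describes, is Viterbi-style dynamic programming over the shot sequence. The decoding below is an illustrative sketch of that general technique under assumed score matrices, not the paper's actual model.

```python
import numpy as np

def best_shot_labels(shot_scores, transition):
    """Viterbi-style decoding: choose one beat-category per shot that
    maximises the sum of per-shot classifier scores plus pairwise
    temporal-transition scores between consecutive shots."""
    n_shots, n_cats = shot_scores.shape
    dp = shot_scores[0].copy()            # best score ending in each category
    back = np.zeros((n_shots, n_cats), dtype=int)
    for t in range(1, n_shots):
        cand = dp[:, None] + transition   # score of prev category -> current
        back[t] = np.argmax(cand, axis=0)
        dp = cand[back[t], np.arange(n_cats)] + shot_scores[t]
    labels = [int(np.argmax(dp))]
    for t in range(n_shots - 1, 0, -1):   # backtrack the best path
        labels.append(int(back[t][labels[-1]]))
    return labels[::-1]

# Toy example: 3 shots, 2 categories; the transition matrix rewards staying
# in the same category, smoothing the per-shot decisions.
scores = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
trans = np.array([[0.5, 0.0], [0.0, 0.5]])
print(best_shot_labels(scores, trans))  # -> [0, 1, 1]
```

A diagonal-heavy transition matrix discourages implausible label flips between adjacent shots, which is one way temporal constraints can lift shot-classification accuracy.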