Neural 3D Morphable Models: Spiral Convolutional Networks for 3D Shape Representation Learning and Generation
Generative models for 3D geometric data arise in many important applications
in 3D computer vision and graphics. In this paper, we focus on 3D deformable
shapes that share a common topological structure, such as human faces and
bodies. Morphable Models and their variants, despite their linear formulation,
have been widely used for shape representation, while most of the recently
proposed nonlinear approaches resort to intermediate representations, such as
3D voxel grids or 2D views. In this work, we introduce a novel graph
convolutional operator, acting directly on the 3D mesh, that explicitly models
the inductive bias of the fixed underlying graph. This is achieved by enforcing
consistent local orderings of the vertices of the graph, through the spiral
operator, thus breaking the permutation invariance property that is adopted by
all the prior work on Graph Neural Networks. Our operator comes by construction
with desirable properties (anisotropic, topology-aware, lightweight,
easy-to-optimise), and by using it as a building block for traditional deep
generative architectures, we demonstrate state-of-the-art results on a variety
of 3D shape datasets compared to the linear Morphable Model and other graph
convolutional operators.
Comment: to appear at ICCV 201
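The ordered-neighbourhood idea in the abstract above can be illustrated with a minimal sketch. It assumes the spiral index lists are already precomputed and have a fixed length; the abstract does not say how the spirals are built, so `spirals`, `weights`, and the function name `spiral_conv` are hypothetical and this is not the authors' implementation.

```python
def spiral_conv(features, spirals, weights, bias):
    """Apply one spiral-style convolution layer (illustrative sketch only).

    features: list of per-vertex feature vectors, each of length C_in
    spirals:  list of per-vertex index lists, all of the same length L,
              assumed to follow one consistent ordering on every vertex
    weights:  C_out rows, each of length L * C_in
    bias:     list of length C_out
    """
    out = []
    for spiral in spirals:
        # Concatenate neighbour features in spiral order. Because position
        # in the spiral selects the weight, the operator is not permutation
        # invariant: reordering neighbours changes the result.
        stacked = [x for idx in spiral for x in features[idx]]
        out.append([
            sum(w * x for w, x in zip(row, stacked)) + b
            for row, b in zip(weights, bias)
        ])
    return out


# Toy 3-vertex example with scalar features and one output channel.
features = [[1.0], [2.0], [3.0]]
spirals = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
print(spiral_conv(features, spirals, [[1.0, 10.0, 100.0]], [0.0]))
# [[321.0], [132.0], [213.0]]
```

Swapping two entries of a spiral changes the output, which is the anisotropy the abstract refers to; an order-agnostic aggregation such as a sum over neighbours would not show this.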
Group-level Emotion Recognition using Transfer Learning from Face Identification
In this paper, we describe the algorithmic approach used for our submissions
to the group-level emotion recognition sub-challenge of the fifth Emotion
Recognition in the Wild (EmotiW 2017) challenge. We extracted feature vectors
from detected faces using a Convolutional Neural Network trained for the face
identification task, rather than relying on traditional pre-training on emotion
recognition problems. In the final pipeline, an ensemble of Random Forest
classifiers was trained to predict the emotion score using the available
training set. When no faces are detected, one member of the ensemble
extracts features from the whole image. In our experimental study, the
proposed approach showed the lowest error rate among the explored
techniques. In particular, we achieved 75.4% accuracy on the validation data,
which is 20% higher than the handcrafted feature-based baseline. The source
code, which uses the Keras framework, is publicly available.
Comment: 5 pages, 3 figures, accepted for publication at ICMI17 (EmotiW Grand
Challenge)
DCTNet : A Simple Learning-free Approach for Face Recognition
PCANet was proposed as a lightweight deep learning network that mainly
leverages Principal Component Analysis (PCA) to learn multistage filter banks,
followed by binarization and block-wise histogramming. PCANet was shown to work
surprisingly well in various image classification tasks. However, PCANet is
data-dependent and hence inflexible. In this paper, we propose a
data-independent network for face recognition, dubbed DCTNet, in which we adopt
the Discrete Cosine Transform (DCT) as the filter bank in place of PCA. This is
motivated by the fact that the 2D DCT basis is a good approximation of the
high-ranked eigenvectors of PCA. Both the 2D DCT and PCA bases resemble
modulated sine-wave patterns, which can be viewed as a bandpass filter bank.
DCTNet is free from learning, as the 2D DCT bases can be computed in advance.
In addition, we propose an effective method to regulate the block-wise
histogram feature vector of DCTNet for robustness. It is shown to provide a
surprising performance boost when the probe image differs considerably in
appearance from the gallery image. We evaluate DCTNet extensively on a number
of benchmark face databases and achieve accuracy on par with, or often better
than, PCANet.
Comment: APSIPA ASC 201
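As a rough illustration of why the filter bank above needs no training, the sketch below builds the orthonormal 2D DCT-II basis for a k×k patch from the closed-form cosine formula. The patch size, the function names, and the use of every (u, v) pair are assumptions made here; the abstract does not state which filters DCTNet keeps or how they are ordered.

```python
import math


def dct_1d_basis(k):
    """Return the k orthonormal 1D DCT-II basis vectors of length k."""
    basis = []
    for u in range(k):
        scale = math.sqrt((1.0 if u == 0 else 2.0) / k)
        basis.append([
            scale * math.cos((2 * x + 1) * u * math.pi / (2 * k))
            for x in range(k)
        ])
    return basis


def dct_2d_filters(k):
    """Return a dict mapping (u, v) to a k x k separable 2D DCT filter."""
    b = dct_1d_basis(k)
    return {
        (u, v): [[b[u][x] * b[v][y] for y in range(k)] for x in range(k)]
        for u in range(k)
        for v in range(k)
    }


def inner(f, g):
    """Inner product of two k x k filters."""
    return sum(a * c for rf, rg in zip(f, g) for a, c in zip(rf, rg))


filters = dct_2d_filters(4)
# Each filter has unit norm and distinct filters are orthogonal,
# so the bank is fixed in advance and independent of any dataset.
print(round(inner(filters[(0, 1)], filters[(0, 1)]), 6))
print(round(abs(inner(filters[(0, 1)], filters[(1, 0)])), 6))
```

In a PCANet-style pipeline these filters would take the place of the learned PCA filters in each convolution stage; the binarization and block-wise histogram steps are not shown here.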
A Review of Deep Convolutional Neural Networks in Mobile Face Recognition
With the emergence of deep learning, Convolutional Neural Network (CNN) models have been proposed to advance the progress of various applications, including face recognition, object detection, pattern recognition, and number plate recognition. The utilization of CNNs in these areas has considerably improved security and surveillance capabilities by providing automated recognition solutions, such as traffic surveillance, access control devices, biometric security systems, and attendance systems. However, there is still room for improvement in this field. This paper discusses several classic CNN models, such as LeNet-5, AlexNet, VGGNet, GoogLeNet, and ResNet, as well as lightweight models for mobile-based applications, such as MobileNet, ShuffleNet, and EfficientNet. Additionally, deep CNN-based face recognition models, such as DeepFace, DeepID, FaceNet, and SphereFace, are explored, along with their architectural characteristics, advantages, disadvantages, and recognition accuracy. The results indicate that many scholars are researching lightweight face recognition, but applying it to mobile devices is impractical due to high computational costs. Furthermore, noisy-label learning is not robust in actual scenarios, and unlabeled face learning is expensive in manual labeling. Finally, this paper concludes with a discussion of the current problems faced by face recognition technology and its potential future directions for development.
Enhanced Emotion Recognition in Videos: A Convolutional Neural Network Strategy for Human Facial Expression Detection and Classification
The human face is essential in conveying emotions, as facial expressions serve as effective, natural, and universal indicators of emotional states. Automated emotion recognition has garnered increasing interest due to its potential applications in various fields, such as human-computer interaction, machine learning, robotic control, and driver emotional state monitoring. With advances in artificial intelligence and computational power, visual emotion recognition has become a prominent research area. Despite extensive research employing machine learning algorithms like convolutional neural networks (CNN), challenges remain concerning input data processing, emotion classification scope, data size, optimal CNN configurations, and performance evaluation. To address these issues, we propose a comprehensive CNN-based model for real-time detection and classification of five primary emotions: anger, happiness, neutrality, sadness, and surprise. We employ the Amsterdam Dynamic Facial Expression Set – Bath Intensity Variations (ADFES-BIV) video dataset, extracting image frames from the video samples. Image processing techniques such as histogram equalization, color conversion, cropping, and resizing are applied to the frames before labeling. The Viola-Jones algorithm is then used for face detection on the processed grayscale images. We develop and train a CNN on the processed image data, implementing dropout, batch normalization, and L2 regularization to reduce overfitting. The ideal hyperparameters are determined through trial and error, and the model's performance is evaluated. The proposed model achieves a recognition accuracy of 99.38%, with the confusion matrix, recall, precision, F1 score, and processing time further quantifying its performance characteristics. The model's generalization performance is assessed using images from the Warsaw Set of Emotional Facial Expression Pictures (WSEFEP) and Extended Cohn-Kanade Database (CK+) datasets. The results demonstrate the efficiency and usability of our proposed approach, contributing valuable insights into real-time visual emotion recognition.
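Of the preprocessing steps listed in the abstract above, histogram equalization is the most algorithmic, so a minimal sketch of the textbook global variant on an 8-bit grayscale frame follows. The abstract gives no parameters or library, so the function name `equalize_histogram`, the plain nested-list image format, and the choice of the global (non-adaptive) variant are all assumptions.

```python
def equalize_histogram(image, levels=256):
    """Globally equalize a grayscale image given as rows of int pixels.

    Each intensity is remapped through the normalized cumulative
    histogram so the output spreads over the full [0, levels - 1] range.
    """
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1

    # Cumulative distribution of pixel intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(flat)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to stretch, return an unchanged copy.
        return [row[:] for row in image]

    # Lookup table; intensities below the darkest pixel are never used.
    lut = [
        round((c - cdf_min) / (total - cdf_min) * (levels - 1)) if c else 0
        for c in cdf
    ]
    return [[lut[p] for p in row] for row in image]


frame = [[50, 50], [100, 150]]
print(equalize_histogram(frame))
```

The darkest input intensity maps to 0 and the brightest to 255, which stretches low-contrast frames before face detection; the remaining steps (color conversion, cropping, resizing) are not covered by this sketch.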