Face analysis using curve edge maps
This paper proposes an automatic, real-time system for face analysis, usable in visual communication applications. In this approach, faces are represented with Curve Edge Maps: collections of polynomial segments, each with an associated convex region. The segments are extracted from edge pixels using an adaptive, incremental, linear-time fitting algorithm based on constructive polynomial fitting. The face analysis system covers face tracking, face recognition and facial feature detection, using Curve Edge Maps driven by histograms of intensities and histograms of relative positions. When applied to different face databases and video sequences, the average face recognition rate is 95.51%, the average facial feature detection rate is 91.92% and the facial feature localisation error is 2.18% of the face size, which is comparable with or better than results in the literature. Moreover, our method has the advantages of simplicity, real-time performance and extensibility to other aspects of face analysis, such as recognition of facial expressions and talking…
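The incremental, linear-time fitting idea can be sketched roughly as follows. This is a simplified degree-1 (straight-line) illustration of segment growing, not the paper's constructive polynomial fitting; the tolerance and all names are hypothetical.

```python
def fit_line(points):
    # Closed-form least-squares line y = a*x + b through the points.
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def max_residual(points, a, b):
    # Worst vertical deviation of any point from the fitted line.
    return max(abs(y - (a * x + b)) for x, y in points)

def incremental_segments(pixels, tol=0.3):
    # Grow a segment pixel by pixel; once the fit error exceeds tol,
    # close the segment and restart at the offending pixel.
    segments, current = [], []
    for p in pixels:
        current.append(p)
        if len(current) >= 3:
            a, b = fit_line(current)
            if max_residual(current, a, b) > tol:
                segments.append(current[:-1])
                current = [p]
    if current:
        segments.append(current)
    return segments
```

In the full method each closed segment would be a polynomial curve carrying its convex region, rather than a straight line.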
Feature fusion for facial landmark detection: A feature descriptors combination approach
Facial landmark detection is a crucial first step in facial analysis for biometrics and numerous other applications. However, it has proved to be a very challenging task due to the numerous sources of variation in 2D and 3D facial data. Although landmark detection based on descriptors of the 2D and 3D appearance of the face has been extensively studied, the fusion of such feature descriptors is a relatively under-studied issue. In this report, a novel generalized framework for combining facial feature descriptors is presented, and several feature fusion schemes are proposed and evaluated. The proposed framework maps each feature into a similarity score and combines the individual similarity scores into a resultant score, which is then used to select the optimal solution for a queried landmark. The evaluation of the proposed fusion schemes for facial landmark detection clearly indicates that a quadratic distance-to-similarity mapping, in conjunction with a root-mean-square rule for similarity fusion, achieves the best performance in accuracy, efficiency, robustness and monotonicity.
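The winning combination, a quadratic distance-to-similarity mapping followed by a root-mean-square fusion rule, can be illustrated with a minimal sketch; the scale parameter and the candidate distances below are invented for illustration.

```python
import math

def quad_similarity(distance, scale=1.0):
    # Quadratic decay: distance 0 maps to similarity 1, large distances to ~0.
    return 1.0 / (1.0 + (distance / scale) ** 2)

def rms_fusion(similarities):
    # Root-mean-square rule over the per-descriptor similarity scores.
    return math.sqrt(sum(s * s for s in similarities) / len(similarities))

# Two candidate landmark positions, each scored by three descriptors
# (values are distances to the descriptor's model; smaller is better):
candidates = {
    "pos_a": [0.2, 0.5, 0.3],
    "pos_b": [1.5, 0.1, 2.0],
}
scores = {name: rms_fusion([quad_similarity(d) for d in dists])
          for name, dists in candidates.items()}
best = max(scores, key=scores.get)  # fused score selects the landmark
```

Note that pos_b wins on one descriptor but loses badly on the other two; the RMS rule still prefers the consistently good pos_a.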
Multimodal Polynomial Fusion for Detecting Driver Distraction
Distracted driving is deadly, claiming 3,477 lives in the U.S. in 2015 alone. Although there has been a considerable amount of research on modeling the distracted behavior of drivers under various conditions, accurate automatic detection using multiple modalities, and especially the contribution of the speech modality to improved accuracy, has received little attention. This paper introduces a new multimodal dataset for distracted driving behavior and discusses automatic distraction detection using features from three modalities: facial expression, speech and car signals. Detailed multimodal feature analysis shows that adding more modalities monotonically increases the predictive accuracy of the model. Finally, a simple and effective multimodal fusion technique using a polynomial fusion layer shows superior distraction detection results compared to the baseline SVM and neural network models.
Comment: INTERSPEECH 201
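A polynomial fusion layer of the kind described can be sketched as second-order (pairwise-product) terms across the three modality feature vectors, followed by a logistic output. This is a minimal stdlib illustration, not the paper's exact layer; the feature dimensions and weights are assumed.

```python
import math

def polynomial_fusion(face, speech, car, weights):
    # Build the polynomial feature expansion: bias, first-order terms,
    # then all cross-modality pairwise products.
    feats = [1.0]
    feats += face + speech + car            # first-order terms
    for a in face:
        for b in speech:
            feats.append(a * b)             # face x speech interactions
    for a in face:
        for c in car:
            feats.append(a * c)             # face x car interactions
    for b in speech:
        for c in car:
            feats.append(b * c)             # speech x car interactions
    z = sum(w * f for w, f in zip(weights, feats))
    return 1.0 / (1.0 + math.exp(-z))       # distraction probability
```

With 2-dimensional face and speech features and a 1-dimensional car feature, the expansion has 14 terms (1 bias + 5 linear + 8 pairwise), so 14 weights would be learned.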
Deep Impression: Audiovisual Deep Residual Networks for Multimodal Apparent Personality Trait Recognition
Here, we develop an audiovisual deep residual network for multimodal apparent personality trait recognition. The network is trained end-to-end to predict the Big Five personality traits of people from their videos. That is, the network does not require any feature engineering or visual analysis such as face detection, face landmark alignment or facial expression recognition. Recently, the network won third place in the ChaLearn First Impressions Challenge with a test accuracy of 0.9109.
Deception/Truthful Prediction Based on Facial Feature and Machine Learning Analysis
Automatic deception detection refers to the investigative practices used to determine whether a person is telling the truth or lying. It has been studied extensively, as it can be useful in many real-life scenarios in health, justice, and security systems. Many psychological studies of deception detection have been reported. Polygraph testing is a currently trending technique for detecting deception, but it requires human intervention and training. In recent times, many machine learning based approaches have been applied to detect deception. Various modalities, such as thermal imaging, brain activity mapping, acoustic analysis, eye tracking, facial micro-expression processing and linguistic analysis, are used to detect deception. Machine learning techniques based on facial feature analysis look like a promising path for automatic deception detection: they work without human intervention and may give better results because they are not affected by race or ethnicity. Moreover, one can run a covert operation to find deceit using facial video recording; a covert operation may capture the real personality of deceptive persons. By combining various facial cues, such as facial emotion, facial micro-expressions, eye blink rate, pupil size, and facial action units, better accuracy in deception detection can be achieved.
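The cue-combination idea in the last sentence amounts to concatenating per-cue features into one vector for a downstream classifier. A minimal sketch, in which every name, dimension and weight is hypothetical:

```python
import math

def fuse_cues(emotion_probs, micro_expr, action_units, blink_rate, pupil_size):
    # Concatenate per-cue features into a single vector for a classifier.
    return (list(emotion_probs) + list(micro_expr) + list(action_units)
            + [blink_rate, pupil_size])

def deception_score(vec, weights, bias=0.0):
    # Toy logistic scorer; in practice the weights would be learned by
    # whatever classifier is trained on the fused vectors.
    z = bias + sum(w * x for w, x in zip(weights, vec))
    return 1.0 / (1.0 + math.exp(-z))

# Example: 3 emotion probabilities, 2 micro-expression scores,
# 3 action-unit intensities, blink rate, pupil size.
vec = fuse_cues([0.7, 0.2, 0.1], [0.0, 1.0], [0.3, 0.0, 0.8], 17.0, 4.2)
```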
Efficient Human Facial Pose Estimation
Pose estimation has become an increasingly important area in computer vision and more specifically in human facial recognition and activity recognition for surveillance applications. Pose estimation is a process by which the roll, pitch, or yaw of a human head is determined. Numerous methods already exist that can determine the angular change of a face; however, these methods vary in accuracy, and their computational requirements tend to be too high for real-time applications. The objective of this thesis is to develop a method for pose estimation which is computationally efficient while still maintaining a reasonable degree of accuracy. In this thesis, a feature-based method is presented to determine the yaw angle of a human facial pose using a combination of artificial neural networks and template matching. The artificial neural networks are used for the feature detection portion of the algorithm, along with skin detection and other image enhancement algorithms. The first head model, referred to as the Frontal Position Model, determines the pose of the face using two eyes and the mouth. The second model, referred to as the Side Position Model, is used when only one eye can be viewed and determines pose based on a single eye, the nose tip, and the mouth. The two models are presented to demonstrate the position change of facial features due to pose and to provide the means to determine the pose as these features change from the frontal position. The effectiveness of this pose estimation method is examined by looking at both the manual and automatic feature detection methods. Analysis is further performed on how errors in feature detection affect the resulting pose determination. The method resulted in the detection of facial pose from +30 to -30 degrees with an average error of 4.28 degrees for the Frontal Position Model and 5.79 degrees for the Side Position Model with correct feature detection.
The Intel(R) Streaming SIMD Extensions (SSE) technology was employed to enhance the performance of floating point operations. The neural networks used in the feature detection process require a large amount of floating point calculations, due to the computation of the image data with weights and biases. With SSE optimization, the algorithm becomes suitable for processing images in a real-time environment. The method is capable of determining features and estimating the pose at a rate of seven frames per second on a 1.8 GHz Pentium 4 computer.
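The template-matching component mentioned above can be illustrated with a minimal normalized cross-correlation (NCC) sketch. This is a 1-D, pure-Python toy; the thesis presumably matches 2-D image patches, and the signals below are invented.

```python
def ncc(patch, template):
    # Normalized cross-correlation between equal-size 1-D patches:
    # +1 for a perfect match, 0 for no linear relationship.
    n = len(template)
    mp = sum(patch) / n
    mt = sum(template) / n
    num = sum((p - mp) * (t - mt) for p, t in zip(patch, template))
    dp = sum((p - mp) ** 2 for p in patch) ** 0.5
    dt = sum((t - mt) ** 2 for t in template) ** 0.5
    return num / (dp * dt) if dp and dt else 0.0

def best_match(row, template):
    # Slide the template along a row of pixels; return the offset
    # with the highest correlation score.
    w = len(template)
    scores = [ncc(row[i:i + w], template) for i in range(len(row) - w + 1)]
    return max(range(len(scores)), key=scores.__getitem__)
```

Because NCC subtracts the mean and divides by the standard deviation, the match is invariant to uniform brightness and contrast changes, which is one reason it is a common choice for locating eye and mouth templates.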
Timing is everything: A spatio-temporal approach to the analysis of facial actions
This thesis presents a fully automatic facial expression analysis system based on the Facial Action Coding System (FACS). FACS is the best known and most commonly used system for describing facial activity in terms of facial muscle actions (i.e., action units, AUs). We will present our research on the analysis of the morphological, spatio-temporal and behavioural aspects of facial expressions. In contrast with most other researchers in the field, who use appearance-based techniques, we use a geometric, feature-based approach. We will argue that this approach is more suitable for analysing the temporal dynamics of facial expressions. Our system is capable of explicitly exploring the temporal aspects of facial expressions in an input colour video in terms of their onset (start), apex (peak) and offset (end).
The fully automatic system presented here detects 20 facial points in the first frame and tracks them throughout the video. From the tracked points we compute geometry-based features which serve as the input to the remainder of our system. The AU activation detection system uses GentleBoost feature selection and a Support Vector Machine (SVM) classifier to find which AUs were present in an expression. Temporal dynamics of active AUs are recognised by a hybrid GentleBoost-SVM-Hidden Markov Model classifier. The system is capable of analysing 23 out of 27 existing AUs with high accuracy.
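The hybrid temporal-phase recognition can be illustrated with a minimal Viterbi decoder over a hypothetical left-to-right model of the four phases (neutral, onset, apex, offset). The per-frame log-likelihoods, which in the thesis would come from GentleBoost-selected geometric features fed to SVMs, are assumed inputs here, and all probabilities are invented.

```python
import math

# Hypothetical left-to-right phase model: neutral -> onset -> apex ->
# offset -> back to neutral, with self-loops.
STATES = ["neutral", "onset", "apex", "offset"]
TRANS = {
    "neutral": {"neutral": math.log(0.8), "onset": math.log(0.2)},
    "onset":   {"onset": math.log(0.7), "apex": math.log(0.3)},
    "apex":    {"apex": math.log(0.7), "offset": math.log(0.3)},
    "offset":  {"offset": math.log(0.7), "neutral": math.log(0.3)},
}

def viterbi(frame_loglik):
    # frame_loglik: one {state: log-likelihood} dict per video frame.
    v = [{"neutral": frame_loglik[0]["neutral"]}]  # sequences start neutral
    back = [{}]
    for ll in frame_loglik[1:]:
        cur, bp = {}, {}
        for s in STATES:
            cands = [(v[-1][p] + TRANS[p].get(s, float("-inf")), p)
                     for p in v[-1]]
            score, prev = max(cands)
            if score > float("-inf"):   # keep only reachable states
                cur[s] = score + ll[s]
                bp[s] = prev
        v.append(cur)
        back.append(bp)
    state = max(v[-1], key=v[-1].get)   # backtrack from best final state
    path = [state]
    for bp in reversed(back[1:]):
        state = bp[state]
        path.append(state)
    return path[::-1]
```

The left-to-right transition structure is what enforces a plausible phase ordering even when a single frame's classifier score is ambiguous.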
The main contributions of the work presented in this thesis are the following. We have created a method for fully automatic AU analysis with state-of-the-art recognition results. We have proposed, for the first time, a method for recognising the four temporal phases of an AU. We have built the largest comprehensive database of facial expressions to date. We also present, for the first time in the literature, two studies on the automatic distinction between posed and spontaneous expressions.