5,617 research outputs found
Predictive biometrics: A review and analysis of predicting personal characteristics from biometric data
Interest in the exploitation of soft biometrics information has continued to develop over the last decade or so. In comparison with traditional biometrics, which focuses principally on person identification, the idea of soft biometrics processing is to study the utilisation of more general information regarding a system user, which is not necessarily unique. There are increasing indications that this type of data will have great value in providing complementary information for user authentication. However, the authors have also seen a growing interest in broadening the predictive capabilities of biometric data, encompassing both easily definable characteristics such as subject age and, most recently, 'higher level' characteristics such as emotional or mental states. This study will present a selective review of the predictive capabilities, in the widest sense, of biometric data processing, providing an analysis of the key issues that still need to be adequately addressed if this concept of predictive biometrics is to be fully exploited in the future.
Cultural dialects of real and synthetic emotional facial expressions
In this article we discuss the aspects of designing facial expressions for virtual humans (VHs) with a specific culture. First we explore the notion of culture and its relevance for applications with a VH. Then we give a general scheme of designing emotional facial expressions, and identify the stages where a human is involved, either as a real person with some specific role, or as a VH displaying facial expressions. We discuss how the display and the emotional meaning of facial expressions may be measured in objective ways, and how the culture of the displayers and the judges may influence the process of analyzing human facial expressions and evaluating synthesized ones. We review psychological experiments on cross-cultural perception of emotional facial expressions. By identifying the culturally critical issues of data collection and interpretation with both real humans and VHs, we aim at providing a methodological reference and inspiration for further research.
Deep Sketch-Photo Face Recognition Assisted by Facial Attributes
In this paper, we present a deep coupled framework to address the problem of
matching a sketch image against a gallery of mugshots. Face sketches have the
essential information about the spatial topology and geometric details of
faces while missing some important facial attributes such as ethnicity, hair,
eye, and skin color. We propose a coupled deep neural network architecture
which utilizes facial attributes in order to improve the sketch-photo
recognition performance. The proposed Attribute-Assisted Deep Convolutional
Neural Network (AADCNN) method exploits the facial attributes and leverages the
loss functions from the facial attribute identification and face verification
tasks in order to learn rich discriminative features in a common embedding
subspace. The facial attribute identification task increases the inter-personal
variations by pushing apart the embedded features extracted from individuals
with different facial attributes, while the verification task reduces the
intra-personal variations by pulling together all the features that are
related to one person. The learned discriminative features can be well
generalized to new identities not seen in the training data. The proposed
architecture is able to make full use of the sketch and complementary facial
attribute information to train a deep model compared to the conventional
sketch-photo recognition methods. Extensive experiments are performed on
composite (E-PRIP) and semi-forensic (IIIT-D semi-forensic) datasets. The
results show the superiority of our method compared to the state-of-the-art
models in sketch-photo recognition algorithms.
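The idea of combining an attribute identification loss with a verification loss can be illustrated with a minimal NumPy sketch. The abstract above does not state the exact loss forms, so everything here is an assumption made for illustration: a contrastive loss stands in for the verification term, softmax cross-entropy stands in for the attribute term, and the names `contrastive_loss`, `attribute_cross_entropy`, `joint_loss`, `margin`, and `attr_weight` are hypothetical, not taken from the paper.

```python
import numpy as np

def contrastive_loss(sketch_emb, photo_emb, same_person, margin=1.0):
    """Assumed verification term: pull matching pairs together, push others apart."""
    dist = np.linalg.norm(sketch_emb - photo_emb, axis=1)
    pull = same_person * dist ** 2                                   # same identity
    push = (1.0 - same_person) * np.maximum(margin - dist, 0.0) ** 2  # different identity
    return float(np.mean(pull + push))

def attribute_cross_entropy(logits, labels):
    """Assumed attribute term: softmax cross-entropy over attribute classes."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))

def joint_loss(sketch_emb, photo_emb, same_person,
               attr_logits, attr_labels, attr_weight=0.5):
    """Weighted sum of the two terms; attr_weight is a hypothetical hyperparameter."""
    verification = contrastive_loss(sketch_emb, photo_emb, same_person)
    attributes = attribute_cross_entropy(attr_logits, attr_labels)
    return verification + attr_weight * attributes
```

In this sketch the verification term acts on sketch-photo pairs in the shared embedding space, while the attribute term acts on classifier outputs computed from those embeddings; how the two are actually balanced in AADCNN is not specified in the abstract.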
Fast Landmark Localization with 3D Component Reconstruction and CNN for Cross-Pose Recognition
Two approaches are proposed for cross-pose face recognition: one is based on
the 3D reconstruction of facial components and the other is based on the deep
Convolutional Neural Network (CNN). Unlike most 3D approaches that consider
holistic faces, the proposed approach considers 3D facial components. It
segments a 2D gallery face into components, reconstructs the 3D surface for
each component, and recognizes a probe face by component features. The
segmentation is based on the landmarks located by a hierarchical algorithm that
combines the Faster R-CNN for face detection and the Reduced Tree Structured
Model for landmark localization. The core part of the CNN-based approach is a
revised VGG network. We study the performances with different settings on the
training set, including the synthesized data from 3D reconstruction, the
real-life data from an in-the-wild database, and both types of data combined.
We investigate the performances of the network when it is employed as a
classifier or designed as a feature extractor. The two recognition approaches
and the fast landmark localization are evaluated in extensive experiments, and
compared to state-of-the-art methods to demonstrate their efficacy. Comment: 14 pages, 12 figures, 4 tables
Concordant cues in faces and voices: testing the backup signal hypothesis
Information from faces and voices combines to provide multimodal signals about a person. Faces and voices may offer redundant, overlapping (backup signals), or complementary information (multiple messages). This article reports two experiments which investigated the extent to which faces and voices deliver concordant information about dimensions of fitness and quality. In Experiment 1, participants rated faces and voices on scales for masculinity/femininity, age, health, height, and weight. The results showed that people make similar judgments from faces and voices, with particularly strong correlations for masculinity/femininity, health, and height. If, as these results suggest, faces and voices constitute backup signals for various dimensions, it is hypothetically possible that people would be able to accurately match novel faces and voices for identity. However, previous investigations into novel face–voice matching offer contradictory results. In Experiment 2, participants saw a face and heard a voice and were required to decide whether the face and voice belonged to the same person. Matching accuracy was significantly above chance level, suggesting that judgments made independently from faces and voices are sufficiently similar that people can match the two. Both sets of results were analyzed using multilevel modeling and are interpreted as being consistent with the backup signal hypothesis.
A Survey on Soft Biometrics for Human Identification
The focus has shifted to multi-biometrics because of security demands. Ancillary information extracted from primary biometric traits (face and body), such as facial measurements, gender, skin color, ethnicity, and height, is called soft biometrics. It can be integrated to improve the speed and overall performance of a primary biometric system (e.g., fusing face with facial marks), or used to generate a qualitative, human-interpretable semantic description of a person (e.g., old African male with blue eyes) that limits the search over the whole dataset when gender and ethnicity are used in a fusion framework. This chapter provides a holistic survey of soft biometrics that presents major works, with a focus on facial soft biometrics, and discusses some of the feature extraction and classification techniques that have been proposed, along with their strengths and limitations.
Improving speaker turn embedding by crossmodal transfer learning from face embedding
Learning speaker turn embeddings has shown considerable improvement in
situations where conventional speaker modeling approaches fail. However, this
improvement is relatively limited when compared to the gain observed in face
embedding learning, which has been proven very successful for face verification
and clustering tasks. Assuming that faces and voices from the same identities
share some latent properties (like age, gender, ethnicity), we propose three
transfer learning approaches to leverage the knowledge from the face domain
(learned from thousands of images and identities) for tasks in the speaker
domain. These approaches, namely target embedding transfer, relative distance
transfer, and clustering structure transfer, utilize the structure of the
source face embedding space at different granularities to regularize the target
speaker turn embedding space as optimizing terms. Our methods are evaluated on
two public broadcast corpora and yield promising advances over competitive
baselines in verification and audio clustering tasks, especially when dealing
with short speaker utterances. The analysis of the results also gives insight
into characteristics of the embedding spaces and shows their potential
applications.
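To make the notion of regularizing a speaker embedding space with a face embedding space more concrete, here is a minimal NumPy sketch of one possible reading of "target embedding transfer". It is an interpretation, not the authors' formulation: it assumes both embeddings share the same dimensionality, that each row pairs a speaker turn with a face embedding of the same identity, and that the regularizer is a squared Euclidean distance after L2 normalization. The names `l2_normalize`, `target_transfer_penalty`, `regularized_loss`, and `weight` are hypothetical.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length (assumes no all-zero rows)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def target_transfer_penalty(speaker_emb, face_emb):
    """Assumed regularizer: mean squared distance between each speaker
    embedding and the face embedding of the same identity (row-aligned)."""
    diff = l2_normalize(speaker_emb) - l2_normalize(face_emb)
    return float(np.mean(np.sum(diff ** 2, axis=1)))

def regularized_loss(speaker_loss, speaker_emb, face_emb, weight=0.1):
    """Add the transfer penalty to an existing speaker-embedding loss.
    `weight` is a hypothetical trade-off hyperparameter."""
    return speaker_loss + weight * target_transfer_penalty(speaker_emb, face_emb)
```

The relative distance and clustering structure variants named in the abstract would constrain the speaker space more loosely than this pointwise penalty; they are not sketched here because the abstract gives no detail on their form.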