362 research outputs found
Robust statistical frontalization of human and animal faces
The unconstrained acquisition of facial data in real-world conditions may result in face images with significant pose variations, illumination changes, and occlusions, affecting the performance of facial landmark localization and recognition methods. In this paper, a novel method, robust to pose, illumination variations, and occlusions, is proposed for joint face frontalization and landmark localization. Unlike the state-of-the-art methods for landmark localization and pose correction, where large amounts of manually annotated images or 3D facial models are required, the proposed method relies on a small set of frontal images only. By observing that the frontal facial image of both humans and animals is the one having the minimum rank of all different poses, a model which is able to jointly recover the frontalized version of the face as well as the facial landmarks is devised. To this end, a suitable optimization problem is solved, concerning minimization of the nuclear norm (convex surrogate of the rank function) and the matrix ℓ1 norm accounting for occlusions. The proposed method is assessed in frontal view reconstruction of human and animal faces, landmark localization, pose-invariant face recognition, face verification in unconstrained conditions, and video inpainting by conducting experiments on nine databases. The experimental results demonstrate the effectiveness of the proposed method in comparison to the state-of-the-art methods for the target problems.
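The nuclear-norm-plus-ℓ1 objective described in this abstract is the classic robust principal component analysis decomposition. A minimal sketch of a generic solver for it, the inexact augmented Lagrangian method alternating singular-value thresholding and entrywise soft thresholding, conveys the idea; this is not the paper's joint frontalization and landmark solver, and all parameter choices below are the standard RPCA defaults, not values from the paper:

```python
import numpy as np

def robust_pca(X, lam=None, tol=1e-7, max_iter=500):
    """Split X into low-rank L plus sparse S by minimizing
    ||L||_* + lam * ||S||_1  subject to  X = L + S,
    via the inexact augmented Lagrangian method (illustrative sketch)."""
    m, n = X.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))        # standard RPCA weight
    norm_X = np.linalg.norm(X)
    Y = np.zeros_like(X)                      # dual variable
    S = np.zeros_like(X)
    mu = 1.25 / np.linalg.norm(X, 2)          # penalty, grows over iterations
    mu_bar = mu * 1e7
    for _ in range(max_iter):
        # low-rank update: singular-value thresholding
        U, sig, Vt = np.linalg.svd(X - S + Y / mu, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # sparse update: entrywise soft thresholding (handles occlusions)
        T = X - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        # dual ascent on the constraint X = L + S
        R = X - L - S
        Y += mu * R
        mu = min(mu * 1.5, mu_bar)
        if np.linalg.norm(R) / norm_X < tol:
            break
    return L, S
```

On a matrix formed as a low-rank part plus sparse gross corruption, the solver recovers both components; in the paper's setting the low-rank part plays the role of the frontalized faces and the sparse part absorbs occlusions.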
Dependency-aware Attention Control for Unconstrained Face Recognition with Image Sets
This paper targets the problem of image set-based face verification and
identification. Unlike traditional single media (an image or video) setting, we
encounter a set of heterogeneous contents containing orderless images and
videos. The importance of each image is usually considered either equal or
based on its independent quality assessment. How to model the relationship of
orderless images within a set remains a challenge. We address this problem by
formulating it as a Markov Decision Process (MDP) in the latent space.
Specifically, we first present a dependency-aware attention control (DAC)
network, which resorts to actor-critic reinforcement learning for sequential
attention decision of each image embedding to fully exploit the rich
correlation cues among the unordered images. Moreover, we introduce its
sample-efficient variant with off-policy experience replay to speed up the
learning process. The pose-guided representation scheme can further boost the
performance at the extremes of the pose variation.
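The core idea of dependency-aware attention, that the weight given to each image in an unordered set depends on what has already been aggregated, can be illustrated with a toy sequential pooling loop. The paper learns this decision policy with actor-critic reinforcement learning in a latent space; the fixed cosine-similarity heuristic below merely stands in for that learned policy:

```python
import numpy as np

def sequential_attention_aggregate(embeddings, temperature=1.0):
    """Toy dependency-aware pooling over unordered image embeddings.
    Each element's weight depends on its agreement with the running
    aggregate (the 'state'), so earlier decisions influence later ones.
    Illustrative heuristic only, not the DAC actor-critic policy."""
    agg = embeddings.mean(axis=0)          # initial state: set mean
    pooled = np.zeros_like(agg)
    total_w = 0.0
    for e in embeddings:
        # weight grows with cosine similarity to the current state
        cos = e @ agg / (np.linalg.norm(e) * np.linalg.norm(agg) + 1e-12)
        w = np.exp(cos / temperature)
        pooled += w * e
        total_w += w
        agg = pooled / total_w             # state update: running weighted mean
    return pooled / total_w
```

In this toy version, embeddings consistent with the evolving set representation are emphasized and outliers (e.g., extreme poses or corrupted frames) are down-weighted; the paper replaces the heuristic with a reward-trained policy.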
Set-Based Face Recognition Beyond Disentanglement: Burstiness Suppression With Variance Vocabulary
Set-based face recognition (SFR) aims to recognize the face sets in the
unconstrained scenario, where the appearance of the same identity may change
dramatically with extreme variances (e.g., illumination, pose, expression). We
argue that the two crucial issues in SFR, the face quality and burstiness, are
both identity-irrelevant and variance-relevant. The quality and burstiness
assessment are interfered with by the entanglement of identity, and the face
recognition is interfered with by the entanglement of variance. Thus we propose
to separate the identity features with the variance features in a
lightweight set-based disentanglement framework. Beyond disentanglement, the
variance features are fully utilized to indicate face quality and burstiness in
a set, rather than being discarded after training. To suppress face burstiness
in the sets, we propose a vocabulary-based burst suppression (VBS) method which
quantizes faces with a reference vocabulary. With inter-word and intra-word
normalization operations on the assignment scores, the face burstiness degrees
are appropriately estimated. The extensive illustrations and experiments
demonstrate the effectiveness of the disentanglement framework with VBS, which
achieves new state-of-the-art results on the SFR benchmarks. The code will be
released at https://github.com/Liubinggunzu/set_burstiness. Accepted at ACM MM 2022.
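The intra-word half of vocabulary-based burst suppression can be sketched in a few lines: quantize each face feature to its nearest vocabulary word, then shrink the contribution of words that fire many times within one set, here with the classic 1/sqrt(count) burstiness correction. This is an illustrative reduction; the paper's VBS also applies inter-word normalization on the assignment scores, which this sketch omits:

```python
import numpy as np

def burst_suppressed_pooling(features, vocab):
    """Pool a set of face features while suppressing bursts.
    features: (n, d) array of per-face features; vocab: (k, d) codewords.
    Illustrative sketch of intra-word normalization, not the full VBS."""
    # hard assignment: nearest codeword for each feature
    d2 = ((features[:, None, :] - vocab[None, :, :]) ** 2).sum(axis=-1)
    words = d2.argmin(axis=1)
    counts = np.bincount(words, minlength=len(vocab))
    # intra-word normalization: repeated (bursty) words contribute less
    weights = 1.0 / np.sqrt(counts[words])
    pooled = (weights[:, None] * features).sum(axis=0)
    return pooled / (np.linalg.norm(pooled) + 1e-12)
```

With four near-duplicate faces and one distinct face, the duplicates each receive weight 1/2 while the lone face keeps weight 1, so the pooled descriptor is pulled toward the under-represented appearance instead of being dominated by the burst.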
Dynamic probabilistic linear discriminant analysis for video classification
Component Analysis (CA) comprises statistical techniques that decompose signals into appropriate latent components, relevant to a task-at-hand (e.g., clustering, segmentation, classification). Recently, an explosion of research in CA has been witnessed, with several novel probabilistic models proposed (e.g., Probabilistic Principal CA, Probabilistic Linear Discriminant Analysis (PLDA), Probabilistic Canonical Correlation Analysis). PLDA is a popular generative probabilistic CA method that incorporates knowledge regarding class labels and furthermore introduces class-specific and sample-specific latent spaces. While PLDA has been shown to outperform several state-of-the-art methods, it is nevertheless a static model; any feature-level temporal dependencies that arise in the data are ignored. As has been repeatedly shown, appropriate modelling of temporal dynamics is crucial for the analysis of temporal data (e.g., videos). In this light, we propose the first, to the best of our knowledge, probabilistic LDA formulation that models dynamics, the so-called Dynamic-PLDA (DPLDA). DPLDA is a generative model suitable for video classification and is able to jointly model the label information (e.g., face identity, consistent over videos of the same subject), as well as dynamic variations of each individual video. Experiments on video classification tasks such as face and facial expression recognition show the efficacy of the proposed method.
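The static PLDA model that DPLDA extends draws each observation as x = μ + F h + G w + ε, where the latent h is shared by all samples of one class (identity) and w varies per sample. The sketch below samples from this standard generative model to make the class-level versus sample-level split concrete; it is the static baseline only, and deliberately omits the temporal dynamics that DPLDA adds on the latent variables:

```python
import numpy as np

def sample_plda(mu, F, G, noise_std, n_classes, n_per_class, rng):
    """Draw data from the standard PLDA generative model
    x = mu + F h + G w + eps. h is class-level (identity, shared within
    a class); w is sample-level (within-class variation). Static sketch:
    DPLDA's temporal dynamics on the latents are omitted."""
    dh = F.shape[1]
    dw = G.shape[1]
    D = mu.shape[0]
    X, labels = [], []
    for c in range(n_classes):
        h = rng.standard_normal(dh)          # identity latent, one per class
        for _ in range(n_per_class):
            w = rng.standard_normal(dw)      # per-sample latent
            eps = noise_std * rng.standard_normal(D)
            X.append(mu + F @ h + G @ w + eps)
            labels.append(c)
    return np.array(X), np.array(labels)
```

When the columns of F are large relative to G and the noise, between-class scatter dominates within-class scatter, which is exactly the structure PLDA exploits for recognition.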
- …