Pose-Normalized Image Generation for Person Re-identification
Person Re-identification (re-id) faces two major challenges: the lack of
cross-view paired training data and learning discriminative identity-sensitive
and view-invariant features in the presence of large pose variations. In this
work, we address both problems by proposing a novel deep person image
generation model for synthesizing realistic person images conditional on the
pose. The model is based on a generative adversarial network (GAN) designed
specifically for pose normalization in re-id, thus termed pose-normalization
GAN (PN-GAN). With the synthesized images, we can learn a new type of deep
re-id feature free of the influence of pose variations. We show that this
feature is strong on its own and complementary to features learned with the
original images. Importantly, under the transfer learning setting, we show that
our model generalizes well to any new re-id dataset without the need for
collecting any training data for model fine-tuning. The model thus has the
potential to make re-id models truly scalable.
Comment: 10 pages, 5 figures
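The abstract notes that the pose-normalized feature is complementary to features learned from the original images. One common way to exploit such complementarity at matching time is to L2-normalize each feature vector and concatenate them before computing a similarity score. The sketch below illustrates only this fusion-and-matching step in plain Python; the feature extractors and this particular fusion strategy are assumptions for illustration, not the paper's exact procedure.

```python
import math

def l2_normalize(v):
    # Scale a feature vector to unit length so that neither
    # stream dominates the fused descriptor.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def fused_feature(orig_feat, pose_norm_feat):
    # Concatenate the original-image feature with the
    # pose-normalized feature, each normalized independently.
    return l2_normalize(orig_feat) + l2_normalize(pose_norm_feat)

def cosine_similarity(a, b):
    # Cosine similarity between two fused descriptors,
    # used to rank gallery images against a query.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)
```

A query descriptor built this way is matched against each gallery descriptor by cosine similarity, and the gallery entries are ranked by score.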
LEED: Label-Free Expression Editing via Disentanglement
Recent studies on facial expression editing have made very promising progress.
However, existing methods are constrained by requiring large amounts of
expression labels, which are often expensive and time-consuming to collect.
This paper presents an innovative label-free
expression editing via disentanglement (LEED) framework that is capable of
editing the expression of both frontal and profile facial images without
requiring any expression label. The idea is to disentangle the identity and
expression of a facial image in the expression manifold, where the neutral face
captures the identity attribute and the displacement between the neutral image
and the expressive image captures the expression attribute. Two novel losses
are designed for optimal expression disentanglement and consistent synthesis,
including a mutual expression information loss that aims to extract pure
expression-related features and a siamese loss that aims to enhance the
expression similarity between the synthesized image and the reference image.
Extensive experiments over two public facial expression datasets show that LEED
achieves superior facial expression editing qualitatively and quantitatively.
Comment: Accepted to ECCV 202
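The disentanglement idea above can be sketched as simple latent-space arithmetic: the neutral code carries identity, and the displacement from the neutral code to the expressive code carries expression. The snippet below illustrates this with hypothetical latent codes represented as plain lists; in LEED the codes live in a learned expression manifold with an encoder and decoder, which this sketch omits.

```python
def expression_displacement(expressive, neutral):
    # Expression attribute = expressive code minus the neutral code
    # of the same identity (the neutral face carries the identity).
    return [e - n for e, n in zip(expressive, neutral)]

def apply_expression(target_neutral, displacement):
    # Transfer an expression: add the reference displacement to the
    # target identity's neutral code. A decoder (omitted here) would
    # then map the edited code back to an image.
    return [t + d for t, d in zip(target_neutral, displacement)]
```

Applying a zero displacement leaves the neutral code unchanged, and applying a reference displacement to a different identity's neutral code yields that identity with the reference expression.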
SELM: Siamese extreme learning machine with application to face biometrics
This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/s00521-022-07100-z

Extreme learning machine (ELM) is a powerful classification method and is very competitive among existing classification
methods. It is fast to train. Nevertheless, it cannot properly perform face
verification, because verification requires comparing the facial images of two
individuals simultaneously and deciding whether the two faces belong to the
same person. The ELM structure was not designed to accept two input data
streams simultaneously. Thus, in 2-input scenarios, ELM methods are typically
applied to concatenated inputs. However, this setup consumes twice the
computational resources and is not optimized for recognition tasks where
learning a separable distance metric is critical. For these reasons, we propose
and develop a Siamese extreme learning machine (SELM). SELM is designed to be
fed two data streams in parallel. It utilizes a dual-stream Siamese condition
in an extra Siamese layer to transform the data before passing it to the hidden
layer. Moreover, we propose a Gender-Ethnicity-dependent triplet feature
trained exclusively on specific demographic groups. This feature enables
learning and extracting useful facial
features of each group. Experiments were conducted to evaluate and compare the performances of SELM, ELM, and deep
convolutional neural network (DCNN). The experimental results showed that the
proposed feature achieved correct classification at 97.87% accuracy and 99.45%
area under the curve (AUC). They also showed that using SELM in conjunction
with the proposed feature provided 98.31% accuracy and 99.72% AUC. SELM
outperformed the well-known DCNN and ELM methods.

This work was supported by the Faculty of Information Technology, King
Mongkut’s Institute of Technology Ladkrabang, and by projects BIBECA
(RTI2018-101248-B-I00 MINECO/FEDER) and BBforTAI (PID2021-127641OB-I00
MICINN/FEDER).
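The SELM structure described above can be approximated with a short NumPy sketch: a Siamese layer combines the two input streams (here an element-wise absolute difference, which is one plausible dual-stream condition and an assumption of this sketch), a fixed random hidden layer applies a sigmoid, and the output weights are solved in closed form via a pseudoinverse, as in a standard ELM. The Gender-Ethnicity-dependent triplet feature and other details of the paper are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def siamese_layer(x1, x2):
    # Dual-stream Siamese condition (assumed form): the element-wise
    # absolute difference of the two streams, fed as one vector to
    # the ELM hidden layer.
    return np.abs(x1 - x2)

class SELM:
    # Minimal ELM head: random hidden weights stay fixed; only the
    # output weights are solved in closed form.
    def __init__(self, in_dim, hidden_dim=64):
        self.W = rng.standard_normal((in_dim, hidden_dim))
        self.b = rng.standard_normal(hidden_dim)
        self.beta = None

    def _hidden(self, X):
        # Sigmoid activation over the random projection.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X1, X2, y):
        # Least-squares solution for the output weights.
        H = self._hidden(siamese_layer(X1, X2))
        self.beta = np.linalg.pinv(H) @ y

    def predict(self, X1, X2):
        return self._hidden(siamese_layer(X1, X2)) @ self.beta

# Toy verification data: a pair is "same" when the second stream is a
# small perturbation of the first, "different" when it is unrelated.
X1 = rng.standard_normal((200, 8))
same = X1 + 0.01 * rng.standard_normal((200, 8))
diff = rng.standard_normal((200, 8))
X2 = np.vstack([same[:100], diff[100:]])
y = np.array([1.0] * 100 + [0.0] * 100)

model = SELM(in_dim=8)
model.fit(X1, X2, y)
scores = model.predict(X1, X2)
```

On this toy data, scores for "same" pairs should land well above scores for "different" pairs, since the absolute-difference vector is near zero for matching pairs.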