
    Pose-Normalized Image Generation for Person Re-identification

    Person re-identification (re-id) faces two major challenges: the lack of cross-view paired training data, and the need to learn discriminative, identity-sensitive yet view-invariant features in the presence of large pose variations. In this work, we address both problems by proposing a novel deep person image generation model that synthesizes realistic person images conditioned on pose. The model is based on a generative adversarial network (GAN) designed specifically for pose normalization in re-id, and is thus termed the pose-normalization GAN (PN-GAN). With the synthesized images, we can learn a new type of deep re-id feature that is free of the influence of pose variations. We show that this feature is strong on its own and complementary to features learned from the original images. Importantly, under the transfer learning setting, we show that our model generalizes well to any new re-id dataset without the need to collect any training data for model fine-tuning. The model thus has the potential to make re-id models truly scalable.
    Comment: 10 pages, 5 figures
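
    The abstract describes conditioning a GAN generator on a target pose so that synthesized images of the same identity share a canonical pose. Below is a minimal PyTorch sketch of that conditioning idea only; the small encoder/decoder layout, the 18-channel pose-heatmap input, and the name PoseConditionedGenerator are illustrative assumptions, not the published PN-GAN architecture.

```python
# Illustrative sketch of pose-conditional generation in the spirit of PN-GAN.
# Layer sizes, the 18-channel pose heatmap input, and all names are assumptions.
import torch
import torch.nn as nn

class PoseConditionedGenerator(nn.Module):
    """Maps (person image, target pose heatmaps) -> person image in the target pose."""
    def __init__(self, img_channels=3, pose_channels=18, base=64):
        super().__init__()
        in_ch = img_channels + pose_channels  # condition by channel-wise concatenation
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(base, base * 2, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(base * 2, base, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(base, img_channels, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, img, pose_heatmaps):
        x = torch.cat([img, pose_heatmaps], dim=1)
        return self.decoder(self.encoder(x))

# Usage: synthesize the same identity under a canonical pose, then extract
# re-id features from the pose-normalized output with any backbone.
g = PoseConditionedGenerator()
fake = g(torch.randn(1, 3, 128, 64), torch.randn(1, 18, 128, 64))  # -> (1, 3, 128, 64)
```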

    LEED: Label-Free Expression Editing via Disentanglement

    Recent studies on facial expression editing have made very promising progress. However, existing methods are constrained by the need for large amounts of expression labels, which are often expensive and time-consuming to collect. This paper presents an innovative label-free expression editing via disentanglement (LEED) framework that is capable of editing the expression of both frontal and profile facial images without requiring any expression labels. The idea is to disentangle the identity and expression of a facial image in the expression manifold, where the neutral face captures the identity attribute and the displacement between the neutral image and the expressive image captures the expression attribute. Two novel losses are designed for optimal expression disentanglement and consistent synthesis: a mutual expression information loss that aims to extract pure expression-related features, and a siamese loss that aims to enhance the expression similarity between the synthesized image and the reference image. Extensive experiments on two public facial expression datasets show that LEED achieves superior facial expression editing both qualitatively and quantitatively.
    Comment: Accepted to ECCV 2020
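
    The disentanglement described above treats the neutral face as the identity and the displacement from it as the expression, with a siamese loss pulling the synthesized image's expression toward the reference's. The sketch below illustrates that displacement-plus-siamese-loss idea in PyTorch; the tiny FaceEncoder, the cosine form of the loss, and all names are assumptions for illustration, not the LEED formulation.

```python
# Illustrative sketch only: expression = displacement from the neutral embedding,
# plus a siamese-style expression-similarity loss. Encoder and loss form are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FaceEncoder(nn.Module):
    """Tiny stand-in encoder mapping a face image to an embedding on the manifold."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim),
        )

    def forward(self, x):
        return self.net(x)

def expression_vector(encoder, expressive_img, neutral_img):
    # Identity lives in the neutral embedding; expression is the displacement from it.
    return encoder(expressive_img) - encoder(neutral_img)

def siamese_expression_loss(encoder, synthesized, synth_neutral, reference, ref_neutral):
    # Pull the synthesized image's expression displacement toward the reference's.
    e_syn = expression_vector(encoder, synthesized, synth_neutral)
    e_ref = expression_vector(encoder, reference, ref_neutral)
    return 1.0 - F.cosine_similarity(e_syn, e_ref, dim=1).mean()

# Usage with random tensors standing in for face crops:
enc = FaceEncoder()
imgs = [torch.randn(2, 3, 64, 64) for _ in range(4)]
loss = siamese_expression_loss(enc, *imgs)
```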

    SELM: Siamese extreme learning machine with application to face biometrics

    This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/s00521-022-07100-z
    Extreme learning machine (ELM) is a powerful classification method that is very competitive with existing classification methods and is fast to train. Nevertheless, it cannot perform face verification tasks properly, because face verification requires comparing the facial images of two individuals simultaneously and deciding whether the two faces belong to the same person. The ELM structure was not designed to accept two input data streams simultaneously; thus, in two-input scenarios, ELM methods are typically applied to concatenated inputs. However, this setup consumes twice the computational resources and is not optimized for recognition tasks in which learning a separable distance metric is critical. For these reasons, we propose and develop a Siamese extreme learning machine (SELM). SELM is designed to be fed two data streams in parallel. It utilizes a dual-stream Siamese condition in an extra Siamese layer to transform the data before passing it to the hidden layer. Moreover, we propose a Gender-Ethnicity-dependent triplet feature trained exclusively on specific demographic groups, which enables learning and extracting useful facial features for each group. Experiments were conducted to evaluate and compare the performance of SELM, ELM, and a deep convolutional neural network (DCNN). The experimental results showed that the proposed feature achieved correct classification at 97.87% accuracy and 99.45% area under the curve (AUC), and that using SELM in conjunction with the proposed feature yielded 98.31% accuracy and 99.72% AUC. SELM thus outperformed the well-known DCNN and ELM methods.
    This work was supported by the Faculty of Information Technology, King Mongkut’s Institute of Technology Ladkrabang and projects BIBECA (RTI2018-101248-B-I00 MINECO/FEDER) and BBforTAI (PID2021-127641OB-I00 MICINN/FEDER).
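
    A Siamese ELM as described above feeds two streams through a shared transformation before a standard ELM hidden layer, whose output weights are then solved in closed form. The NumPy sketch below shows one such construction for pair verification; the absolute-difference combination, the sigmoid activation, the ridge term, and all names are assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only: a Siamese-style ELM for face-pair verification.
# The siamese combination (absolute difference), activation, and ridge term are assumed.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SiameseELM:
    def __init__(self, feat_dim, hidden_dim=256, ridge=1e-3):
        # Shared (siamese) random projection applied to both streams,
        # followed by a standard ELM hidden layer with fixed random weights.
        self.w_siamese = rng.standard_normal((feat_dim, hidden_dim)) / np.sqrt(feat_dim)
        self.w_hidden = rng.standard_normal((hidden_dim, hidden_dim)) / np.sqrt(hidden_dim)
        self.b_hidden = rng.standard_normal(hidden_dim)
        self.ridge = ridge
        self.beta = None  # output weights, solved in closed form at training time

    def _hidden(self, xa, xb):
        # Siamese layer: same projection on both faces, combined by absolute difference.
        za, zb = xa @ self.w_siamese, xb @ self.w_siamese
        pair = np.abs(za - zb)
        return sigmoid(pair @ self.w_hidden + self.b_hidden)

    def fit(self, xa, xb, labels):
        # Closed-form ELM training: ridge-regularized least squares on hidden outputs.
        H = self._hidden(xa, xb)
        t = labels.reshape(-1, 1).astype(float)
        A = H.T @ H + self.ridge * np.eye(H.shape[1])
        self.beta = np.linalg.solve(A, H.T @ t)

    def score(self, xa, xb):
        # Higher scores indicate "same person" under this sketch.
        return (self._hidden(xa, xb) @ self.beta).ravel()

# Usage on synthetic face-feature pairs (label 1 = same person, 0 = different):
xa, xb = rng.standard_normal((100, 512)), rng.standard_normal((100, 512))
y = rng.integers(0, 2, size=100)
model = SiameseELM(feat_dim=512)
model.fit(xa, xb, y)
scores = model.score(xa, xb)
```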