4 research outputs found

    Text-Guided Face Recognition using Multi-Granularity Cross-Modal Contrastive Learning

    Full text link
    State-of-the-art face recognition (FR) models often experience a significant performance drop in surveillance scenarios, where facial images are of low quality and frequently corrupted with noise. Leveraging facial characteristics, such as freckles, scars, gender, and ethnicity, becomes highly beneficial for improving FR performance in such scenarios. In this paper, we introduce text-guided face recognition (TGFR) to analyze the impact of integrating facial attributes in the form of natural language descriptions. We hypothesize that adding semantic information into the loop can significantly improve the image understanding capability of an FR algorithm compared to other soft biometrics. However, learning a discriminative joint embedding within the multimodal space poses a considerable challenge due to the semantic gap in the unaligned image-text representations, along with the complexities arising from ambiguous and incoherent textual descriptions of the face. To address these challenges, we introduce a face-caption alignment module (FCAM), which incorporates cross-modal contrastive losses across multiple granularities to maximize the mutual information between local and global features of the face-caption pair. Within FCAM, we refine both facial and textual features to learn aligned and discriminative representations. We also design a face-caption fusion module (FCFM) that applies fine-grained interactions and coarse-grained associations among cross-modal features. Through extensive experiments on three face-caption datasets, the proposed TGFR demonstrates remarkable improvements, particularly on low-quality images, over existing FR models and outperforms other related methods and benchmarks.
    Comment: Accepted at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024
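    The cross-modal contrastive alignment the abstract describes is typically built on a symmetric image-text InfoNCE-style objective. Below is a minimal PyTorch sketch of such a global face-caption contrastive loss; the function name, embedding shapes, and temperature are illustrative assumptions, not the paper's actual FCAM implementation.

    import torch
    import torch.nn.functional as F

    def contrastive_alignment_loss(face_emb, text_emb, temperature=0.07):
        # face_emb, text_emb: (B, D) global embeddings where row i of each
        # tensor comes from the same face-caption pair (encoders assumed).
        face_emb = F.normalize(face_emb, dim=-1)
        text_emb = F.normalize(text_emb, dim=-1)
        logits = face_emb @ text_emb.t() / temperature   # (B, B) similarities
        targets = torch.arange(face_emb.size(0), device=face_emb.device)
        # Matched pairs lie on the diagonal; contrast in both directions.
        loss_f2t = F.cross_entropy(logits, targets)      # face -> caption
        loss_t2f = F.cross_entropy(logits.t(), targets)  # caption -> face
        return 0.5 * (loss_f2t + loss_t2f)

    # Toy usage with random embeddings standing in for encoder outputs:
    loss = contrastive_alignment_loss(torch.randn(8, 256), torch.randn(8, 256))
    print(loss.item())

    The paper applies such losses at multiple granularities (local and global features); the sketch shows only the global case for brevity.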

    Robust Ensemble Morph Detection with Domain Generalization

    Full text link
    Although a substantial number of studies are dedicated to morph detection, most of them fail to generalize to morphed faces outside their training paradigm. Moreover, recent morph detection methods are highly vulnerable to adversarial attacks. In this paper, we aim to learn a morph detection model that generalizes to a wide range of morphing attacks and is highly robust against different adversarial attacks. To this end, we develop an ensemble of convolutional neural networks (CNNs) and Transformer models to benefit from their capabilities simultaneously. To improve the robust accuracy of the ensemble model, we employ multi-perturbation adversarial training and generate adversarial examples with high transferability across several single models. Our exhaustive evaluations demonstrate that the proposed robust ensemble model generalizes to several morphing attacks and face datasets. In addition, we validate that our robust ensemble model gains better robustness against several adversarial attacks while outperforming state-of-the-art studies.
    Comment: Accepted at IJCB 2022
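    As a rough illustration of the two ingredients this abstract combines, here is a hedged PyTorch sketch of a logit-averaging ensemble trained on adversarially perturbed inputs. A single FGSM step stands in for the paper's multi-perturbation adversarial training, and the tiny "backbones" are placeholders, not the authors' CNN and Transformer models.

    import torch
    import torch.nn as nn

    class LogitEnsemble(nn.Module):
        # Averages the logits of several heterogeneous backbones.
        def __init__(self, models):
            super().__init__()
            self.models = nn.ModuleList(models)

        def forward(self, x):
            return torch.stack([m(x) for m in self.models], dim=0).mean(dim=0)

    def fgsm_example(model, x, y, eps=4 / 255):
        # One-step L-inf adversarial example for adversarial training.
        x = x.clone().detach().requires_grad_(True)
        nn.functional.cross_entropy(model(x), y).backward()
        return (x + eps * x.grad.sign()).clamp(0, 1).detach()

    # Toy training step on random data with two placeholder backbones:
    cnn = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))
    vit_like = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))
    ensemble = LogitEnsemble([cnn, vit_like])
    opt = torch.optim.SGD(ensemble.parameters(), lr=1e-2)

    x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 2, (4,))
    x_adv = fgsm_example(ensemble, x, y)       # perturb, then train on it
    loss = nn.functional.cross_entropy(ensemble(x_adv), y)
    opt.zero_grad(); loss.backward(); opt.step()

    Crafting the adversarial examples against the whole ensemble, as above, is one simple way to encourage the high transferability across single models that the abstract mentions.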

    An EMD-based Method for the Detection of Power Transformer Faults with a Hierarchical Ensemble Classifier

    No full text
    In this paper, an Empirical Mode Decomposition (EMD)-based method is proposed for the detection of transformer faults from dissolved gas analysis (DGA) data. Ratio-based DGA parameters are ranked by their skewness, and optimal sets of intrinsic mode function (IMF) coefficients are obtained from the ranked parameters. A hierarchical classification scheme employing XGBoost is presented for classifying the features to identify six different categories of transformer faults. The performance of the proposed method is studied on publicly available DGA data from 377 transformers. It is shown that the proposed method yields more than 90% sensitivity and accuracy in the detection of transformer faults, a superior performance compared to conventional methods as well as several existing machine learning-based techniques.
    Comment: 4 pages, 4 figures, Conference
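    A hedged sketch of that pipeline is shown below: decompose a DGA parameter series with EMD, take summary statistics of the leading IMFs as features, and classify hierarchically (fault / no-fault first, then fault category) with XGBoost. The PyEMD package, the toy data, the choice of IMF statistics, and the two-stage split are all illustrative assumptions, not the paper's implementation.

    import numpy as np
    from PyEMD import EMD               # pip install EMD-signal
    from xgboost import XGBClassifier

    def imf_features(signal, n_imfs=3):
        # Mean and std of the leading IMFs of a 1-D DGA parameter series.
        imfs = EMD()(signal)
        feats = np.zeros(2 * n_imfs)
        for i, imf in enumerate(imfs[:n_imfs]):
            feats[2 * i:2 * i + 2] = imf.mean(), imf.std()
        return feats

    # Toy data: 40 "transformers", each a 128-sample ratio-parameter series.
    rng = np.random.default_rng(0)
    X = np.stack([imf_features(rng.standard_normal(128)) for _ in range(40)])
    is_fault = (np.arange(40) < 24).astype(int)   # stage 1: 24 faulty units
    fault_type = np.arange(40) % 6                # stage 2: 6 fault categories

    stage1 = XGBClassifier(n_estimators=50).fit(X, is_fault)
    faulty = is_fault == 1
    stage2 = XGBClassifier(n_estimators=50).fit(X[faulty], fault_type[faulty])

    pred = stage1.predict(X)
    pred_type = stage2.predict(X[pred == 1])      # category for flagged units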