Text-Guided Face Recognition using Multi-Granularity Cross-Modal Contrastive Learning
State-of-the-art face recognition (FR) models often experience a significant
performance drop when dealing with facial images in surveillance scenarios
where images are of low quality and often corrupted with noise. Leveraging
facial characteristics, such as freckles, scars, gender, and ethnicity, becomes
highly beneficial in improving FR performance in such scenarios. In this paper,
we introduce text-guided face recognition (TGFR) to analyze the impact of
integrating facial attributes in the form of natural language descriptions. We
hypothesize that adding semantic information into the loop can significantly
improve the image understanding capability of an FR algorithm compared to other
soft biometrics. However, learning a discriminative joint embedding within the
multimodal space poses a considerable challenge due to the semantic gap in the
unaligned image-text representations, along with the complexities arising from
ambiguous and incoherent textual descriptions of the face. To address these
challenges, we introduce a face-caption alignment module (FCAM), which
incorporates cross-modal contrastive losses across multiple granularities to
maximize the mutual information between local and global features of the
face-caption pair. Within FCAM, we refine both facial and textual features for
learning aligned and discriminative features. We also design a face-caption
fusion module (FCFM) that applies fine-grained interactions and coarse-grained
associations among cross-modal features. Through extensive experiments
conducted on three face-caption datasets, the proposed TGFR demonstrates remarkable
improvements, particularly on low-quality images, over existing FR models and
outperforms other related methods and benchmarks.
Comment: Accepted at IEEE/CVF Winter Conference on Applications of Computer
Vision (WACV), 202
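
The cross-modal contrastive objective mentioned in the abstract can be illustrated with a generic symmetric InfoNCE loss over matched face and caption embeddings. This is a minimal sketch only: the abstract does not give the exact loss used in FCAM, so the function names (`info_nce`, `symmetric_contrastive_loss`), the cosine similarity, and the temperature value are all assumptions, and only a single (global) granularity is shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, candidates, temperature=0.1):
    """Mean cross-entropy of matching anchors[i] to candidates[i].

    Hypothetical helper: every other candidate in the batch acts as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, c) / temperature for c in candidates]
        # Log-sum-exp with the max subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def symmetric_contrastive_loss(face_emb, text_emb, temperature=0.1):
    """Average of the face-to-text and text-to-face InfoNCE terms (assumed form)."""
    face_to_text = info_nce(face_emb, text_emb, temperature)
    text_to_face = info_nce(text_emb, face_emb, temperature)
    return 0.5 * (face_to_text + text_to_face)


# Toy embeddings: aligned pairs should give a much lower loss than swapped pairs.
faces = [[1.0, 0.0], [0.0, 1.0]]
aligned_captions = [[1.0, 0.0], [0.0, 1.0]]
swapped_captions = [[0.0, 1.0], [1.0, 0.0]]
aligned_loss = symmetric_contrastive_loss(faces, aligned_captions)
swapped_loss = symmetric_contrastive_loss(faces, swapped_captions)
```

A multi-granularity variant would presumably apply a similar term to local (region or word level) features as well, but that detail is not recoverable from the abstract.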
Robust Ensemble Morph Detection with Domain Generalization
Although a substantial number of studies are dedicated to morph detection,
most of them fail to generalize to morph faces outside of their training
paradigm. Moreover, recent morph detection methods are highly vulnerable to
adversarial attacks. In this paper, we intend to learn a morph detection model
with high generalization to a wide range of morphing attacks and high
robustness against different adversarial attacks. To this end, we develop an
ensemble of convolutional neural networks (CNNs) and Transformer models to
benefit from their capabilities simultaneously. To improve the robust accuracy
of the ensemble model, we employ multi-perturbation adversarial training and
generate adversarial examples with high transferability for several single
models. Our exhaustive evaluations demonstrate that the proposed robust
ensemble model generalizes to several morphing attacks and face datasets. In
addition, we validate that our robust ensemble model gains better robustness
against several adversarial attacks while outperforming the state-of-the-art
studies.
Comment: Accepted in IJCB 202
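
One simple way to combine several detectors into an ensemble is to average their per-image morph probabilities. The sketch below is an assumption for illustration only: the abstract does not state how the CNN and Transformer outputs are fused, and `ensemble_score`, `is_morph`, and the 0.5 threshold are hypothetical names and values.

```python
def ensemble_score(scores, weights=None):
    """Weighted average of per-model morph probabilities for one image.

    `scores` holds one probability per model; equal weights are assumed
    when none are given.
    """
    if not scores:
        raise ValueError("at least one model score is required")
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def is_morph(scores, threshold=0.5, weights=None):
    """Flag the image as a morph when the averaged score reaches the threshold."""
    return ensemble_score(scores, weights) >= threshold


# Three hypothetical detectors disagree; the average decides.
decision_a = is_morph([0.9, 0.7, 0.2])  # mean 0.6
decision_b = is_morph([0.1, 0.2, 0.9])  # mean 0.4
```

Averaging lets a confident model outvote an uncertain one, which is one reason ensembles of architecturally different models are often used.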
An EMD-based Method for the Detection of Power Transformer Faults with a Hierarchical Ensemble Classifier
In this paper, an Empirical Mode Decomposition-based method is proposed for
the detection of transformer faults from dissolved gas analysis (DGA) data.
Ratio-based DGA parameters are ranked using their skewness. Optimal sets of
intrinsic mode function coefficients are obtained from the ranked DGA
parameters. A hierarchical classification scheme employing XGBoost is presented
for classifying the features to identify six different categories of
transformer faults. The performance of the proposed method is studied on publicly
available DGA data of 377 transformers. It is shown that the proposed method
can yield more than 90% sensitivity and accuracy in the detection of
transformer faults, a superior performance as compared to conventional methods
as well as several existing machine learning-based techniques.
Comment: 04 pages, 04 figures, Conferenc
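
The skewness-based ranking step can be sketched as follows. This is a hedged illustration: the abstract does not say which skewness estimator is used or in which direction the parameters are ranked, so the population (biased) skewness, the descending order by absolute value, and the feature names `ratio_a` and `ratio_b` with their synthetic values are all assumptions.

```python
def skewness(values):
    """Population skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    second_moment = sum((x - mean) ** 2 for x in values) / n
    third_moment = sum((x - mean) ** 3 for x in values) / n
    if second_moment == 0:
        return 0.0  # A constant feature has no asymmetry.
    return third_moment / second_moment ** 1.5


def rank_by_skewness(features):
    """Return feature names ordered from most to least skewed (by absolute value)."""
    return sorted(features, key=lambda name: abs(skewness(features[name])), reverse=True)


# Synthetic toy values, not real DGA measurements.
toy_features = {
    "ratio_a": [1.0, 2.0, 3.0, 4.0, 5.0],   # symmetric, skewness 0
    "ratio_b": [1.0, 1.0, 1.0, 1.0, 10.0],  # one large value, positive skewness
}
ranking = rank_by_skewness(toy_features)
```

In the described method, the ranked parameters would then feed the empirical mode decomposition stage; that stage is not shown here.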
