Quality-Aware Prototype Memory for Face Representation Learning
Prototype Memory is a powerful model for face representation learning. It
enables the training of face recognition models using datasets of any size,
with on-the-fly generation of prototypes (classifier weights) and efficient
ways of using them. Prototype Memory has demonstrated strong results on many
face recognition benchmarks. However, its prototype generation algorithm is
prone to producing imperfect prototypes when the images selected for prototype
creation contain low-quality or poorly recognizable faces. All images of the
same person present in the mini-batch are used with equal weights, so the
resulting averaged prototype can be contaminated by the imperfect embeddings of
such face images. This can lead to misdirected training signals and impair the
performance of the trained face recognition models. In this paper, we propose a
simple and effective way to
improve Prototype Memory with quality-aware prototype generation. Quality-Aware
Prototype Memory uses different weights for images of different quality in the
process of prototype generation. With this improvement, prototypes receive more
valuable information from high-quality images and are hurt less by low-quality
ones. We propose and compare several methods of quality estimation and usage,
perform extensive experiments on different face recognition benchmarks, and
demonstrate the advantages of the proposed model over the basic version of
Prototype Memory.
Comment: Preprint
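The contrast between equal-weight and quality-weighted prototype generation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `quality_weighted_prototype`, the use of plain non-negative quality scores, and the final L2 normalization are all assumptions made for illustration, and the abstract does not say which quality estimators or weighting schemes the paper actually uses.

```python
import math


def quality_weighted_prototype(embeddings, qualities):
    """Average one person's embeddings, weighting each by a quality score.

    Hypothetical helper: `embeddings` is a list of equal-length float lists,
    `qualities` is a list of non-negative scores (one per embedding).
    Passing identical scores reproduces the plain equal-weight average.
    """
    if len(embeddings) != len(qualities) or not embeddings:
        raise ValueError("need one quality score per embedding")
    total = sum(qualities)
    if total <= 0:
        raise ValueError("quality scores must sum to a positive value")

    dim = len(embeddings[0])
    prototype = [0.0] * dim
    for embedding, quality in zip(embeddings, qualities):
        weight = quality / total  # weights sum to 1 across the images
        for i, value in enumerate(embedding):
            prototype[i] += weight * value

    # Assumption: the prototype is L2-normalized, as is common for
    # cosine-similarity-based face recognition classifiers.
    norm = math.sqrt(sum(v * v for v in prototype))
    if norm == 0.0:
        raise ValueError("weighted average is the zero vector")
    return [v / norm for v in prototype]


# Two toy embeddings of the same person: a sharp image and a blurry one.
sharp, blurry = [1.0, 0.0], [0.0, 1.0]
equal = quality_weighted_prototype([sharp, blurry], [1.0, 1.0])
weighted = quality_weighted_prototype([sharp, blurry], [3.0, 1.0])
```

With equal scores both images pull the prototype equally; with scores 3 and 1 the prototype moves towards the sharp image's embedding, which is the effect the abstract attributes to quality-aware generation.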
Towards Large-scale Masked Face Recognition
During the COVID-19 epidemic, almost everyone wears a mask,
which poses a huge challenge for deep learning-based face recognition
algorithms. In this paper, we present our championship solutions
in the ICCV MFR WebFace260M and InsightFace unconstrained tracks. We focus on
four challenges in large-scale masked face recognition: super-large-scale
training, data noise handling, balancing masked and non-masked face recognition
accuracy, and designing an inference-friendly model architecture. We hope
that the discussion of these four aspects can guide future research towards
more robust masked face recognition systems.
Comment: the top-1 solution for the ICCV2021-MFR challenge
Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection
The introduction of DETR represents a new paradigm for object detection.
However, its decoder conducts classification and box localization using shared
queries and cross-attention layers, leading to suboptimal results. We observe
that different regions of interest in the visual feature map are suitable for
performing query classification and box localization tasks, even for the same
object. Salient regions provide vital information for classification, while the
boundaries around them are more favorable for box regression. Unfortunately,
such spatial misalignment between these two tasks greatly hinders DETR's
training. Therefore, in this work, we focus on decoupling localization and
classification tasks in DETR. To achieve this, we introduce a new design scheme
called spatially decoupled DETR (SD-DETR), which includes a task-aware query
generation module and a disentangled feature learning process. We carefully
design the task-aware query initialization process and divide the
cross-attention block in the decoder to allow the task-aware queries to match
different visual regions. We also observe a prediction misalignment problem
between high classification confidence and precise localization, so we propose
an alignment loss to further guide the training of spatially decoupled DETR.
Through extensive experiments, we demonstrate that our approach achieves a
significant improvement on the MS COCO dataset compared to previous work. For
instance, we improve the performance of Conditional DETR by 4.5 AP. By
spatially disentangling the two tasks, our method overcomes the misalignment
problem and greatly improves the performance of DETR for object detection.
Comment: accepted by ICCV202
Face.evoLVe: A High-Performance Face Recognition Library
In this paper, we develop face.evoLVe -- a comprehensive library that
collects and implements a wide range of popular deep learning-based methods for
face recognition. First of all, face.evoLVe is composed of key components that
cover the full process of face analytics, including face alignment, data
processing, various backbones, losses, and alternatives with bags of tricks for
improving performance. Second, face.evoLVe supports multi-GPU training on top of
different deep learning platforms, such as PyTorch and PaddlePaddle, which
enables researchers to work on both large-scale datasets with millions of
images and low-shot counterparts with limited well-annotated data. More
importantly, along with face.evoLVe, images before and after alignment in the
common benchmark datasets are released, with source code and trained models
provided. All these efforts lower the technical burden of reproducing
existing methods for comparison, so that users of our library can focus on
developing advanced approaches more efficiently. Last but not least,
face.evoLVe is well designed and vibrantly evolving, so that new face
recognition approaches can be easily plugged into our framework. Note that we
have used face.evoLVe to participate in a number of face recognition
competitions and secured first place. The version that supports PyTorch is
publicly available at https://github.com/ZhaoJ9014/face.evoLVe.PyTorch and the
PaddlePaddle version is available at
https://github.com/ZhaoJ9014/face.evoLVe.PyTorch/tree/master/paddle.
Face.evoLVe has been widely used for face analytics, receiving 2.4K stars and
622 forks.
Comment: A short version is accepted by NeuroComputing
(https://www.sciencedirect.com/science/article/pii/S0925231222005057?via%3Dihub).
Primary corresponding author is Dr. Jian Zha
A Survey of Face Recognition
Recent years have witnessed the breakthrough of face recognition (FR) with deep
convolutional neural networks. Dozens of papers in the field of FR are
published every year. Some of them have been applied in industry and play an
important role in daily life, for example in device unlocking and mobile
payment. This paper provides an introduction to face recognition, including
its history, pipeline, algorithms based on conventional manually designed
features or deep learning, mainstream training and evaluation datasets, and
related applications. We have analyzed and compared as many state-of-the-art
works as possible, and also carefully designed a set of experiments to find the
effect of backbone size and data distribution. This survey accompanies the
tutorial named The Practical Face Recognition Technology in the Industrial
World at FG2023.