A High-Efficiency Framework for Constructing Large-Scale Face Parsing Benchmark
Face parsing, the task of assigning a semantic label to each pixel in a face
image, has recently attracted increasing interest due to its huge application
potential. Although many face-related fields (e.g., face recognition and face
detection) have been well studied for many years, the existing datasets for
face parsing remain severely limited in scale and quality; e.g., the widely
used Helen dataset contains only 2,330 images. This is mainly because
pixel-level annotation is costly and time-consuming, especially for facial
parts without clear boundaries. The lack of accurately annotated datasets has
become a major obstacle to progress on the face parsing task. One feasible
approach is to use dense facial landmarks to guide the parsing annotation.
However, annotating dense landmarks on human faces encounters the same issues
as parsing annotation. To overcome these
problems, in this paper, we develop a high-efficiency framework for face
parsing annotation, which considerably simplifies and speeds up the parsing
annotation via two consecutive modules. Benefiting from the proposed framework, we
construct a new Dense Landmark Guided Face Parsing (LaPa) benchmark. It
consists of 22,000 face images with large variations in expression, pose,
occlusion, etc. Each image is provided with an accurately annotated
11-category pixel-level label map along with the coordinates of 106-point
landmarks. To the best of our knowledge, it is currently the largest public
dataset for face parsing. To make full use of our LaPa dataset with abundant
face shape and boundary priors, we propose a simple yet effective
Boundary-Sensitive Parsing Network (BSPNet). Our network serves as a baseline
model on the proposed LaPa dataset and, meanwhile, achieves
state-of-the-art performance on the Helen dataset without resorting to extra
face alignment.
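One common way to exploit boundary priors in parsing is to up-weight the per-pixel loss near semantic boundaries. The sketch below is a hypothetical NumPy illustration of that generic idea; the function name `boundary_weighted_ce` and the weight factor `alpha` are made up here, and this is not a description of BSPNet's actual architecture.

```python
import numpy as np

def boundary_weighted_ce(logits, labels, boundary_map, alpha=2.0):
    """Illustrative boundary-sensitive cross-entropy (hypothetical sketch).

    Pixels on semantic boundaries (boundary_map == 1) are up-weighted by
    `alpha`, so the network pays more attention to hard boundary regions.

    logits:       (H, W, C) per-pixel class scores
    labels:       (H, W) integer ground-truth label map
    boundary_map: (H, W) binary mask marking boundary pixels
    """
    # per-pixel log-softmax over the class dimension
    logits = logits - logits.max(axis=-1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    h, w = labels.shape
    # negative log-likelihood of the ground-truth class at each pixel
    nll = -log_prob[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    # boundary pixels get weight alpha, interior pixels weight 1
    weights = 1.0 + (alpha - 1.0) * boundary_map
    return (weights * nll).sum() / weights.sum()
```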
Mis-classified Vector Guided Softmax Loss for Face Recognition
Face recognition has witnessed significant progress due to advances in deep
convolutional neural networks (CNNs), whose central task is to
improve feature discrimination. To this end, several margin-based
(\textit{e.g.}, angular, additive and additive angular margins) softmax loss
functions have been proposed to increase the feature margin between different
classes. However, despite these achievements, they mainly suffer
from three issues: 1) they ignore the importance of mining informative
features for discriminative learning; 2) they encourage a feature
margin only from the ground-truth class, without exploiting the
discriminability of other non-ground-truth classes; 3) the feature margin
between different classes is set to be the same and fixed, which may not
adapt well to all situations. To cope with these issues, this paper develops
a novel loss function,
which adaptively emphasizes the mis-classified feature vectors to guide the
discriminative feature learning. Thus we can address all the above issues and
achieve more discriminative face features. To the best of our knowledge, this
is the first attempt to inherit the advantages of feature margin and feature
mining into a unified loss function. Experimental results on several benchmarks
have demonstrated the effectiveness of our method over state-of-the-art
alternatives.

Comment: Accepted by AAAI 2020 as an oral presentation. arXiv admin note:
substantial text overlap with arXiv:1812.1131
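To make the re-weighting idea concrete, here is a minimal NumPy sketch of a margin-based softmax loss that emphasizes mis-classified (hard) non-target classes. The function name, the hyper-parameters `s`, `m`, `t`, and the exact re-weighting formula are illustrative assumptions under a plausible reading of the abstract, not necessarily the paper's formulation.

```python
import numpy as np

def mv_softmax_loss(features, weights, labels, s=32.0, m=0.35, t=1.2):
    """Sketch of a mis-classified-vector-guided softmax loss (assumed form).

    features: (N, D) L2-normalized embeddings
    weights:  (C, D) L2-normalized class weight vectors
    labels:   (N,) ground-truth class indices
    s: scale factor; m: additive margin on the target class;
    t: emphasis factor (> 1) for mis-classified non-target classes.
    """
    cos = features @ weights.T                 # (N, C) cosine logits
    n = np.arange(len(labels))
    target_cos = cos[n, labels] - m            # margin-adjusted target score
    # A non-target class is "mis-classified" when its cosine exceeds the
    # margin-adjusted target score; emphasize such classes by re-weighting.
    hard = cos > target_cos[:, None]
    hard[n, labels] = False
    logits = np.where(hard, t * cos + t - 1.0, cos)
    logits[n, labels] = target_cos
    logits *= s
    # standard cross-entropy over the re-weighted logits
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[n, labels].mean()
```

Because hard non-target classes get inflated logits, they contribute more to the gradient, which is one way to realize the "mining plus margin" unification the abstract describes.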