Towards Highly Accurate and Stable Face Alignment for High-Resolution Videos
In recent years, heatmap regression based models have shown their
effectiveness in face alignment and pose estimation. However, Conventional
Heatmap Regression (CHR) is neither accurate nor stable when dealing with
high-resolution facial videos, since it finds the maximum activated location in
heatmaps which are generated from rounding coordinates, and thus leads to
quantization errors when scaling back to the original high-resolution space. In
this paper, we propose a Fractional Heatmap Regression (FHR) for
high-resolution video-based face alignment. The proposed FHR can accurately
estimate the fractional part according to the 2D Gaussian function by sampling
three points in heatmaps. To further stabilize the landmarks among continuous
video frames while maintaining precision, we propose a novel
stabilization loss that contains two terms to address time delay and non-smooth
issues, respectively. Experiments on 300W, 300-VW and Talking Face datasets
clearly demonstrate that the proposed method is more accurate and stable than
the state-of-the-art models.
Comment: Accepted to AAAI 2019. 8 pages, 7 figures
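The fractional estimate described in this abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes an ideal isotropic 2D Gaussian heatmap with a known standard deviation `sigma`, and the function names `gaussian_heatmap` and `fractional_peak` are hypothetical. Under that assumption, the log-ratio of two neighbouring heatmap values is linear in the unknown centre, so three samples (the integer peak plus its right and lower neighbours) are enough to recover both fractional coordinates.

```python
import numpy as np


def gaussian_heatmap(size, center, sigma):
    """Render an ideal isotropic Gaussian heatmap (illustrative helper)."""
    ys, xs = np.mgrid[0:size, 0:size]
    cx, cy = center
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def fractional_peak(heatmap, sigma):
    """Estimate a sub-pixel peak (x, y) from three heatmap samples.

    Assumes the heatmap is an isotropic Gaussian with known sigma, so that
    log h(x0 + 1, y0) - log h(x0, y0) = -(2 * (x0 - mx) + 1) / (2 * sigma^2),
    which can be solved for the centre coordinate mx (and likewise for my).
    """
    y0, x0 = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    # Keep the right and lower neighbours inside the heatmap.
    x0 = int(min(x0, heatmap.shape[1] - 2))
    y0 = int(min(y0, heatmap.shape[0] - 2))

    # Three sampled points; clip to avoid log(0) on real, noisy heatmaps.
    h0 = max(float(heatmap[y0, x0]), 1e-12)
    hx = max(float(heatmap[y0, x0 + 1]), 1e-12)
    hy = max(float(heatmap[y0 + 1, x0]), 1e-12)

    mx = x0 + 0.5 + sigma ** 2 * (np.log(hx) - np.log(h0))
    my = y0 + 0.5 + sigma ** 2 * (np.log(hy) - np.log(h0))
    return mx, my


if __name__ == "__main__":
    hm = gaussian_heatmap(64, center=(12.3, 20.7), sigma=2.0)
    print("integer argmax (row, col):", np.unravel_index(np.argmax(hm), hm.shape))
    print("fractional estimate (x, y):", fractional_peak(hm, sigma=2.0))
```

On a noise-free heatmap the estimate recovers the true centre up to floating-point error, whereas the integer argmax is off by the rounding residual; predicted heatmaps from a real network are noisy, so the recovery would only be approximate there.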
Towards a conceptualization of casual protest participation: Parsing a case from the Save Roşia Montană campaign
There is currently an empirical gap in the literature on protest participation in liberal democracies, which has overwhelmingly focused on Western Europe and North America at the expense of Eastern Europe. To contribute to closing that gap, this article reviews findings from a multi-method field study conducted at FânFest, the environmental protest festival designed to boost participation in Save Roşia Montană, the most prominent environmental campaign in Romania. In contrast to its Western counterparts, Romania has seen markedly lower levels of involvement in voluntary organizations, which are a key setting for mobilization into collective action. Concurrently, experience with participation in physical protests is limited amongst Romanians. Specifically, the article probes recent indications that social network sites provide new impetus to protest participation as an instrumental means of mobilization. Dwelling on a distinction between experienced protesters and newcomers to protest, results indicate that social network site usage may make possible the casual participation of individuals with prior protest experience who are not activists in a voluntary organization. Whilst this finding may signal a new participatory mode hinging on digitally networked communication, which is beginning to be theorized, it confounds expectations pertaining to a net contribution of social network site usage to the participation of newcomers to protest.
A survey of face recognition techniques under occlusion
The limited capacity to recognize faces under occlusions is a long-standing
problem that presents a unique challenge for face recognition systems and even
for humans. The problem regarding occlusion is less covered by research when
compared to other challenges such as pose variation, different expressions,
etc. Nevertheless, occluded face recognition is imperative to exploit the full
potential of face recognition for real-world applications. In this paper, we
restrict the scope to occluded face recognition. First, we explore what the
occlusion problem is and what inherent difficulties can arise. As a part of
this review, we introduce face detection under occlusion, a preliminary step in
face recognition. Second, we present how existing face recognition methods cope
with the occlusion problem and classify them into three categories, which are
1) occlusion robust feature extraction approaches, 2) occlusion aware face
recognition approaches, and 3) occlusion recovery based face recognition
approaches. Furthermore, we analyze the motivations, innovations, pros and
cons, and the performance of representative approaches for comparison. Finally,
future challenges and method trends of occluded face recognition are thoroughly
discussed
ROBUST FACIAL LANDMARKS LOCALIZATION WITH APPLICATIONS IN FACIAL BIOMETRICS
Localization of regions of interest in images and videos is a well-studied problem
in the computer vision community. Usually, localization tasks imply localization of
objects in a given image, such as detection and segmentation of objects in images.
However, the regions of interest can be limited to a single pixel, as in the task of
facial landmark localization or human pose estimation. This dissertation studies robust
facial landmark detection algorithms for faces in the wild using learning methods
based on Convolutional Neural Networks.
Detection of specific keypoints on face images is an integral pre-processing step
in facial biometrics and numerous other applications, including face verification and
identification. Detecting keypoints allows face images to be aligned to a canonical
coordinate system using geometric transforms such as similarity or affine transformations,
mitigating the adverse effects of rotation and scaling. This challenging problem has
become more attractive in recent years as a result of advances in deep learning and the
release of more unconstrained datasets. The research community is pushing boundaries
to achieve better and better performance on unconstrained images, where the
images are diverse in pose, expression and lighting conditions.
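The alignment step mentioned in this paragraph can be sketched as a least-squares similarity fit between detected landmarks and a canonical template. This is an illustrative example only, not the dissertation's code: the function names `estimate_similarity` and `apply_similarity`, and the five-point landmark layout in the demo, are assumptions. The fit follows the standard SVD-based closed form for scale, rotation and translation.

```python
import numpy as np


def estimate_similarity(src, dst):
    """Least-squares similarity transform mapping src landmarks onto dst.

    Returns (scale, R, t) such that dst ~= scale * src @ R.T + t,
    where R is a 2x2 rotation matrix and t a 2-vector.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    s, d = src - mu_s, dst - mu_d

    # Cross-covariance between the centred point sets.
    cov = d.T @ s / len(src)
    U, S, Vt = np.linalg.svd(cov)

    # Guard against a reflection so that R is a proper rotation.
    D = np.eye(2)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[1, 1] = -1.0

    R = U @ D @ Vt
    var_s = (s ** 2).sum() / len(src)
    scale = (S * np.diag(D)).sum() / var_s
    t = mu_d - scale * R @ mu_s
    return scale, R, t


def apply_similarity(points, scale, R, t):
    """Map points (N x 2) with the estimated transform."""
    return scale * np.asarray(points, dtype=float) @ R.T + t


if __name__ == "__main__":
    # Hypothetical five-point layout: eyes, nose tip, mouth corners.
    template = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 60.0],
                         [35.0, 80.0], [65.0, 80.0]])
    theta = np.deg2rad(30.0)
    R_true = np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta), np.cos(theta)]])
    detected = 1.5 * template @ R_true.T + np.array([10.0, -5.0])

    scale, R, t = estimate_similarity(detected, template)
    aligned = apply_similarity(detected, scale, R, t)
    print("max alignment residual:", np.abs(aligned - template).max())
```

In practice the same transform would be used to warp the image itself into the canonical frame; an affine fit would add shear and non-uniform scale at the cost of being less constrained.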
Over the years, researchers have developed various hand-crafted techniques
to extract meaningful features from images, most of them being appearance- and
geometry-based features. However, these features do not perform well for data
collected in unconstrained settings due to large variations in appearance and other
nuisance factors. Convolutional Neural Networks (CNNs) have become prominent because
of their ability to extract discriminating features. Unlike hand-crafted features,
deep CNNs (DCNNs) perform feature extraction and feature classification from the data itself in
an end-to-end fashion. This enables the DCNNs to be robust to variations present
in the data and at the same time improve their discriminative ability.
In this dissertation, we discuss three different methods for facial keypoint
detection based on Convolutional Neural Networks. The methods are generic and can be
extended to the related problem of keypoint detection for human pose estimation. The
first method, called Cascaded Local Deep Descriptor Regression, uses deep features
extracted around local points to learn linear regressors for incrementally correcting the
initial estimate of the keypoints. In the second method, called KEPLER, we develop
efficient Heatmap CNNs to directly learn the non-linear mapping between the input
and target spaces. We also apply different regularization techniques to tackle the
effects of imbalanced data and vanishing gradients. In the third method, we model
the spatial correlation between different keypoints using Pose Conditioned
Convolution Deconvolution Networks (PCD-CNN) while at the same time making it pose
agnostic by disentangling pose from the face image. Next, we show an application
of facial landmark localization used to align the face images for the task of apparent
age estimation of humans from unconstrained images.
In the fourth part of this dissertation we discuss the impact of good quality
landmarks on the task of face verification. Previously proposed methods perform
with reasonable accuracy on high resolution and good quality images, but fail when
the input image suffers from degradation. To this end, we propose a semi-supervised
method which aims at predicting landmarks in the low quality images. This method
learns to predict landmarks in low resolution images by learning to model the learning
process of high resolution images. In this algorithm, we use Generative Adversarial
Networks, which first learn to model the distribution of real low resolution images
after which another CNN learns to model the distribution of heatmaps on the images.
Additionally, we also propose another high quality facial landmark detection method,
which is currently state of the art.
Finally, we also discuss the extension of ideas developed for facial keypoint
localization for the task of human pose estimation, which is one of the important
cues for Human Activity Recognition. As in PCD-CNN, the parts of the human body
can also be modelled in a tree structure, where the relationships between these parts are
learnt through convolutions while being conditioned on the 3D pose and orientation.
Another interesting avenue for research is extending facial landmark localization to
naturally degraded images
The doctoral research abstracts. Vol:7 2015 / Institute of Graduate Studies, UiTM
Foreword:
The Seventh Issue of The Doctoral Research Abstracts captures the novelty of
65 doctorates receiving their scrolls in UiTM’s 82nd Convocation in the fields of
Science and Technology, Business and Administration, and Social Science and
Humanities. To the recipients I would like to say that you have most certainly
done UiTM proud by journeying through the scholastic path with its endless
challenges and impediments, and persevering right till the very end.
This convocation should not be regarded as the end of your highest scholarly
achievement and contribution to the body of knowledge but rather as the
beginning of embarking into high impact innovative research for the
community and country from knowledge gained during this academic
journey.
As alumni of UiTM, we will always hold you dear to our hearts. A new
‘handshake’ is about to take place between you and UiTM as joint
collaborators in future research undertakings. I envision a strong
research pact between you as our alumni and UiTM in breaking the
frontier of knowledge through research.
I wish you all the best in your endeavour and may I offer my
congratulations to all the graduands. ‘UiTM sentiasa dihati ku’ (‘UiTM is always in my heart’)
Tan Sri Dato’ Sri Prof Ir Dr Sahol Hamid Abu Bakar , FASc, PEng
Vice Chancellor
Universiti Teknologi MAR
Towards Universal Object Detection
Object detection is one of the most important and challenging research topics in computer vision. It plays an important role in everyday life and has many applications, e.g. surveillance, autonomous driving, robotics, drones, medical imaging, etc. The ultimate goal of object detection is a universal object detector that works well in any case and under any condition, like the human visual system. However, there are multiple challenges to the universality of object detection, e.g. scale variance, high-quality requirements, domain shift, computational constraints, etc. These prevent object detectors from being widely used for objects of various scales, critical applications requiring extremely accurate localization, scenarios with changing domain priors, and diverse hardware settings. To address these challenges, multiple solutions are proposed in this thesis. These include an efficient multi-scale architecture to achieve scale-invariant detection, a robust multi-stage framework effective for high-quality requirements, a cross-domain solution to extend universality over various domains, and a design of complexity-aware cascades and a novel low-precision network to enhance universality under different computational constraints. All these efforts have substantially improved the universality of object detection, and the advanced object detector can be applied to broader environments.
Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work
Inspired by the fact that human brains can emphasize discriminative parts of
the input and suppress irrelevant ones, substantial local mechanisms have been
designed to boost the development of computer vision. They can not only focus
on target parts to learn discriminative local representations, but also process
information selectively to improve the efficiency. In terms of application
scenarios and paradigms, local mechanisms have different characteristics. In
this survey, we provide a systematic review of local mechanisms for various
computer vision tasks and approaches, including fine-grained visual
recognition, person re-identification, few-/zero-shot learning, multi-modal
learning, self-supervised learning, Vision Transformers, and so on.
Categorization of local mechanisms in each field is summarized. Then,
advantages and disadvantages for every category are analyzed deeply, leaving
room for exploration. Finally, future research directions for local
mechanisms that may benefit future work are also discussed. To the best of
our knowledge, this is the first survey about local mechanisms in computer
vision. We hope that this survey can shed light on future research in the
computer vision field
Reinforced Learning for Label-Efficient 3D Face Reconstruction
3D face reconstruction plays a major role in many human-robot interaction systems, from automatic face authentication to human-computer interface-based entertainment. To improve robustness against occlusions and noise, 3D face reconstruction networks are often trained on a set of in-the-wild face images preferably captured along different viewpoints of the subject. However, collecting the required large amounts of 3D annotated face data is expensive and time-consuming. To address the high annotation cost and due to the importance of training on a useful set, we propose an Active Learning (AL) framework that actively selects the most informative and representative samples to be labeled. To the best of our knowledge, this paper is the first work on tackling active learning for 3D face reconstruction to enable a label-efficient training strategy. In particular, we propose a Reinforcement Active Learning approach in conjunction with a clustering-based pooling strategy to select informative view-points of the subjects. Experimental results on 300W-LP and AFLW2000 datasets demonstrate that our proposed method is able to 1) efficiently select the most influencing view-points for labeling and outperforms several baseline AL techniques and 2) further improve the performance of a 3D Face Reconstruction network trained on the full dataset