12,764 research outputs found
Human Pose Estimation using Deep Consensus Voting
In this paper we consider the problem of human pose estimation from a single
still image. We propose a novel approach where each location in the image votes
for the position of each keypoint using a convolutional neural net. The voting
scheme allows us to utilize information from the whole image, rather than rely
on a sparse set of keypoint locations. Using dense, multi-target votes, not
only produces good keypoint predictions, but also enables us to compute
image-dependent joint keypoint probabilities by looking at consensus voting.
This differs from most previous methods where joint probabilities are learned
from relative keypoint locations and are independent of the image. We finally
combine the keypoints votes and joint probabilities in order to identify the
optimal pose configuration. We show our competitive performance on the MPII
Human Pose and Leeds Sports Pose datasets
Mass Displacement Networks
Despite the large improvements in performance attained by using deep learning
in computer vision, one can often further improve results with some additional
post-processing that exploits the geometric nature of the underlying task. This
commonly involves displacing the posterior distribution of a CNN in a way that
makes it more appropriate for the task at hand, e.g. better aligned with local
image features, or more compact. In this work we integrate this geometric
post-processing within a deep architecture, introducing a differentiable and
probabilistically sound counterpart to the common geometric voting technique
used for evidence accumulation in vision. We refer to the resulting neural
models as Mass Displacement Networks (MDNs), and apply them to human pose
estimation in two distinct setups: (a) landmark localization, where we collapse
a distribution to a point, allowing for precise localization of body keypoints
and (b) communication across body parts, where we transfer evidence from one
part to the other, allowing for a globally consistent pose estimate. We
evaluate on large-scale pose estimation benchmarks, such as MPII Human Pose and
COCO datasets, and report systematic improvements when compared to strong
baselines.Comment: 12 pages, 4 figure
Multi-View Face Recognition From Single RGBD Models of the Faces
This work takes important steps towards solving the following problem of current interest: Assuming that each individual in a population can be modeled by a single frontal RGBD face image, is it possible to carry out face recognition for such a population using multiple 2D images captured from arbitrary viewpoints? Although the general problem as stated above is extremely challenging, it encompasses subproblems that can be addressed today. The subproblems addressed in this work relate to: (1) Generating a large set of viewpoint dependent face images from a single RGBD frontal image for each individual; (2) using hierarchical approaches based on view-partitioned subspaces to represent the training data; and (3) based on these hierarchical approaches, using a weighted voting algorithm to integrate the evidence collected from multiple images of the same face as recorded from different viewpoints. We evaluate our methods on three datasets: a dataset of 10 people that we created and two publicly available datasets which include a total of 48 people. In addition to providing important insights into the nature of this problem, our results show that we are able to successfully recognize faces with accuracies of 95% or higher, outperforming existing state-of-the-art face recognition approaches based on deep convolutional neural networks
Face Alignment Assisted by Head Pose Estimation
In this paper we propose a supervised initialization scheme for cascaded face
alignment based on explicit head pose estimation. We first investigate the
failure cases of most state of the art face alignment approaches and observe
that these failures often share one common global property, i.e. the head pose
variation is usually large. Inspired by this, we propose a deep convolutional
network model for reliable and accurate head pose estimation. Instead of using
a mean face shape, or randomly selected shapes for cascaded face alignment
initialisation, we propose two schemes for generating initialisation: the first
one relies on projecting a mean 3D face shape (represented by 3D facial
landmarks) onto 2D image under the estimated head pose; the second one searches
nearest neighbour shapes from the training set according to head pose distance.
By doing so, the initialisation gets closer to the actual shape, which enhances
the possibility of convergence and in turn improves the face alignment
performance. We demonstrate the proposed method on the benchmark 300W dataset
and show very competitive performance in both head pose estimation and face
alignment.Comment: Accepted by BMVC201
PifPaf: Composite Fields for Human Pose Estimation
We propose a new bottom-up method for multi-person 2D human pose estimation
that is particularly well suited for urban mobility such as self-driving cars
and delivery robots. The new method, PifPaf, uses a Part Intensity Field (PIF)
to localize body parts and a Part Association Field (PAF) to associate body
parts with each other to form full human poses. Our method outperforms previous
methods at low resolution and in crowded, cluttered and occluded scenes thanks
to (i) our new composite field PAF encoding fine-grained information and (ii)
the choice of Laplace loss for regressions which incorporates a notion of
uncertainty. Our architecture is based on a fully convolutional, single-shot,
box-free design. We perform on par with the existing state-of-the-art bottom-up
method on the standard COCO keypoint task and produce state-of-the-art results
on a modified COCO keypoint task for the transportation domain.Comment: CVPR 201
- …