A CNN cascade for landmark guided semantic part segmentation
This paper proposes a CNN cascade for semantic part segmentation guided by pose-specific information encoded in terms of a set of landmarks (or keypoints). There is a large amount of prior work on each of these tasks separately, yet, to the best of our knowledge, this is the first time in the literature that the interplay between pose estimation and semantic part segmentation is investigated. To address this limitation of prior work, we propose a CNN cascade of tasks that first performs landmark localisation and then uses this information as input for guiding semantic part segmentation. We applied our architecture to the problem of facial part segmentation and report a large performance improvement over the standard unguided network on the most challenging face datasets. Testing code and models will be published online at http://cs.nott.ac.uk/~psxasj/
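One plausible way to feed landmarks into a segmentation network is sketched below: each landmark is rendered as a 2D Gaussian heatmap and the heatmaps are stacked with the image channels. The abstract does not state how the landmarks are encoded, so the Gaussian rendering, the `sigma` value, and the function names `landmarks_to_heatmaps` and `build_guided_input` are assumptions made for illustration only.

```python
import numpy as np

def landmarks_to_heatmaps(landmarks, height, width, sigma=2.0):
    """Render one 2D Gaussian heatmap per (x, y) landmark; returns (K, H, W)."""
    ys, xs = np.mgrid[0:height, 0:width]          # pixel coordinate grids
    lm = np.asarray(landmarks, dtype=np.float64)  # shape (K, 2), columns x, y
    dx = xs[None, :, :] - lm[:, 0, None, None]    # horizontal offset to each landmark
    dy = ys[None, :, :] - lm[:, 1, None, None]    # vertical offset to each landmark
    return np.exp(-(dx ** 2 + dy ** 2) / (2.0 * sigma ** 2))

def build_guided_input(image, landmarks, sigma=2.0):
    """Stack image channels (C, H, W) with K landmark heatmaps -> (C + K, H, W)."""
    _, h, w = image.shape
    heatmaps = landmarks_to_heatmaps(landmarks, h, w, sigma)
    return np.concatenate([image, heatmaps], axis=0)

# Hypothetical example: a blank 32x32 RGB image with two landmarks.
image = np.zeros((3, 32, 32))
landmarks = [(10, 12), (20, 8)]
guided = build_guided_input(image, landmarks)
```

In this sketch the segmentation network would simply receive `guided` in place of the raw image, so the first convolution sees both appearance and landmark location.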
Bottom-Up and Top-Down Reasoning with Hierarchical Rectified Gaussians
Convolutional neural nets (CNNs) have demonstrated remarkable performance in
recent history. Such approaches tend to work in a unidirectional bottom-up
feed-forward fashion. However, practical experience and biological evidence
tell us that feedback plays a crucial role, particularly for detailed spatial
understanding tasks. This work explores bidirectional architectures that also
reason with top-down feedback: neural units are influenced by both lower and
higher-level units.
We do so by treating units as rectified latent variables in a quadratic
energy function, which can be seen as a hierarchical Rectified Gaussian model
(RG). We show that RGs can be optimized with a quadratic program (QP), which
can in turn be optimized with a recurrent neural network (with rectified linear
units). This allows RGs to be trained with GPU-optimized gradient descent. From
a theoretical perspective, RGs help establish a connection between CNNs and
hierarchical probabilistic models. From a practical perspective, RGs are well
suited for detailed spatial tasks that can benefit from top-down reasoning. We
illustrate them on the challenging task of keypoint localization under
occlusions, where local bottom-up evidence may be misleading. We demonstrate
state-of-the-art results on challenging benchmarks.
Comment: To appear in CVPR 201
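The link between a non-negative QP and a recurrent network of rectified linear units can be illustrated with a small sketch. It uses projected gradient descent, which is a stand-in chosen here and not necessarily the update scheme used in the paper: each iteration applies a ReLU to an affine function of the previous state, which has the form of one recurrent layer. The energy 0.5 * x^T A x - b^T x, the symmetric positive definite matrix `A`, and the name `rectified_qp_solve` are assumptions for illustration.

```python
import numpy as np

def rectified_qp_solve(A, b, step=None, iters=500):
    """Minimise 0.5 * x^T A x - b^T x subject to x >= 0.

    Assumes A is symmetric positive definite. Each iteration is a
    ReLU applied to an affine map of the previous state.
    """
    A = np.asarray(A, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if step is None:
        # 1 / largest eigenvalue keeps the gradient step stable.
        step = 1.0 / np.linalg.eigvalsh(A).max()
    x = np.zeros_like(b)
    for _ in range(iters):
        # Gradient step on the quadratic, then projection onto x >= 0 (a ReLU).
        x = np.maximum(0.0, x + step * (b - A @ x))
    return x

# Hypothetical two-unit example.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])
x_star = rectified_qp_solve(A, b)
```

Here the second unit is driven to zero by the rectification, and the first unit settles at the unconstrained optimum of its own one-dimensional problem.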
Two-stage Convolutional Part Heatmap Regression for the 1st 3D Face Alignment in the Wild (3DFAW) Challenge
This paper describes our submission to the 1st 3D Face Alignment in the Wild
(3DFAW) Challenge. Our method builds upon the idea of convolutional part
heatmap regression [1], extending it for 3D face alignment. Our method
decomposes the problem into two parts: (a) X,Y (2D) estimation and (b) Z
(depth) estimation. At the first stage, our method estimates the X,Y
coordinates of the facial landmarks by producing a set of 2D heatmaps, one for
each landmark, using convolutional part heatmap regression. Then, these
heatmaps, alongside the input RGB image, are used as input to a very deep
subnetwork trained via residual learning for regressing the Z coordinate. Our
method ranked 1st in the 3DFAW Challenge, surpassing the second best result by
more than 22%.
Comment: Winner of 3D Face Alignment in the Wild (3DFAW) Challenge, ECCV 201
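The data flow between the two stages can be sketched as follows: X,Y coordinates are read off each heatmap at its peak, and the heatmaps are concatenated with the RGB image to form the input of the depth stage. Peak decoding by argmax, the channel-first array layout, and the names `decode_heatmaps` and `depth_stage_input` are assumptions; the Z regression subnetwork itself is not reproduced here.

```python
import numpy as np

def decode_heatmaps(heatmaps):
    """Return a (K, 2) array of (x, y) peak locations for heatmaps shaped (K, H, W)."""
    k, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(k, -1).argmax(axis=1)  # peak index per landmark
    ys, xs = np.unravel_index(flat_idx, (h, w))        # back to row/column
    return np.stack([xs, ys], axis=1)

def depth_stage_input(rgb, heatmaps):
    """Concatenate RGB (3, H, W) and heatmaps (K, H, W) into a (3 + K, H, W) input."""
    return np.concatenate([rgb, heatmaps], axis=0)

# Hypothetical example: two landmarks on a 16x16 image.
heatmaps = np.zeros((2, 16, 16))
heatmaps[0, 4, 7] = 1.0   # landmark 0 at x=7, y=4
heatmaps[1, 9, 3] = 1.0   # landmark 1 at x=3, y=9
xy = decode_heatmaps(heatmaps)
stage2_in = depth_stage_input(np.zeros((3, 16, 16)), heatmaps)
```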