23,637 research outputs found
Face Attribute Prediction Using Off-the-Shelf CNN Features
Predicting attributes from face images in the wild is a challenging computer
vision problem. To automatically describe face attributes from face containing
images, traditionally one needs to cascade three technical blocks --- face
localization, facial descriptor construction, and attribute classification ---
in a pipeline. As a typical classification problem, face attribute prediction
has been addressed using deep learning. Current state-of-the-art performance
was achieved by using two cascaded Convolutional Neural Networks (CNNs), which
were specifically trained to learn face localization and attribute description.
In this paper, we experiment with an alternative way of employing the power of
deep representations from CNNs. Combining with conventional face localization
techniques, we use off-the-shelf architectures trained for face recognition to
build facial descriptors. Recognizing that the describable face attributes are
diverse, our face descriptors are constructed from different levels of the CNNs
for different attributes to best facilitate face attribute prediction.
Experiments on two large datasets, LFWA and CelebA, show that our approach is
entirely comparable to the state-of-the-art. Our findings not only demonstrate
an efficient face attribute prediction approach, but also raise an important
question: how to leverage the power of off-the-shelf CNN representations for
novel tasks.Comment: In proceeding of 2016 International Conference on Biometrics (ICB
Leveraging Mid-Level Deep Representations For Predicting Face Attributes in the Wild
Predicting facial attributes from faces in the wild is very challenging due
to pose and lighting variations in the real world. The key to this problem is
to build proper feature representations to cope with these unfavourable
conditions. Given the success of Convolutional Neural Network (CNN) in image
classification, the high-level CNN feature, as an intuitive and reasonable
choice, has been widely utilized for this problem. In this paper, however, we
consider the mid-level CNN features as an alternative to the high-level ones
for attribute prediction. This is based on the observation that face attributes
are different: some of them are locally oriented while others are globally
defined. Our investigations reveal that the mid-level deep representations
outperform the prediction accuracy achieved by the (fine-tuned) high-level
abstractions. We empirically demonstrate that the midlevel representations
achieve state-of-the-art prediction performance on CelebA and LFWA datasets.
Our investigations also show that by utilizing the mid-level representations
one can employ a single deep network to achieve both face recognition and
attribute prediction.Comment: In proceedings of 2016 International Conference on Image Processing
(ICIP
Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on n-Spheres
Many computer vision challenges require continuous outputs, but tend to be
solved by discrete classification. The reason is classification's natural
containment within a probability -simplex, as defined by the popular softmax
activation function. Regular regression lacks such a closed geometry, leading
to unstable training and convergence to suboptimal local minima. Starting from
this insight we revisit regression in convolutional neural networks. We observe
many continuous output problems in computer vision are naturally contained in
closed geometrical manifolds, like the Euler angles in viewpoint estimation or
the normals in surface normal estimation. A natural framework for posing such
continuous output problems are -spheres, which are naturally closed
geometric manifolds defined in the space. By introducing a
spherical exponential mapping on -spheres at the regression output, we
obtain well-behaved gradients, leading to stable training. We show how our
spherical regression can be utilized for several computer vision challenges,
specifically viewpoint estimation, surface normal estimation and 3D rotation
estimation. For all these problems our experiments demonstrate the benefit of
spherical regression. All paper resources are available at
https://github.com/leoshine/Spherical_Regression.Comment: CVPR 2019 camera read
A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
Traditional architectures for solving computer vision problems and the degree
of success they enjoyed have been heavily reliant on hand-crafted features.
However, of late, deep learning techniques have offered a compelling
alternative -- that of automatically learning problem-specific features. With
this new paradigm, every problem in computer vision is now being re-examined
from a deep learning perspective. Therefore, it has become important to
understand what kind of deep networks are suitable for a given problem.
Although general surveys of this fast-moving paradigm (i.e. deep-networks)
exist, a survey specific to computer vision is missing. We specifically
consider one form of deep networks widely used in computer vision -
convolutional neural networks (CNNs). We start with "AlexNet" as our base CNN
and then examine the broad variations proposed over time to suit different
applications. We hope that our recipe-style survey will serve as a guide,
particularly for novice practitioners intending to use deep-learning techniques
for computer vision.Comment: Published in Frontiers in Robotics and AI (http://goo.gl/6691Bm
- …