4,613 research outputs found
MOON: A Mixed Objective Optimization Network for the Recognition of Facial Attributes
Attribute recognition, particularly facial, extracts many labels for each
image. While some multi-task vision problems can be decomposed into separate
tasks and stages, e.g., training independent models for each task, for a
growing set of problems joint optimization across all tasks has been shown to
improve performance. We show that for deep convolutional neural network (DCNN)
facial attribute extraction, multi-task optimization is better. Unfortunately,
it can be difficult to apply joint optimization to DCNNs when training data is
imbalanced, and re-balancing multi-label data directly is structurally
infeasible, since adding/removing data to balance one label will change the
sampling of the other labels. This paper addresses the multi-label imbalance
problem by introducing a novel mixed objective optimization network (MOON) with
a loss function that mixes multiple task objectives with domain adaptive
re-weighting of propagated loss. Experiments demonstrate that not only does
MOON advance the state of the art in facial attribute recognition, but it also
outperforms independently trained DCNNs using the same data. When using facial
attributes for the LFW face recognition task, we show that our balanced (domain
adapted) network outperforms the unbalanced trained network.Comment: Post-print of manuscript accepted to the European Conference on
Computer Vision (ECCV) 2016
http://link.springer.com/chapter/10.1007%2F978-3-319-46454-1_
Joint Prediction of Depths, Normals and Surface Curvature from RGB Images using CNNs
Understanding the 3D structure of a scene is of vital importance, when it
comes to developing fully autonomous robots. To this end, we present a novel
deep learning based framework that estimates depth, surface normals and surface
curvature by only using a single RGB image. To the best of our knowledge this
is the first work to estimate surface curvature from colour using a machine
learning approach. Additionally, we demonstrate that by tuning the network to
infer well designed features, such as surface curvature, we can achieve
improved performance at estimating depth and normals.This indicates that
network guidance is still a useful aspect of designing and training a neural
network. We run extensive experiments where the network is trained to infer
different tasks while the model capacity is kept constant resulting in
different feature maps based on the tasks at hand. We outperform the previous
state-of-the-art benchmarks which jointly estimate depths and surface normals
while predicting surface curvature in parallel
Multi-View Face Recognition From Single RGBD Models of the Faces
This work takes important steps towards solving the following problem of current interest: Assuming that each individual in a population can be modeled by a single frontal RGBD face image, is it possible to carry out face recognition for such a population using multiple 2D images captured from arbitrary viewpoints? Although the general problem as stated above is extremely challenging, it encompasses subproblems that can be addressed today. The subproblems addressed in this work relate to: (1) Generating a large set of viewpoint dependent face images from a single RGBD frontal image for each individual; (2) using hierarchical approaches based on view-partitioned subspaces to represent the training data; and (3) based on these hierarchical approaches, using a weighted voting algorithm to integrate the evidence collected from multiple images of the same face as recorded from different viewpoints. We evaluate our methods on three datasets: a dataset of 10 people that we created and two publicly available datasets which include a total of 48 people. In addition to providing important insights into the nature of this problem, our results show that we are able to successfully recognize faces with accuracies of 95% or higher, outperforming existing state-of-the-art face recognition approaches based on deep convolutional neural networks
Object Detection in 20 Years: A Survey
Object detection, as of one the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc, and makes an in-deep
analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
Self-Paced Multi-Task Learning
In this paper, we propose a novel multi-task learning (MTL) framework, called
Self-Paced Multi-Task Learning (SPMTL). Different from previous works treating
all tasks and instances equally when training, SPMTL attempts to jointly learn
the tasks by taking into consideration the complexities of both tasks and
instances. This is inspired by the cognitive process of human brain that often
learns from the easy to the hard. We construct a compact SPMTL formulation by
proposing a new task-oriented regularizer that can jointly prioritize the tasks
and the instances. Thus it can be interpreted as a self-paced learner for MTL.
A simple yet effective algorithm is designed for optimizing the proposed
objective function. An error bound for a simplified formulation is also
analyzed theoretically. Experimental results on toy and real-world datasets
demonstrate the effectiveness of the proposed approach, compared to the
state-of-the-art methods
- …