Multilinear Wavelets: A Statistical Shape Space for Human Faces
We present a statistical model for 3D human faces in varying expression,
which decomposes the surface of the face using a wavelet transform, and learns
many localized, decorrelated multilinear models on the resulting coefficients.
Using this model we are able to reconstruct faces from noisy and occluded 3D
face scans, and facial motion sequences. Accurate reconstruction of face shape
is important for applications such as tele-presence and gaming. The localized
and multi-scale nature of our model allows for recovery of fine-scale detail
while retaining robustness to severe noise and occlusion, and is
computationally efficient and scalable. We validate these properties
experimentally on challenging data in the form of static scans and motion
sequences. We show that in comparison to a global multilinear model, our model
better preserves fine detail and is computationally faster, while in comparison
to a localized PCA model, our model better handles variation in expression, is
faster, and allows us to fix identity parameters for a given subject.
Comment: 10 pages, 7 figures; accepted to ECCV 2014
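A minimal sketch of the core idea, not the authors' implementation: wavelet-transform registered face geometry, then fit one small identity-by-expression multilinear model per localized block of coefficients. It assumes faces are given as geometry images of shape (H, W, 3) in dense correspondence, uses PyWavelets as a stand-in for the subdivision wavelets used on meshes in the paper, and fits each local model with a truncated HOSVD; all names and sizes below are illustrative.

import numpy as np
import pywt

def wavelet_coeffs(geom_img, wavelet="haar", level=2):
    # Stack the 2D wavelet coefficients of each coordinate channel into one vector.
    chans = []
    for c in range(3):
        coeffs = pywt.wavedec2(geom_img[:, :, c], wavelet, level=level)
        arr, _ = pywt.coeffs_to_array(coeffs)
        chans.append(arr.ravel())
    return np.concatenate(chans)

def fit_local_multilinear(data, block, r_id=5, r_ex=3):
    # data: (n_id, n_ex, D) wavelet coefficients of all training faces;
    # block: index array selecting one localized group of coefficients.
    # Returns mean and truncated mode bases plus core (HOSVD) for that block.
    T = data[:, :, block]                      # (n_id, n_ex, d) local tensor
    mean = T.mean(axis=(0, 1))
    Tc = T - mean
    # Truncated bases of the identity- and expression-mode unfoldings.
    U_id, _, _ = np.linalg.svd(Tc.reshape(Tc.shape[0], -1), full_matrices=False)
    U_ex, _, _ = np.linalg.svd(Tc.transpose(1, 0, 2).reshape(Tc.shape[1], -1),
                               full_matrices=False)
    U_id, U_ex = U_id[:, :r_id], U_ex[:, :r_ex]
    # Core tensor: project the centered data onto both mode bases.
    core = np.einsum("iek,ia,eb->abk", Tc, U_id, U_ex)
    return mean, U_id, U_ex, core

Because each model only sees one block of coefficients, fitting and evaluation stay local and cheap, which is what gives the localized model its robustness to occlusion and its scalability.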
Grid Loss: Detecting Occluded Faces
Detection of partially occluded objects is a challenging computer vision
problem. Standard Convolutional Neural Network (CNN) detectors fail if parts of
the detection window are occluded, since not every sub-part of the window is
discriminative on its own. To address this issue, we propose a novel loss layer
for CNNs, named grid loss, which minimizes the error rate on sub-blocks of a
convolution layer independently rather than over the whole feature map. This
results in parts being more discriminative on their own, enabling the detector
to recover if the detection window is partially occluded. By mapping our loss
layer back to a regular fully connected layer, no additional computational cost
is incurred at runtime compared to standard CNNs. We demonstrate our method for
face detection on several public face detection benchmarks and show that our
method outperforms regular CNNs, is suitable for realtime applications and
achieves state-of-the-art performance.
Comment: accepted to ECCV 2016
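An illustrative PyTorch sketch of a grid-style loss, under simplifying assumptions rather than the paper's exact formulation: the conv feature map is split into non-overlapping spatial blocks, each block gets its own linear part classifier trained with its own hinge loss, and the holistic score is the sum of the block scores, so at test time the parts collapse into a single linear layer. Class name, grid size, and dimensions are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GridStyleLoss(nn.Module):
    def __init__(self, channels, height, width, grid=2):
        super().__init__()
        assert height % grid == 0 and width % grid == 0
        self.grid = grid
        block_dim = channels * (height // grid) * (width // grid)
        # One linear part classifier per spatial block.
        self.parts = nn.ModuleList(
            [nn.Linear(block_dim, 1) for _ in range(grid * grid)]
        )

    def forward(self, feat, labels):
        # feat: (N, C, H, W) conv features; labels: (N,) floats in {-1, +1}.
        n, c, h, w = feat.shape
        bh, bw = h // self.grid, w // self.grid
        part_scores, loss = [], feat.new_zeros(())
        for i in range(self.grid):
            for j in range(self.grid):
                block = feat[:, :, i*bh:(i+1)*bh, j*bw:(j+1)*bw].reshape(n, -1)
                s = self.parts[i * self.grid + j](block).squeeze(1)
                part_scores.append(s)
                loss = loss + F.relu(1.0 - labels * s).mean()   # per-block hinge
        holistic = torch.stack(part_scores, dim=0).sum(dim=0)
        loss = loss + F.relu(1.0 - labels * holistic).mean()    # holistic hinge
        return loss, holistic

Training each block on its own loss forces every sub-region to be discriminative by itself, while summing the block scores keeps inference identical in cost to an ordinary fully connected classifier.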
Multi-View Priors for Learning Detectors from Sparse Viewpoint Data
While the majority of today's object class models provide only 2D bounding
boxes, far richer output hypotheses are desirable including viewpoint,
fine-grained category, and 3D geometry estimate. However, models trained to
provide richer output require larger amounts of training data, preferably well
covering the relevant aspects such as viewpoint and fine-grained categories. In
this paper, we address this issue from the perspective of transfer learning,
and design an object class model that explicitly leverages correlations between
visual features. Specifically, our model represents prior distributions over
permissible multi-view detectors in a parametric way -- the priors are learned
once from training data of a source object class, and can later be used to
facilitate the learning of a detector for a target class. As we show in our
experiments, this transfer is not only beneficial for detectors based on
basic-level category representations, but also enables the robust learning of
detectors that represent classes at finer levels of granularity, where training
data is typically even scarcer and more unbalanced. As a result, we report
largely improved performance in simultaneous 2D object localization and
viewpoint estimation on a recent dataset of challenging street scenes.
Comment: 13 pages, 7 figures, 4 tables; International Conference on Learning Representations 2014
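A minimal numpy sketch of the transfer idea under strong simplifications: fit a parametric (here, diagonal Gaussian) prior over detector weights per viewpoint from source-class detectors, then use it as a MAP regularizer when training the target-class detector. The paper works with richer multi-view detector models and priors; the function names and the ridge-regression stand-in below are illustrative assumptions.

import numpy as np

def fit_viewpoint_prior(source_templates):
    # source_templates: dict viewpoint -> (n_detectors, d) array of learned weights.
    # Returns a simple Gaussian prior (mean, diagonal variance) per viewpoint.
    prior = {}
    for view, W in source_templates.items():
        prior[view] = (W.mean(axis=0), W.var(axis=0) + 1e-6)
    return prior

def train_with_prior(X, y, prior_mean, prior_var, lam=1.0):
    # MAP detector for one viewpoint of the target class:
    # minimize ||Xw - y||^2 + lam * sum_k (w_k - mu_k)^2 / var_k,
    # i.e. the prior pulls the new detector toward the source-class templates.
    A = X.T @ X + lam * np.diag(1.0 / prior_var)
    b = X.T @ y + lam * (prior_mean / prior_var)
    return np.linalg.solve(A, b)

When the target class has only a handful of examples per viewpoint, the prior term dominates and the detector inherits the source class's multi-view structure; with more data the data term takes over, which is the behavior the abstract describes for sparse and unbalanced training sets.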
PASCAL VOC Challenge “Lifetime Achievement” Prize 2010
Outstanding Reviewer Award CVPR 201