1,832 research outputs found
Occlusion Coherence: Detecting and Localizing Occluded Faces
The presence of occluders significantly impacts object recognition accuracy.
However, occlusion is typically treated as an unstructured source of noise and
explicit models for occluders have lagged behind those for object appearance
and shape. In this paper we describe a hierarchical deformable part model for
face detection and landmark localization that explicitly models part occlusion.
The proposed model structure makes it possible to augment positive training
data with large numbers of synthetically occluded instances. This allows us to
easily incorporate the statistics of occlusion patterns in a discriminatively
trained model. We test the model on several benchmarks for landmark
localization and detection including challenging new data sets featuring
significant occlusion. We find that the addition of an explicit occlusion model
yields a detection system that outperforms existing approaches for occluded
instances while maintaining competitive accuracy in detection and landmark
localization for unoccluded instances
Hashmod: A Hashing Method for Scalable 3D Object Detection
We present a scalable method for detecting objects and estimating their 3D
poses in RGB-D data. To this end, we rely on an efficient representation of
object views and employ hashing techniques to match these views against the
input frame in a scalable way. While a similar approach already exists for 2D
detection, we show how to extend it to estimate the 3D pose of the detected
objects. In particular, we explore different hashing strategies and identify
the one which is more suitable to our problem. We show empirically that the
complexity of our method is sublinear with the number of objects and we enable
detection and pose estimation of many 3D objects with high accuracy while
outperforming the state-of-the-art in terms of runtime.Comment: BMVC 201
Spotlight the Negatives: A Generalized Discriminative Latent Model
Discriminative latent variable models (LVM) are frequently applied to various
visual recognition tasks. In these systems the latent (hidden) variables
provide a formalism for modeling structured variation of visual features.
Conventionally, latent variables are de- fined on the variation of the
foreground (positive) class. In this work we augment LVMs to include negative
latent variables corresponding to the background class. We formalize the
scoring function of such a generalized LVM (GLVM). Then we discuss a framework
for learning a model based on the GLVM scoring function. We theoretically
showcase how some of the current visual recognition methods can benefit from
this generalization. Finally, we experiment on a generalized form of Deformable
Part Models with negative latent variables and show significant improvements on
two different detection tasks.Comment: Published in proceedings of BMVC 201
Grid Loss: Detecting Occluded Faces
Detection of partially occluded objects is a challenging computer vision
problem. Standard Convolutional Neural Network (CNN) detectors fail if parts of
the detection window are occluded, since not every sub-part of the window is
discriminative on its own. To address this issue, we propose a novel loss layer
for CNNs, named grid loss, which minimizes the error rate on sub-blocks of a
convolution layer independently rather than over the whole feature map. This
results in parts being more discriminative on their own, enabling the detector
to recover if the detection window is partially occluded. By mapping our loss
layer back to a regular fully connected layer, no additional computational cost
is incurred at runtime compared to standard CNNs. We demonstrate our method for
face detection on several public face detection benchmarks and show that our
method outperforms regular CNNs, is suitable for realtime applications and
achieves state-of-the-art performance.Comment: accepted to ECCV 201
- …