9,848 research outputs found
Occlusion Coherence: Detecting and Localizing Occluded Faces
The presence of occluders significantly impacts object recognition accuracy.
However, occlusion is typically treated as an unstructured source of noise and
explicit models for occluders have lagged behind those for object appearance
and shape. In this paper we describe a hierarchical deformable part model for
face detection and landmark localization that explicitly models part occlusion.
The proposed model structure makes it possible to augment positive training
data with large numbers of synthetically occluded instances. This allows us to
easily incorporate the statistics of occlusion patterns in a discriminatively
trained model. We test the model on several benchmarks for landmark
localization and detection including challenging new data sets featuring
significant occlusion. We find that the addition of an explicit occlusion model
yields a detection system that outperforms existing approaches for occluded
instances while maintaining competitive accuracy in detection and landmark
localization for unoccluded instances
Deformable Part-based Fully Convolutional Network for Object Detection
Existing region-based object detectors are limited to regions with fixed box
geometry to represent objects, even if those are highly non-rectangular. In
this paper we introduce DP-FCN, a deep model for object detection which
explicitly adapts to shapes of objects with deformable parts. Without
additional annotations, it learns to focus on discriminative elements and to
align them, and simultaneously brings more invariance for classification and
geometric information to refine localization. DP-FCN is composed of three main
modules: a Fully Convolutional Network to efficiently maintain spatial
resolution, a deformable part-based RoI pooling layer to optimize positions of
parts and build invariance, and a deformation-aware localization module
explicitly exploiting displacements of parts to improve accuracy of bounding
box regression. We experimentally validate our model and show significant
gains. DP-FCN achieves state-of-the-art performances of 83.1% and 80.9% on
PASCAL VOC 2007 and 2012 with VOC data only.Comment: Accepted to BMVC 2017 (oral
Broadcasting Convolutional Network for Visual Relational Reasoning
In this paper, we propose the Broadcasting Convolutional Network (BCN) that
extracts key object features from the global field of an entire input image and
recognizes their relationship with local features. BCN is a simple network
module that collects effective spatial features, embeds location information
and broadcasts them to the entire feature maps. We further introduce the
Multi-Relational Network (multiRN) that improves the existing Relation Network
(RN) by utilizing the BCN module. In pixel-based relation reasoning problems,
with the help of BCN, multiRN extends the concept of `pairwise relations' in
conventional RNs to `multiwise relations' by relating each object with multiple
objects at once. This yields in O(n) complexity for n objects, which is a vast
computational gain from RNs that take O(n^2). Through experiments, multiRN has
achieved a state-of-the-art performance on CLEVR dataset, which proves the
usability of BCN on relation reasoning problems.Comment: Accepted paper at ECCV 2018. 24 page
- …