Occlusion Coherence: Detecting and Localizing Occluded Faces
The presence of occluders significantly impacts object recognition accuracy.
However, occlusion is typically treated as an unstructured source of noise and
explicit models for occluders have lagged behind those for object appearance
and shape. In this paper we describe a hierarchical deformable part model for
face detection and landmark localization that explicitly models part occlusion.
The proposed model structure makes it possible to augment positive training
data with large numbers of synthetically occluded instances. This allows us to
easily incorporate the statistics of occlusion patterns in a discriminatively
trained model. We test the model on several benchmarks for landmark
localization and detection including challenging new data sets featuring
significant occlusion. We find that the addition of an explicit occlusion model
yields a detection system that outperforms existing approaches for occluded
instances while maintaining competitive accuracy in detection and landmark
localization for unoccluded instances.
Deep Regionlets for Object Detection
In this paper, we propose a novel object detection framework named "Deep
Regionlets" by establishing a bridge between deep neural networks and
conventional detection schema for accurate generic object detection. Motivated
by the abilities of regionlets for modeling object deformation and multiple
aspect ratios, we incorporate regionlets into an end-to-end trainable deep
learning framework. The deep regionlets framework consists of a region
selection network and a deep regionlet learning module. Specifically, given a
detection bounding box proposal, the region selection network provides guidance
on where to select regions to learn the features from. The regionlet learning
module focuses on local feature selection and transformation to alleviate local
variations. To this end, we first realize non-rectangular region selection
within the detection framework to accommodate variations in object appearance.
Moreover, we design a "gating network" within the regionlet learning module to
enable soft regionlet selection and pooling. The Deep Regionlets framework is
trained end-to-end without additional effort. We perform ablation studies and
conduct extensive experiments on the PASCAL VOC and Microsoft COCO datasets.
The proposed framework outperforms state-of-the-art algorithms, such as
RetinaNet and Mask R-CNN, even without additional segmentation labels.
Comment: Accepted to ECCV 201
Learning to Reconstruct Texture-less Deformable Surfaces from a Single View
Recent years have seen the development of mature solutions for reconstructing
deformable surfaces from a single image, provided that they are relatively
well-textured. By contrast, recovering the 3D shape of texture-less surfaces
remains an open problem, and essentially relates to Shape-from-Shading. In this
paper, we introduce a data-driven approach to this problem. We propose a
general framework that can predict diverse 3D representations, such as meshes,
normals, and depth maps. Our experiments show that meshes are ill-suited to
handle texture-less 3D reconstruction in our context. Furthermore, we
demonstrate that our approach generalizes well to unseen objects, and that it
yields higher-quality reconstructions than a state-of-the-art SfS technique,
particularly in terms of normal estimates. Our reconstructions accurately model
the fine details of the surfaces, such as the creases of a T-Shirt worn by a
person.
Comment: Accepted to 3DV 201
Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation
This paper proposes a new hybrid architecture that consists of a deep
Convolutional Network and a Markov Random Field. We show how this architecture
is successfully applied to the challenging problem of articulated human pose
estimation in monocular images. The architecture can exploit structural domain
constraints such as geometric relationships between body joint locations. We
show that joint training of these two model paradigms improves performance and
allows us to significantly outperform existing state-of-the-art techniques.