Deformable Convolutional Networks
Convolutional neural networks (CNNs) are inherently limited in their ability
to model geometric transformations due to the fixed geometric structures of
their building modules. In this work, we introduce two new modules to enhance the
transformation modeling capacity of CNNs, namely, deformable convolution and
deformable RoI pooling. Both are based on the idea of augmenting the spatial
sampling locations in the modules with additional offsets and learning the
offsets from target tasks, without additional supervision. The new modules can
readily replace their plain counterparts in existing CNNs and can be easily
trained end-to-end by standard back-propagation, giving rise to deformable
convolutional networks. Extensive experiments validate the effectiveness of our
approach on sophisticated vision tasks of object detection and semantic
segmentation. The code will be released.
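The idea of augmenting a convolution's sampling locations with offsets can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single channel, stride 1, zero padding, and bilinear interpolation at fractional positions, and it takes the offsets as an input, whereas the abstract describes learning them from the target task. The names `bilinear_sample` and `deformable_conv2d` are hypothetical.

```python
import math


def bilinear_sample(img, y, x):
    """Sample img at a fractional (y, x); positions outside the image count as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the four integer neighbours, weighted by their distance to (y, x).
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * img[yi][xi]
    return value


def deformable_conv2d(img, kernel, offsets):
    """Single-channel deformable convolution (illustrative assumption, not the paper's code).

    img:     H x W nested list of floats
    kernel:  K x K nested list of floats, K odd
    offsets: offsets[i][j][n] = (dy, dx) for output pixel (i, j) and kernel tap n,
             with taps numbered row by row; supplied by the caller here
    """
    h, w = len(img), len(img[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for n in range(k * k):
                ky, kx = divmod(n, k)
                dy, dx = offsets[i][j][n]
                # Regular grid position of this tap plus its offset.
                sample_y = i + ky - r + dy
                sample_x = j + kx - r + dx
                out[i][j] += kernel[ky][kx] * bilinear_sample(img, sample_y, sample_x)
    return out
```

With all offsets set to zero, this reduces to a plain zero-padded convolution (cross-correlation), which is why such a module could stand in for its plain counterpart.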
3D Object Class Detection in the Wild
Object class detection has been a synonym for 2D bounding box localization
for the longest time, fueled by the success of powerful statistical learning
techniques, combined with robust image representations. Only recently, there
has been a growing interest in revisiting the promise of computer vision from
the early days: to precisely delineate the contents of a visual scene, object
by object, in 3D. In this paper, we draw from recent advances in object
detection and 2D-3D object lifting in order to design an object class detector
that is particularly tailored towards 3D object class detection. Our 3D object
class detection method consists of several stages gradually enriching the
object detection output with object viewpoint, keypoints and 3D shape
estimates. Following careful design, each stage consistently improves
performance, and the method achieves state-of-the-art performance in simultaneous
2D bounding box and viewpoint estimation on the challenging Pascal3D+ dataset.
- …