MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
In this work, we tackle the problem of instance segmentation, the task of
simultaneously solving object detection and semantic segmentation. Towards this
goal, we present a model, called MaskLab, which produces three outputs: box
detection, semantic segmentation, and direction prediction. MaskLab builds on
the Faster-RCNN object detector, whose predicted boxes provide accurate
localization of object instances. Within each region of interest, MaskLab
performs foreground/background segmentation by combining semantic and direction
prediction. Semantic segmentation assists the model in distinguishing between
objects of different semantic classes including background, while the direction
prediction, estimating each pixel's direction towards its corresponding center,
allows separating instances of the same semantic class. Moreover, we explore
the effect of incorporating recent successful methods from both segmentation
and detection (i.e., atrous convolution and hypercolumn). Our proposed model is
evaluated on the COCO instance segmentation benchmark and shows performance
comparable to other state-of-the-art models.
Comment: 10 pages including referenc
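The direction prediction described in this abstract can be illustrated with a minimal sketch of how per-pixel direction targets might be derived from a ground-truth instance mask. This is not the authors' implementation: the function name `direction_targets`, the use of the mask centroid as the "center", and the choice of 8 angular bins are all assumptions made for illustration.

```python
import numpy as np


def direction_targets(instance_mask, num_bins=8):
    """Quantize each foreground pixel's direction towards the instance centroid.

    Hypothetical helper: `num_bins` and the centroid definition are assumptions.
    Returns an integer array shaped like `instance_mask`, with -1 outside the mask.
    """
    ys, xs = np.nonzero(instance_mask)
    targets = np.full(instance_mask.shape, -1, dtype=np.int64)
    if ys.size == 0:
        return targets  # empty mask: nothing to label

    # Centroid of the instance, used here as its "center".
    cy, cx = ys.mean(), xs.mean()

    # Angle of the vector pointing from each pixel to the centroid, in [-pi, pi].
    angles = np.arctan2(cy - ys, cx - xs)

    # Map angles to equal-width bins; the modulo folds +pi into bin 0.
    bins = np.floor((angles + np.pi) / (2 * np.pi) * num_bins).astype(np.int64)
    targets[ys, xs] = bins % num_bins
    return targets
```

Two instances of the same class that touch each other receive different direction patterns around their own centers, which is the property the abstract relies on to separate them.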
Panoramic Annular Localizer: Tackling the Variation Challenges of Outdoor Localization Using Panoramic Annular Images and Active Deep Descriptors
Visual localization is an attractive problem that estimates the camera
location from database images based on a query image. It is a crucial
task for various applications, such as autonomous vehicles, assistive
navigation and augmented reality. The main challenges of the task lie in
appearance variations between query and database images, including
illumination variations, dynamic object variations and viewpoint variations. To
tackle those challenges, this paper proposes the Panoramic Annular Localizer,
which incorporates a panoramic annular lens and robust deep image descriptors.
The panoramic annular images captured by the single
camera are processed and fed into the NetVLAD network to form the active deep
descriptor, and sequential matching is utilized to generate the localization
result. Experiments carried out on public datasets and in the field
validate the proposed system.
Comment: Accepted by ITSC 201
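The sequential matching step mentioned in this abstract can be sketched as follows, assuming each image has already been reduced to a fixed-length descriptor (for example by a NetVLAD-style network, which is not reproduced here). The function name `sequence_match`, the cosine distance, and the assumption that query and database frames advance at the same rate are illustrative choices, not details taken from the paper.

```python
import numpy as np


def sequence_match(query_desc, db_desc):
    """Align a short query sequence against a database sequence of descriptors.

    Hypothetical helper. query_desc has shape (L, D), db_desc has shape (N, D),
    with N >= L and no all-zero descriptors. Returns (start_index, mean_distance)
    for the database window that best matches the whole query sequence.
    """
    # L2-normalize so that the dot product equals cosine similarity.
    q = query_desc / np.linalg.norm(query_desc, axis=1, keepdims=True)
    d = db_desc / np.linalg.norm(db_desc, axis=1, keepdims=True)

    dist = 1.0 - q @ d.T  # (L, N) cosine distances
    seq_len, num_db = dist.shape
    steps = np.arange(seq_len)

    # Score each candidate start by averaging distances along the diagonal,
    # i.e. query frame i is compared with database frame start + i.
    scores = np.array(
        [dist[steps, start + steps].mean() for start in range(num_db - seq_len + 1)]
    )
    best = int(np.argmin(scores))
    return best, float(scores[best])
```

Scoring a whole window rather than a single frame is what makes this kind of matching more tolerant of the appearance variations listed in the abstract, since one poorly matched frame is averaged with its neighbors.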