OVSNet : Towards One-Pass Real-Time Video Object Segmentation
Video object segmentation aims at accurately segmenting the target object
regions across consecutive frames. It is technically challenging because it
must cope with complicating factors (e.g., shape deformation, occlusion, and
objects leaving the field of view). Recent approaches have largely addressed
these factors by using back-and-forth re-identification and bi-directional mask
propagation. However, these methods are extremely slow and support only offline
inference, so in principle they cannot be applied in real time. Motivated by
this observation, we propose an efficient detection-based paradigm for video
object segmentation: a One-Pass Video Segmentation framework (OVS-Net) for modeling
spatial-temporal representation in a unified pipeline, which seamlessly
integrates object detection, object segmentation, and object re-identification.
The proposed framework lends itself to one-pass inference that effectively and
efficiently performs video object segmentation. Moreover, we propose a
mask-guided attention module for modeling the multi-scale object boundary and
multi-level feature fusion. Experiments on the challenging DAVIS 2017 benchmark
show that the proposed framework achieves performance comparable to the
state of the art while running at about 11.5 FPS, which is more than 5 times
faster than other state-of-the-art methods and, to our knowledge, a pioneering
step towards real-time operation.
Comment: 10 pages, 6 figures
Single-Shot Refinement Neural Network for Object Detection
For object detection, the two-stage approach (e.g., Faster R-CNN) has been
achieving the highest accuracy, whereas the one-stage approach (e.g., SSD) has
the advantage of high efficiency. To inherit the merits of both while
overcoming their disadvantages, in this paper, we propose a novel single-shot
based detector, called RefineDet, that achieves better accuracy than two-stage
methods while maintaining efficiency comparable to that of one-stage methods. RefineDet
consists of two inter-connected modules, namely, the anchor refinement module
and the object detection module. Specifically, the former aims to (1) filter
out negative anchors to reduce search space for the classifier, and (2)
coarsely adjust the locations and sizes of anchors to provide better
initialization for the subsequent regressor. The latter module takes the
refined anchors from the former as input to further improve the regression
and predict multi-class labels. Meanwhile, we design a transfer connection block
to transfer the features in the anchor refinement module to predict locations,
sizes and class labels of objects in the object detection module. The
multi-task loss function enables us to train the whole network in an end-to-end
way. Extensive experiments on PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO
demonstrate that RefineDet achieves state-of-the-art detection accuracy with
high efficiency. Code is available at https://github.com/sfzhang15/RefineDet
Comment: 14 pages, 7 figures, 7 tables
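
The two jobs the abstract assigns to the anchor refinement module (filtering out negative anchors, then coarsely adjusting the survivors) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Anchor` class, the `refine_anchors` function, the `negative_threshold` default, and the centre/size offset encoding are all assumptions chosen for illustration, and in a real detector the scores and offsets would be predicted by the network rather than supplied by hand.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

# Box layout assumed here: (center_x, center_y, width, height).
Box = Tuple[float, float, float, float]


@dataclass
class Anchor:
    box: Box
    negative_score: float  # hypothetical: predicted probability of "background"
    offsets: Box           # hypothetical coarse offsets (dx, dy, dw, dh)


def refine_anchors(anchors: List[Anchor],
                   negative_threshold: float = 0.99) -> List[Box]:
    """Drop confidently negative anchors, then coarsely adjust the rest.

    The threshold value and the offset encoding are illustrative assumptions.
    """
    refined: List[Box] = []
    for anchor in anchors:
        # (1) Filter: skip anchors that are very likely background, which
        # shrinks the search space for the later classifier.
        if anchor.negative_score > negative_threshold:
            continue
        # (2) Coarse adjustment: shift the centre in units of the anchor
        # size and rescale width/height exponentially.
        cx, cy, w, h = anchor.box
        dx, dy, dw, dh = anchor.offsets
        refined.append((cx + dx * w, cy + dy * h,
                        w * math.exp(dw), h * math.exp(dh)))
    return refined


anchors = [
    Anchor(box=(10.0, 10.0, 4.0, 4.0), negative_score=0.995,
           offsets=(0.0, 0.0, 0.0, 0.0)),
    Anchor(box=(20.0, 20.0, 4.0, 2.0), negative_score=0.10,
           offsets=(0.5, -0.5, 0.0, 0.0)),
]
refined = refine_anchors(anchors)
```

In this toy input the first anchor is discarded as a confident negative and the second is shifted, so only one refined box is passed on; the second module described in the abstract would then regress and classify from such refined anchors.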