Bifurcated backbone strategy for RGB-D salient object detection
Multi-level feature fusion is a fundamental topic in computer vision. It has
been exploited to detect, segment and classify objects at various scales. When
multi-level features meet multi-modal cues, the optimal feature aggregation and
multi-modal learning strategy becomes a difficult open problem. In this paper, we leverage
the inherent multi-modal and multi-level nature of RGB-D salient object
detection to devise a novel cascaded refinement network. In particular, first,
we propose to regroup the multi-level features into teacher and student
features using a bifurcated backbone strategy (BBS). Second, we introduce a
depth-enhanced module (DEM) to excavate informative depth cues from the channel
and spatial views. Then, RGB and depth modalities are fused in a complementary
way. Our architecture, named Bifurcated Backbone Strategy Network (BBS-Net), is
simple, efficient, and backbone-independent. Extensive experiments show that
BBS-Net significantly outperforms eighteen SOTA models on eight challenging
datasets under five evaluation measures, demonstrating the superiority of our
approach (an improvement in S-measure over the top-ranked model:
DMRA-iccv2019). In addition, we provide a comprehensive analysis of the
generalization ability of different RGB-D datasets and provide a powerful
training set for future research.
Comment: A preliminary version of this work has been accepted in ECCV 202
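
To make the depth-enhanced module (DEM) idea more concrete, the following is a minimal sketch of attention applied to a depth feature map from the channel view and then the spatial view, followed by fusion with the RGB features. It is not the authors' implementation. The abstract does not specify the layers, so the parameter-free max-pooling gates, the element-wise addition used for fusion, and all function names (`channel_attention`, `spatial_attention`, `depth_enhance`, `fuse`) are assumptions made for illustration; a real module would presumably use learned weights.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here to turn pooled responses into gates in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    """Gate each channel of a (C, H, W) map by its globally max-pooled response.

    Assumption: parameter-free gate; a learned MLP would normally sit here.
    """
    pooled = feat.max(axis=(1, 2))            # shape (C,)
    weights = sigmoid(pooled)                 # one gate per channel
    return feat * weights[:, None, None]


def spatial_attention(feat):
    """Gate each spatial location of a (C, H, W) map by its max over channels.

    Assumption: parameter-free gate; a learned convolution would normally sit here.
    """
    pooled = feat.max(axis=0)                 # shape (H, W)
    weights = sigmoid(pooled)                 # one gate per pixel
    return feat * weights[None, :, :]


def depth_enhance(depth_feat):
    """Apply the channel view first, then the spatial view, to the depth features."""
    return spatial_attention(channel_attention(depth_feat))


def fuse(rgb_feat, depth_feat):
    """Fuse modalities by adding the enhanced depth cues to the RGB features.

    Assumption: element-wise addition stands in for the complementary fusion.
    """
    return rgb_feat + depth_enhance(depth_feat)


rng = np.random.default_rng(0)
rgb = rng.standard_normal((8, 16, 16))        # hypothetical RGB feature map
depth = rng.standard_normal((8, 16, 16))      # hypothetical depth feature map
fused = fuse(rgb, depth)
print(fused.shape)                            # (8, 16, 16)
```

Because both gates lie strictly between 0 and 1, the enhanced depth features never exceed the original ones in magnitude, and an all-zero depth map leaves the RGB features unchanged.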