Learning Local Feature Descriptor with Motion Attribute for Vision-based Localization
In recent years, camera-based localization has been widely used for robotic
applications, and most proposed algorithms rely on local features extracted
from recorded images. For better performance, the features used for open-loop
localization are required to be short-term globally static, and the ones used
for re-localization or loop closure detection need to be long-term static.
Therefore, the motion attribute of a local feature point can be exploited to
improve localization performance; for example, feature points extracted from
moving persons or vehicles can be excluded from these systems because they are
unstable. In this paper, we design a fully convolutional network (FCN),
named MD-Net, to perform motion attribute estimation and feature description
simultaneously. MD-Net has a shared backbone network that extracts features
from the input image and two branches, one for each sub-task. With MD-Net, we
can obtain the motion attribute with little additional computation.
Experimental results demonstrate that the proposed method learns distinctive
local feature descriptors along with motion attributes using only a single
FCN, outperforming competing methods by a wide margin. We also show that the
proposed algorithm can be integrated into a vision-based localization
algorithm to significantly improve estimation accuracy.
Comment: This paper will be presented at IROS
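To make the described architecture concrete, the following is a minimal,
hypothetical PyTorch sketch of a shared-backbone FCN with two branches, in the
spirit of the MD-Net description above; the class name, layer sizes, and
descriptor dimension are illustrative assumptions, not the authors'
implementation.

```python
import torch
import torch.nn as nn

class TwoHeadFCN(nn.Module):
    """Illustrative sketch: shared FCN backbone with descriptor and motion heads."""
    def __init__(self, desc_dim=128):
        super().__init__()
        # Shared fully convolutional backbone producing a dense feature map.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Branch 1: per-pixel local feature descriptors (L2-normalized).
        self.desc_head = nn.Conv2d(128, desc_dim, 1)
        # Branch 2: per-pixel motion-attribute logits (e.g., static vs. moving).
        self.motion_head = nn.Conv2d(128, 2, 1)

    def forward(self, image):
        feat = self.backbone(image)
        desc = nn.functional.normalize(self.desc_head(feat), dim=1)
        motion_logits = self.motion_head(feat)
        return desc, motion_logits

# Example: dense descriptors and motion logits for one image.
net = TwoHeadFCN()
desc, motion = net(torch.randn(1, 3, 240, 320))
```

Because both heads read the same backbone features, the motion attribute comes
at the cost of only one extra 1x1 convolution, which is the kind of saving the
abstract alludes to.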
Bifurcated backbone strategy for RGB-D salient object detection
Multi-level feature fusion is a fundamental topic in computer vision. It has
been exploited to detect, segment and classify objects at various scales. When
multi-level features meet multi-modal cues, choosing the optimal feature
aggregation and multi-modal learning strategy becomes a challenging open
problem. In this paper, we leverage
the inherent multi-modal and multi-level nature of RGB-D salient object
detection to devise a novel cascaded refinement network. In particular, first,
we propose to regroup the multi-level features into teacher and student
features using a bifurcated backbone strategy (BBS). Second, we introduce a
depth-enhanced module (DEM) to excavate informative depth cues from the channel
and spatial views. Then, RGB and depth modalities are fused in a complementary
way. Our architecture, named Bifurcated Backbone Strategy Network (BBS-Net), is
simple, efficient, and backbone-independent. Extensive experiments show that
BBS-Net significantly outperforms eighteen SOTA models on eight challenging
datasets under five evaluation measures, demonstrating the superiority of our
approach (improvement in S-measure over the top-ranked model,
DMRA-iccv2019). In addition, we provide a comprehensive analysis of the
generalization ability of different RGB-D datasets and contribute a powerful
training set for future research.
Comment: A preliminary version of this work has been accepted at ECCV 2020
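As a rough illustration of the depth-enhanced fusion idea described above, the
hypothetical PyTorch sketch below re-weights depth features from channel and
spatial views before fusing them with RGB features; the module name, kernel
sizes, and residual fusion are assumptions, not the authors' BBS-Net code.

```python
import torch
import torch.nn as nn

class DepthEnhancedFusion(nn.Module):
    """Illustrative sketch: attend to depth cues, then fuse with RGB features."""
    def __init__(self, channels):
        super().__init__()
        # Channel view: globally pool depth features and re-weight channels.
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1), nn.Sigmoid(),
        )
        # Spatial view: a single-channel attention map over spatial positions.
        self.spatial_att = nn.Sequential(
            nn.Conv2d(channels, 1, 7, padding=3), nn.Sigmoid(),
        )

    def forward(self, rgb_feat, depth_feat):
        d = depth_feat * self.channel_att(depth_feat)  # channel re-weighting
        d = d * self.spatial_att(d)                    # spatial re-weighting
        return rgb_feat + d                            # complementary fusion

# Example: fuse RGB and depth feature maps at one backbone level.
fuse = DepthEnhancedFusion(64)
out = fuse(torch.randn(1, 64, 40, 40), torch.randn(1, 64, 40, 40))
```

In a multi-level design such as the one the abstract describes, a module like
this would be applied at each backbone stage before the fused features are
passed to the cascaded refinement decoder.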