3 research outputs found

DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation

Author: Cheng Ming-Ming
Hou Qibin
Li Zhongyu
Liu Li
Yin Bowen
Zhang Xuying
Publication date: 07/02/2024

We present DFormer, a novel RGB-D pretraining framework for learning transferable representations for RGB-D segmentation tasks. DFormer has two key innovations: 1) Unlike previous works that encode RGB-D information with an RGB-pretrained backbone, we pretrain the backbone on image-depth pairs from ImageNet-1K, so DFormer is endowed with the capacity to encode RGB-D representations; 2) DFormer comprises a sequence of RGB-D blocks, which are tailored for encoding both RGB and depth information through a novel building block design. DFormer thereby avoids the mismatched encoding of the 3D geometric relationships in depth maps by RGB-pretrained backbones, a problem that is widespread in existing methods but has not been resolved. We finetune the pretrained DFormer on two popular RGB-D tasks, i.e., RGB-D semantic segmentation and RGB-D salient object detection, with a lightweight decoder head. Experimental results show that DFormer achieves new state-of-the-art performance on these two tasks, across two RGB-D semantic segmentation datasets and five RGB-D salient object detection datasets, with less than half the computational cost of the current best methods. Our code is available at: https://github.com/VCIP-RGBD/DFormer. Comment: Accepted by ICLR 202

arXiv.org e-Print Archive

Semantic Segmentation to Develop an Indoor Navigation System for an Autonomous Mobile Robot

Author: Fernández Gámiz Unai
Sáenz Aguirre Aitor
Sánchez Chica Ander
Teso Fernández de Betoño Daniel
Zulueta Guerrero Ekaitz
Publication venue: MDPI AG
Publication date: 25/05/2020

In this study, a semantic segmentation network is presented to develop an indoor navigation system for a mobile robot. Semantic segmentation can be implemented with different techniques, such as a convolutional neural network (CNN). In the present work, a residual neural network is implemented through ResNet-18 transfer learning to distinguish between the floor, which is the free space for navigation, and the walls, which are the obstacles. After the learning process, the semantic segmentation floor mask is used to implement indoor navigation and motion calculations for the autonomous mobile robot. These motion calculations are based on how much the estimated path differs from the center vertical line. The highest point is used to drive the motors in that direction. In this way, the robot can move in a real scenario while avoiding different obstacles. Finally, the results are collected by analyzing the motor duty cycle and the neural network execution time to review the robot’s performance. Moreover, a comparison with other networks is made to determine those architectures’ reaction times and accuracy values. This research was financed by the Mercedes-Benz Vitoria plant through the PIF program to develop intelligent production. Moreover, the Regional Development Agency of the Basque Country (SPRI) is gratefully acknowledged for its economic support through the research project “Motor de Accionamiento para Robot Guiado Automáticamente”, KK-2019/00099, Programa ELKARTEK

Archivo Digital para la Docencia y la Investigación
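
The steering rule outlined in the abstract above (compare the estimated path with the center vertical line, then drive the motors toward the highest point) might look roughly like the following sketch. It is illustrative only and not the authors' implementation: the function names `steering_offset` and `motor_duty_cycles`, the reading of "highest point" as the top-most image row that contains floor pixels, and the `base` and `gain` values are all assumptions.

```python
def steering_offset(floor_mask):
    """Return a steering value in [-1, 1] from a binary floor mask.

    floor_mask: list of rows (top row first), each a list of 0/1 values,
    where 1 marks a pixel classified as floor (free space).
    Negative means the target lies left of the center vertical line,
    positive means it lies to the right. (Hypothetical convention.)
    """
    width = len(floor_mask[0])
    center = (width - 1) / 2.0
    # Scan from the top: the first row containing floor is taken as the
    # "highest point" of the free space (an assumed interpretation).
    for row in floor_mask:
        cols = [x for x, value in enumerate(row) if value]
        if cols:
            target = sum(cols) / len(cols)  # mean floor column in that row
            return (target - center) / center if center else 0.0
    return 0.0  # no floor visible: no steering preference


def motor_duty_cycles(offset, base=0.5, gain=0.3):
    """Map a steering offset to (left, right) duty cycles in [0, 1].

    base and gain are made-up tuning values for a differential drive:
    a positive offset speeds up the left wheel to turn right.
    """
    def clamp(value):
        return max(0.0, min(1.0, value))

    return clamp(base + gain * offset), clamp(base - gain * offset)


if __name__ == "__main__":
    mask = [
        [0, 0, 0, 0, 1],
        [0, 0, 1, 1, 1],
        [1, 1, 1, 1, 1],
    ]
    offset = steering_offset(mask)
    print(offset, motor_duty_cycles(offset))
```

In a real system the mask would come from the segmentation network's per-pixel floor prediction, and the duty-cycle mapping would depend on the robot's motor driver; both are outside what the abstract specifies.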